New Optimal Weight Combination Model for Forecasting Precipitation

To overcome the inaccuracy of single-model forecasts, a new optimal weight combination model is established to improve the accuracy of precipitation forecasting. It combines three forecast submodels, namely the rank set pair analysis (R-SPA) model, the radial basis function (RBF) model, and the autoregressive (AR) model, with one weight optimization model based on an improved real-code genetic algorithm (IRGA). The new model for forecasting precipitation time series is tested using the annual precipitation data of Beijing, China, from 1978 to 2008. Results indicate that the optimal weights can be obtained with the genetic algorithm in the new optimal weight combination model. Compared with the R-SPA, RBF, and AR models, the new model improves the forecast accuracy of precipitation in terms of the error sum of squares; the improvement in precision is 22.6%, 47.4%, and 40.6%, respectively. This new forecast method is an extension of the combination prediction method.


Introduction
Precipitation time series forecasting has received tremendous attention worldwide because the uncertainty of climate change increases the difficulty of forecasting such time series accurately. Nonlinear and uncertain time series are very difficult to forecast with traditional deterministic mathematical models, which poses new challenges for increasing forecast accuracy [1, 2]. There are many methods for predicting complex time series [3-13]. The rank set pair analysis (R-SPA) model is based on the principle of set pair analysis; in this model, rank is taken as the particular characteristic of the time series and serves as the standard for the similarity analysis. Radial basis function (RBF) neural

The Optimal Weight Combination Model
In this paper, the procedure of establishing the new optimal weight combination model can be divided into three steps as follows.
(1) Construct the weight combination model.
(2) Establish the three forecast submodels (R-SPA, RBF, and AR).
(3) Calculate the weights of the three submodels by using IRGA.
The flow chart of this procedure is shown in Figure 1.

Construction of the Weight Combination Model
In the case of N forecast models, the weight combination forecast model [20] may be expressed as
$$x_i = \sum_{j=1}^{N} w_j \hat{x}_{ji} + e_i,$$
where $x_i$ is the observed value in the ith time period, $w_j$ is the weight assigned to the jth model, $\hat{x}_{ji}$ is the value estimated by the jth model for the ith time period, and $e_i$ is the combination error term.

In the weight combination forecasting model, the sum of the weights is normally constrained to equal unity, that is,
$$\sum_{j=1}^{N} w_j = 1.$$
No weight may be less than zero, that is,
$$w_j \ge 0, \quad j = 1, 2, \ldots, N.$$
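The combination model and its two constraints can be illustrated with a short Python sketch. The forecast and observed values below are hypothetical and serve only to exercise the code; the weights are the ones reported later in this paper; the function names are introduced here for illustration only.

```python
import numpy as np

def combine_forecasts(forecasts, weights, tol=1e-9):
    """Weighted combination of N submodel forecasts.

    forecasts: array of shape (N, n_periods); row j holds model j's forecasts.
    weights:   array of shape (N,), non-negative and summing to one.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    if abs(weights.sum() - 1.0) > tol:
        raise ValueError("weights must sum to one")
    return weights @ forecasts

def error_sum_of_squares(observed, predicted):
    """Sum of squared combination errors e_i."""
    residual = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(residual @ residual)

# Hypothetical forecasts (mm) of three submodels over four periods.
forecasts = np.array([[500.0, 520.0, 480.0, 610.0],
                      [450.0, 560.0, 510.0, 590.0],
                      [530.0, 500.0, 470.0, 640.0]])
observed = np.array([495.0, 530.0, 490.0, 615.0])   # hypothetical observations
weights = np.array([0.229, 0.372, 0.399])           # weights reported in this paper

combined = combine_forecasts(forecasts, weights)
combined_sse = error_sum_of_squares(observed, combined)
```

Rejecting invalid weights inside `combine_forecasts` keeps the two constraints explicit wherever the combination is evaluated.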

Rank Set Pair Analysis (R-SPA) Model
The procedure of the establishment of this model is shown as follows.
(1) Consider an annual precipitation series $x_1, x_2, \ldots, x_n$. We construct the history sets $A_1, A_2, \ldots, A_{n-T}$ and the current set $B$; these sets and their subsequent values are listed in Table 1.
Because of the weak dependence in the annual precipitation series, we take the number of elements $T$ in each history set and in the current set to be an integer from 4 to 6.
(2) Rank transformation. We mark the elements of $A_1, A_2, \ldots, A_{n-T}, B$ from 1 to $T$ according to their rank within the set they belong to. If several elements share the same rank, we mark them with their average rank, rounded off. We then obtain the rank sets $A'_1, A'_2, \ldots, A'_{n-T}, B'$.
(3) Construct the $n-T$ rank set pairs $(A'_i, B')$ $(i = 1, 2, \ldots, n-T)$ and calculate the difference $d$ between the corresponding elements of $A'_i$ and $B'$. If $|d| = 0$, the pair of elements is marked "identical"; if $|d| > T-2$, it is marked "contrary"; if $0 < |d| \le T-2$, it is marked "discrepant." Count the numbers of "identical," "contrary," and "discrepant" marks in each rank set pair. With the coefficient of the discrepancy degree $i$ and the coefficient of the contrary degree $j$, the connection degree is given by
$$\mu = \frac{S}{N} + \frac{F}{N}\, i + \frac{P}{N}\, j, \qquad (2.6)$$
where $\mu$ is the connection degree of the set pair, $N$ denotes the total number of characteristics of the set pair, $S$ is the number of identical characteristics, $P$ is the number of contrary characteristics, and $F$ is the number of characteristics that are neither identical nor contrary. According to (2.6), we calculate the connection degree of each rank set pair.
(4) In accordance with the maximum principle, we find the rank set $A'_k$ most similar to $B'$, that is, the one with the largest connection degree; under certain circumstances several sets are equally similar to $B'$. $A_k$ is the counterpart of $A'_k$, and the subsequent value of $A_k$ is $x_{T+k}$. We obtain the value of $x_{n+1}$ through the formula
$$x_{n+1} = \frac{1}{m} \sum_{k=1}^{m} w_k\, x_{T+k},$$
where the sum runs over the $m$ sets similar to $B$, $x_{T+k}$ is the subsequent value of the kth similar set, and $w_k$ is the ratio of the average of the elements in $B$ to the average of the elements in $A_k$.
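The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the coefficients $i = 0.5$ and $j = -1$ are assumed values (the text does not report them), the number of characteristics $N$ is taken equal to $T$, tied ranks are rounded half up, and all function names are hypothetical.

```python
import numpy as np

def ranks(values):
    """Rank elements from 1 to T; ties receive their rounded average rank."""
    values = np.asarray(values, dtype=float)
    ordered = np.sort(values)
    out = []
    for v in values:
        positions = np.where(ordered == v)[0] + 1          # 1-based positions of ties
        out.append(int(np.floor(positions.mean() + 0.5)))  # round half up (assumption)
    return np.array(out)

def connection_degree(rank_a, rank_b, i_coef=0.5, j_coef=-1.0):
    """mu = S/N + (F/N) i + (P/N) j with N = T; i and j are assumed values."""
    T = len(rank_a)
    d = np.abs(np.asarray(rank_a) - np.asarray(rank_b))
    S = int(np.sum(d == 0))        # identical
    P = int(np.sum(d > T - 2))     # contrary
    F = T - S - P                  # discrepant
    return (S + F * i_coef + P * j_coef) / T

def rspa_forecast(series, T=4, i_coef=0.5, j_coef=-1.0):
    """One-step-ahead forecast x_{n+1} from the sets most similar to B."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    B = x[n - T:]
    rank_b = ranks(B)
    # History set A_k covers x[k:k+T] (0-based); its subsequent value is x[k+T].
    mus = np.array([connection_degree(ranks(x[k:k + T]), rank_b, i_coef, j_coef)
                    for k in range(n - T)])
    similar = np.where(np.isclose(mus, mus.max()))[0]      # maximum principle
    # w_k = mean(B) / mean(A_k), applied to the subsequent value of A_k.
    preds = [B.mean() / x[k:k + T].mean() * x[k + T] for k in similar]
    return float(np.mean(preds))
```

For a strictly periodic toy series such as `[1, 2, 3, 4] * 3` with `T=4`, the history sets identical in rank to the current set are followed by the value 1, so the sketch returns 1.0.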

Radial Basis Function (RBF) Model
Interpreted as an artificial neural network, the RBF method consists of three layers: an input layer whose neurons feed the feature vectors into the network; a hidden layer of RBF neurons that calculate the outputs of the basis functions; and an output layer whose neurons calculate a linear combination of the basis functions [21, 22].
Different numbers of hidden layer neurons and different spread constants are tried in this study. The topological structure of the network is shown in Figure 2.
The procedure of the establishment of this model is shown as follows.
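(1) Normalization of the time series. Consider an annual precipitation series $\{x_1, x_2, \ldots, x_n\}$. We transform the series to $\{x'_1, x'_2, \ldots, x'_n\}$ by the normalization formula
$$x'_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}},$$
where $x_{\min}$ and $x_{\max}$ denote the minimum and the maximum of the time series $\{x_1, x_2, \ldots, x_n\}$.
(2) Forecast of the data. The application of the RBF neural network to time series data consists of two steps. The first step is the training of the neural network: choose the first $N$ values of the new series $\{x'_1, x'_2, \ldots, x'_n\}$ as the training sample and set up the RBF neural network. Once the training stage is completed, the RBF neural network is applied to forecasting. Based upon the RBF neural network established from the training sample, we forecast the last $n-N$ elements of the series, and the forecasting series can be represented as $\{y'_{N+1}, y'_{N+2}, \ldots, y'_n\}$. In this study, the mean-square error goal is 0.0001 and the width of the radial basis function is 1.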
(3) Denormalization of the forecasting series. Since the elements of the forecasting series lie between zero and one, that is, $y'_j \in [0, 1]$, we denormalize the forecasting series $\{y'_{N+1}, y'_{N+2}, \ldots, y'_n\}$ to the final forecast $\{y_{N+1}, y_{N+2}, \ldots, y_n\}$ through the denormalization formula
$$y_j = y'_j \,(x_{\max} - x_{\min}) + x_{\min}. \qquad (2.9)$$
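The normalization, forecasting, and denormalization steps can be sketched in Python as below. Several choices are assumptions made only for illustration: the series is hypothetical, the network inputs are the two preceding normalized values, the basis functions are Gaussian with one hidden neuron per training sample, and the output weights are obtained by linear least squares. The network configuration actually used in this study is not specified beyond the error goal and the width.

```python
import numpy as np

def normalize(x):
    """Min-max normalization to [0, 1]; also returns x_min and x_max."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max

def denormalize(y_norm, x_min, x_max):
    """Inverse of the min-max normalization."""
    return np.asarray(y_norm, dtype=float) * (x_max - x_min) + x_min

def rbf_design(inputs, centers, spread=1.0):
    """Gaussian basis outputs exp(-||x - c||^2 / (2 spread^2)) (assumed form)."""
    d2 = ((inputs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spread ** 2))

def fit_rbf(inputs, targets, spread=1.0):
    """One hidden neuron per training sample; the output layer is linear."""
    phi = rbf_design(inputs, inputs, spread)
    out_w, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return inputs.copy(), out_w

def predict_rbf(model, inputs, spread=1.0):
    centers, out_w = model
    return rbf_design(inputs, centers, spread) @ out_w

# Hypothetical annual precipitation values (mm).
series = np.array([480.0, 620.0, 530.0, 710.0, 450.0, 590.0, 660.0, 510.0])
x_norm, x_min, x_max = normalize(series)

lags = 2  # assumed input window
inputs = np.array([x_norm[t - lags:t] for t in range(lags, len(x_norm))])
targets = x_norm[lags:]

model = fit_rbf(inputs[:-1], targets[:-1], spread=1.0)   # train on all but the last sample
y_norm = predict_rbf(model, inputs[-1:], spread=1.0)     # forecast the last value
y = denormalize(y_norm, x_min, x_max)                    # back to mm
```

Returning `x_min` and `x_max` from `normalize` ensures that the forecast is mapped back with the same constants that were used for the training data.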

Autoregressive (AR) Model
In this paper, we regard the annual precipitation data as a time series from which the trend term, seasonal term, and random term can be extracted in sequence. We then superpose the trend term, seasonal term, and random term and obtain the following equation [23-25]:
$$x_t = A_t + B_t + C_t,$$
where $x_t$ is the precipitation time series, $A_t$ is the trend term, $B_t$ is the seasonal term, and $C_t$ is the random term.

Mathematical Problems in Engineering 7
The procedure of establishing the autoregressive model is shown as follows.
(1) The extraction of the trend term. In this paper, the data exhibit a clear quadratic component, so a polynomial function is used to fit the precipitation data. The trend term $A_t$ can be described as
$$A_t = P_2 t^2 + P_1 t + P_0, \qquad (2.11)$$
where $P_i$ $(i = 0, 1, 2)$ are the coefficients of the quadratic polynomial (2.11).
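(2) The extraction of the seasonal term. The seasonality of precipitation can be analyzed by modeling via spectral analysis and can be represented with $L$ waves. $BB_t$ is the residual obtained by subtracting $A_t$ from the precipitation series, and its estimated value $\widehat{BB}_t$ can be defined as
$$\widehat{BB}_t = \sum_{k=1}^{L} \left( a_k \cos\frac{2\pi k t}{n} + b_k \sin\frac{2\pi k t}{n} \right), \qquad (2.12)$$
where $L = n/2$ is the number of harmonic waves, and $a_k$ and $b_k$ are the coefficients of the Fourier series (2.12):
$$a_k = \frac{2}{n} \sum_{i=1}^{n} BB_i \cos\frac{2\pi k i}{n}, \qquad b_k = \frac{2}{n} \sum_{i=1}^{n} BB_i \sin\frac{2\pi k i}{n}. \qquad (2.13)$$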
Taking the computational workload into consideration, we use only the significant waves to forecast. The kth wave is defined as significant when the inequality (2.14) is satisfied, where $a$ is the level of significance ($a = 5\%$) and $s^2$ is the variance of the series, given by (2.15).
(3) The extraction of the random term. The random term $C_t$ is defined as a linear combination of $C_{t-1}, C_{t-2}, \ldots, C_{t-p}$:
$$C_t = \alpha_0 + \alpha_1 C_{t-1} + \alpha_2 C_{t-2} + \cdots + \alpha_p C_{t-p},$$
where $p$ is the order of the model and $\alpha_i$ $(i = 0, 1, \ldots, p)$ denote the coefficients of the regression model. The order $p$ can be determined by Akaike's Information Criterion (AIC):
$$\mathrm{AIC}(p) = n \ln \sigma_p^2 + 2p,$$
where $n$ is the length of the series and $\sigma_p^2$ represents the residual variance of AR($p$); the appropriate value of $p$ is chosen among 1, 2, 3, and 4.
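The three extraction steps can be sketched in Python as follows. The series is synthetic and the function names are hypothetical. Two simplifications are assumed: the significance test for the waves is replaced by keeping only the single strongest wave, and the criterion is taken in the common form AIC(p) = n ln(sigma_p^2) + 2p with the residual variance estimated by least squares.

```python
import numpy as np

def quadratic_trend(x):
    """Fit A_t = P2*t^2 + P1*t + P0 by least squares and return A_t."""
    t = np.arange(1, len(x) + 1)
    p2, p1, p0 = np.polyfit(t, x, 2)
    return p2 * t**2 + p1 * t + p0

def fourier_coefficients(bb):
    """Coefficients a_k, b_k for k = 1..L with L = n // 2."""
    bb = np.asarray(bb, dtype=float)
    n = len(bb)
    i = np.arange(1, n + 1)
    k = np.arange(1, n // 2 + 1)[:, None]
    angle = 2.0 * np.pi * k * i / n
    a = 2.0 / n * (bb * np.cos(angle)).sum(axis=1)
    b = 2.0 / n * (bb * np.sin(angle)).sum(axis=1)
    return a, b

def fit_ar(c, p):
    """Least-squares AR(p) with intercept; returns (alpha, residual variance)."""
    c = np.asarray(c, dtype=float)
    X = np.array([np.concatenate(([1.0], c[t - p:t][::-1]))
                  for t in range(p, len(c))])
    y = c[p:]
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ alpha
    return alpha, float(resid @ resid) / len(y)

def select_order(c, orders=(1, 2, 3, 4)):
    """Pick p minimising AIC(p) = n ln(sigma_p^2) + 2p (assumed form)."""
    n = len(c)
    aic = {p: n * np.log(fit_ar(c, p)[1]) + 2 * p for p in orders}
    return min(aic, key=aic.get), aic

# Synthetic 31-value series: quadratic trend + one wave + noise (illustrative only).
rng = np.random.default_rng(0)
n = 31
t = np.arange(1, n + 1)
x = (0.3 * t**2 - 8.0 * t + 600.0
     + 40.0 * np.sin(2 * np.pi * 3 * t / n) + rng.normal(0.0, 10.0, n))

trend = quadratic_trend(x)                       # A_t
bb = x - trend                                   # BB_t
a, b = fourier_coefficients(bb)
k_star = int(np.argmax(a**2 + b**2)) + 1         # strongest wave (stand-in for the test)
seasonal = (a[k_star - 1] * np.cos(2 * np.pi * k_star * t / n)
            + b[k_star - 1] * np.sin(2 * np.pi * k_star * t / n))
random_term = bb - seasonal                      # C_t
p, aic = select_order(random_term)               # AR order among 1..4
```

The one-step forecast is then the sum of the extrapolated trend, the seasonal term evaluated at the next time index, and the AR(p) prediction of the random term.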

Calculation of the Weights of the Submodels
The key to setting up the optimal weight combination model is to ascertain the weight of each forecasting model. In this study, we choose the weights for which the error sum of squares of the combination model is the minimum among all weight combination forecasting models, that is,
$$\min f = \sum_{i} e_i^2,$$
where $f$ is the error sum of squares of the combination model.

2.20
And, the formula 2.13 can be represented as follows:

2.21
If we obtain the values of $w_j$ $(j = 1, 2, \ldots, N)$ with the aid of formulas (2.4), (2.5), and (2.21), then we can ascertain the optimal weight combination model. The genetic algorithm is an adaptive heuristic search algorithm premised on the evolutionary ideas of natural selection and genetic mutation, and it has long been regarded as a function optimizer [26-28]. The flow chart of the genetic algorithm is shown in Figure 3.
In this paper, we use the improved real-code genetic algorithm (IRGA) to solve this optimization problem. The population size is 20, the crossover fraction is 0.8, and the number of generations is 100.
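The weight optimization can be illustrated with a plain real-coded genetic algorithm in Python. This sketch is not the IRGA itself, whose improvements are not described in this section; it only reuses the stated population size, crossover fraction, and number of generations. The selection scheme, the arithmetic crossover, the Gaussian mutation, the projection onto the constraints, and the synthetic data are all assumptions.

```python
import numpy as np

def sse(weights, forecasts, observed):
    """Error sum of squares f of the weighted combination."""
    resid = observed - weights @ forecasts
    return float(resid @ resid)

def project(w):
    """Map any vector to non-negative weights that sum to one."""
    w = np.clip(w, 0.0, None)
    total = w.sum()
    return w / total if total > 0 else np.full_like(w, 1.0 / len(w))

def ga_weights(forecasts, observed, pop_size=20, crossover_fraction=0.8,
               generations=100, mutation_scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_models = forecasts.shape[0]
    pop = rng.dirichlet(np.ones(n_models), size=pop_size)
    pop[:n_models] = np.eye(n_models)        # seed with the pure submodels
    for _ in range(generations):
        fitness = np.array([sse(w, forecasts, observed) for w in pop])
        pop = pop[np.argsort(fitness)]       # best individual first
        children = [pop[0].copy()]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            # Parents are drawn from the better half of the population.
            pa, pb = pop[rng.integers(0, pop_size // 2, size=2)]
            if rng.random() < crossover_fraction:
                lam = rng.random()
                child = lam * pa + (1.0 - lam) * pb                    # arithmetic crossover
            else:
                child = pa + rng.normal(0.0, mutation_scale, n_models)  # Gaussian mutation
            children.append(project(child))
        pop = np.array(children)
    fitness = np.array([sse(w, forecasts, observed) for w in pop])
    return pop[np.argmin(fitness)], float(fitness.min())

# Synthetic observations and three noisy submodel forecasts (illustrative only).
rng = np.random.default_rng(1)
observed = rng.uniform(400.0, 800.0, size=5)
forecasts = observed + rng.normal(0.0, 60.0, size=(3, 5))
best_w, best_f = ga_weights(forecasts, observed)
```

Because the initial population contains each pure submodel and the best individual is always carried over, the returned error sum of squares cannot exceed that of the best single submodel.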

Application of the Optimal Weight Combination Model
In this study, the annual precipitation data for Beijing from 1978 to 2008 are collected and shown in Figure 4. First, we use the R-SPA, RBF, and AR models to forecast the annual precipitation of Beijing from 2004 to 2008. The outputs of the three models are shown in Table 2.
Based on the forecasted data of the three submodels, the weights of the three submodels in the combination model are obtained by using IRGA [19]; the weights of the three models are 22.9%, 37.2%, and 39.9%, respectively, as given in Table 3.
Based on the obtained weights, we calculate the forecasted data of the optimal weight combination model; the output is presented in Table 4.
By comparing the output of the combination model with the outputs of the three submodels, we find that the error sum of squares of the combination model is clearly lower than that of any submodel. In this study, the error sum of squares is regarded as the standard for judging the precision of the forecast of the annual precipitation of Beijing, and the improvement in precision of the new weight combination model over the three submodels is shown in Table 5. We therefore conclude that the precision of the combination model is higher than that of the three submodels in terms of the error sum of squares.
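As a small worked check, the following Python sketch assumes that the improvement in precision is the relative reduction in the error sum of squares; this definition is an assumption, since it is not stated explicitly in the text. Under it, the improvements reported in the abstract (22.6%, 47.4%, and 40.6% for R-SPA, RBF, and AR) imply how large each submodel's error sum of squares is relative to that of the combination model.

```python
def relative_improvement(sse_submodel, sse_combination):
    """Percentage reduction in the error sum of squares (assumed definition)."""
    return 100.0 * (sse_submodel - sse_combination) / sse_submodel

def implied_sse_ratio(improvement_percent):
    """Submodel SSE divided by combination SSE, given the improvement."""
    return 1.0 / (1.0 - improvement_percent / 100.0)

# Improvements reported in this paper for the three submodels.
reported = {"R-SPA": 22.6, "RBF": 47.4, "AR": 40.6}
ratios = {name: implied_sse_ratio(p) for name, p in reported.items()}
ranking = sorted(ratios, key=ratios.get)   # smallest implied SSE first
```

Under this assumed definition, the implied ordering places R-SPA first and RBF last, which agrees with the ranking of the submodels stated in the conclusions.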

Conclusions
A new optimal weight combination model, based on the R-SPA, RBF, and AR models and one weight optimization model based on the improved real-code genetic algorithm (IRGA), is proposed in this paper. The annual precipitation time series of Beijing from 1978 to 2008 is studied by using the new model. The main conclusions are as follows.
(1) Three submodels, that is, the R-SPA model, the RBF model, and the AR model, are tested to forecast the annual precipitation of Beijing. The results suggest that, in terms of the error sum of squares, R-SPA is the best and RBF the worst of the three models. Different models thus have different precision in forecasting annual precipitation.
(2) The optimal weights can be obtained by use of IRGA in the new optimal weight combination model. Application results of the combination model indicate that the weights of the submodels can be appropriately determined, and the method provides a new way to improve the precision of forecasting complex precipitation time series.

Figure 1: The flow chart of the procedure of establishing the new optimal weight combination model.

Figure 2: The topological structure of the RBF neural network.

Figure 3: The flow chart of genetic algorithm.

Figure 4: Annual precipitation of Beijing from 1978 to 2008.

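Table 1: History sets $A_t$ and current set $B$.
History set    Elements                                   Subsequent value
$A_1$          $x_1, x_2, \ldots, x_T$                    $x_{T+1}$
$A_2$          $x_2, x_3, \ldots, x_{T+1}$                $x_{T+2}$
...            ...                                        ...
$A_{n-T}$      $x_{n-T}, x_{n-T+1}, \ldots, x_{n-1}$      $x_n$
$B$            $x_{n-T+1}, x_{n-T+2}, \ldots, x_n$        $x_{n+1}$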

Table 2: The forecasted data of the three submodels.

Table 3: The weights of the three submodels.

Table 4: The forecasted data of the combination model.

Table 5: The improvement in precision of the new weight combination model.