High-Order Fuzzy Time Series Model Based on Generalized Fuzzy Logical Relationship

In terms of the techniques used for constructing high-order fuzzy time series models, there are three kinds of methods, based on advanced algorithms, computational methods, and grouping the fuzzy logical relationships, respectively. The last kind of model has been widely applied and studied because it is easy for decision makers to understand. To improve the fuzzy time series forecasting model, this paper presents a novel high-order fuzzy time series model, denoted GTS(M, N), on the basis of generalized fuzzy logical relationships. Firstly, the paper introduces some concepts of the generalized fuzzy logical relationship and an operation for combining the generalized relationships. Then, the proposed model is applied to forecasting the enrollments of the University of Alabama. As an example of in-depth research, the proposed approach is also applied to forecasting the close price of the Shanghai Stock Exchange Composite Index. Finally, the effects of the number of orders and hierarchies of fuzzy logical relationships on the forecasting results are discussed.


Introduction
In the last two decades, the fuzzy time series approach [1][2][3] has been widely used for its power in dealing with imprecise knowledge variables in decision making. Many studies have proposed new methods or improved the forecasting accuracy of fuzzy time series models. To simplify the computational process, Chen [4] improved Song's methods and presented a simplified forecasting model in 1996. Since the lengths of intervals greatly affect forecasting accuracy in fuzzy time series, Yu and many others [5][6][7][8][9][10] adjusted the lengths of intervals by distribution-based or optimization techniques. To achieve higher forecasting accuracy, weighted models concerned with recurrences and chronological order have also been improved [11][12][13][14][15]. In addition, many models based on conventional fuzzy time series were combined with novel algorithms or technologies. For example, Singh [16][17][18] proposed methods to forecast crop production based on a computational method with different parameters. Lee et al. [19][20][21][22] presented several models based on fuzzy time series, genetic algorithms, simulated annealing, and type-2 fuzzy sets to forecast temperature and TAIFEX. Kuo [23][24][25] first introduced particle swarm optimization (PSO) into fuzzy time series models for forecasting TAIFEX. Song's [3] and Aladag's models [26,27] gained more accurate forecasts by employing artificial neural networks to determine fuzzy relationships.
Although first-order fuzzy time series models have a simple structure, they have difficulty explaining more complex relationships. Moreover, first-order models cannot meet the demands of forecasting problems that involve multiple factors or long-term time series. Compared with alternative forecasting models such as ARIMA, hidden Markov, and ARCH models, there is still much room for higher forecasting accuracy in applying fuzzy time series models. For these reasons, Chen et al. [28][29][30][31][32] proposed new methods that apply a high-order fuzzy time series model to forecast enrolments. Aladag et al. [9,26] introduced a high-order model based on a feed-forward neural network. Lee et al. [20,33] also presented high-order models based on two-factor and genetic-simulated annealing techniques. Many other time series researchers [18,22,34,35,36,37] have likewise shown interest in high-order fuzzy time series forecasting models.
In forecasting with fuzzy time series models, the fuzzy logical relationship (FLR) is one of the most critical factors influencing forecasting accuracy. To obtain high forecasting accuracy, much effort has been put into mining FLRs from fuzzy time series. In terms of the techniques used for partitioning the universe of discourse and constructing the fuzzy logical relationships, the above high-order models fall into three kinds. The first mines FLRs by applying advanced algorithms or theories such as genetic algorithms, rough sets, neural networks, type-2 fuzzy sets, and simulated annealing [20,22,26,27,30,32,34,35]. The second is the class represented by Singh [16][17][18], whose models are based on a computational method with difference parameters. The last, but not least, is the kind of model based on grouping the FLRs, represented by [9,28,29,31,33,36,37]. In general, the first kind of hybrid model can achieve higher forecasting accuracy than the other two. However, the forecasting process of these algorithms is not easy to understand. Unlike fuzzy set theory, their procedures and forecasts are not understandable and accountable for most decision makers. Although the second kind of model has been applied to real-life problems of crop and rice production as well as enrolment forecasting, these models have little, if anything, to do with FLRs in their forecasting procedures. They obtain high forecasting accuracy by dividing the intervals so as to localize the forecast values accurately. In the third kind of model, the procedures for mining FLRs and the forecasting principles are based solely on the FLR sets. The forecasting procedure and principles are clear to fuzzy time series researchers and easy for decision makers to understand.
For these reasons, this paper proposes a high-order fuzzy time series model based on generalized fuzzy logical relationships [38]. The process of creating the relationship matrices and finding the patterns of time series fluctuations is carried out on the basis of understandable fuzzy rules. Of the above three kinds of models, the proposed one belongs to the third. There are three reasons for choosing Hwang's [28] and Chen's models [29,31] as the counterparts for comparing single-factor forecasting results with a determinate interval length. The first is that the models of Chen [29] and Li [36] are similar in finding the most appropriate forecasting principle with state-transition analysis and a backtracking scheme. The second is that the models of Li et al. [36] and Lee et al. [33] aim at multifactor forecasting problems, and the last is that the models in [9,37] are improved by finding an optimal interval length. As for the experimental data, two data sets were used for the empirical analysis: the enrolments of the University of Alabama and the close price of the Shanghai Stock Exchange Composite Index (SSECI). In terms of three evaluation criteria, the root mean squared error, mean absolute error, and mean absolute percentage error, the proposed method yields more satisfactory forecasts than the counterparts.
The rest of this paper is organized as follows. In Section 2, we briefly review the concepts of fuzzy time series. In Section 3, a new model based on high-order generalized fuzzy logical relationships is presented and applied to forecasting enrolments. In Section 4, we compare the average forecasting accuracy of the proposed method with the methods presented in [28,29,31]. The effects of the parameters on forecasting accuracy are also discussed in that section. Conclusions and future work are given in Section 5.

Preliminaries
To make our exposition self-contained, this section reviews some definitions and the framework of fuzzy time series forecasting models. The framework [1][2][3] is summarized together with some related definitions of the generalized fuzzy logical relationship.
Definition 1 (see [39,40]). A fuzzy set A of the universe of discourse U, U = {u_1, u_2, . . ., u_n}, is defined as follows: where

From Definition 5, the fuzzy logical relationship is more general than that of the conventional fuzzy time series model. In fact, the logical relationship is that of conventional models when  = 1, and the forecasting rules are obtained from grouping these relationships. We therefore name it the generalized fuzzy logical relationship. According to Definition 4, all fuzzy logical relationships in the training data set can be further grouped into different fuzzy logical relationship groups according to the same left-hand sides of the fuzzy logical relationship. For given M and N, the fuzzy logical relationships can be grouped into M × N matrices denoted as  (,) ( = 1, 2, . . ., ,  = 1, 2, . . ., ) with the group method proposed by Lee et al. [13]. Here,  (,) (,  ∈ {1, 2, . . ., }), the element of matrix  (,) , is the number of fuzzy logical relationships  (−,)  →   . Then, there are  fuzzy logical relationship matrices for a given training data set. To forecast time series with these generalized fuzzy logical relationship matrices, we define the intersection operation as follows.

Definition 6. Let  (,)   represent the LHSs of FLRG in the -order th-principal fuzzy logical relationship at time . Let  (,) (  , ) be the number of FLRG    →   in the th-order th-principal fuzzy logical relationship, ( = 1, 2, . . ., ; ,   = 1, 2, . . ., ). To compute the logical relationships between  FLRGs, the intersection operator ∧  is defined as

Based on the above definitions, this paper presents a high-order fuzzy time series model in the following section.

Procedure of 𝐺𝑇𝑆(𝑀, 𝑁).
In this section, we present a new forecasting method based on high-order and generalized fuzzy logical relationships. Since the proposed model depends on the number of orders, denoted by M, and the hierarchies of the principal fuzzy relationship, denoted by N, we name the proposed model GTS(M, N). In other words, GTS(M, N) means an M-order fuzzy time series model based on N hierarchies of principal fuzzy logical relationships.
Step 1. Define the universe of discourse and the intervals for rule extraction. The universe of discourse can be defined as U = [starting, ending]. U is then partitioned into several intervals of equal length. For example, U = {u_1, u_2, . . ., u_n}, where each interval u_i is represented by its midpoint and has the corresponding fuzzy set A_i (i = 1, 2, . . ., n).
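As a minimal sketch of Step 1, the following Python snippet partitions a universe of discourse into equal-length intervals and computes their midpoints. The bounds 13000 and 20000 are an assumption chosen for illustration (the text only states that seven intervals of length 1000 are used for the enrolment data), and the function name `partition_universe` is hypothetical.

```python
def partition_universe(start, end, length):
    """Split [start, end] into equal-length intervals.

    Returns the list of (lower, upper) intervals and the list of midpoints.
    """
    count = round((end - start) / length)
    intervals = [(start + i * length, start + (i + 1) * length)
                 for i in range(count)]
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return intervals, midpoints


# Assumed bounds for illustration: seven intervals of length 1000.
intervals, midpoints = partition_universe(13000, 20000, 1000)
print(intervals[2], midpoints[2])
```

Under these assumed bounds, the third interval is [15000, 16000] with midpoint 15500, which agrees with the midpoint value quoted in the enrolment example.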
Step 2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data. The fuzzy set   would be expressed as   = ( 1 ,  2 , . . .,   ), where   ∈ [0, 1], which indicates the membership degree of   in   . The historical and observed data are fuzzified according to the definition of the fuzzy sets. For example, a datum is fuzzified to   when the maximal membership degree of the datum is the th number. In other words, if   = max{ 1 ,  2 , . . .,   }, then the datum at time  is classified into the th class. In this paper, the fuzzy sets are defined with the triangular fuzzy function shown in formula (3). Consider The membership degree of the value   at time  in   ( = 1, 2, . . ., ) is defined by formula (4). Consider where   is the observed value at time  and  in is the length of interval.
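The fuzzification in Step 2 can be sketched as follows. Because formulas (3) and (4) are not legible here, the triangular shape used below (membership 1 at the interval midpoint, falling linearly to 0 at a distance of one interval length) is an assumption, not the paper's exact definition. The names `triangular_membership` and `fuzzify`, the midpoints, and the sample value 15460 are likewise hypothetical. The ranking of fuzzy sets by decreasing membership is one possible reading of the first, second, and further principal fuzzy sets.

```python
def triangular_membership(x, midpoint, length):
    """Assumed triangle: 1 at the midpoint, 0 at a distance >= length."""
    return max(0.0, 1.0 - abs(x - midpoint) / length)


def fuzzify(x, midpoints, length):
    """Return (ranking, degrees) for an observation x.

    ranking lists the 1-based fuzzy-set indices ordered by decreasing
    membership degree, so ranking[0] is the set with maximal membership.
    """
    degrees = [triangular_membership(x, m, length) for m in midpoints]
    order = sorted(range(len(midpoints)), key=lambda i: -degrees[i])
    return [i + 1 for i in order], degrees


# Assumed midpoints of seven intervals of length 1000 starting at 13000.
midpoints = [13500 + 1000 * i for i in range(7)]
ranking, degrees = fuzzify(15460, midpoints, 1000)
print(ranking[:2], degrees[2], degrees[1])
```

With these assumptions, the sample value has its largest membership in the third fuzzy set and its second largest in the second fuzzy set.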
Step 3. Establish the fuzzy logical relationships based on the orders and hierarchies of the principal fuzzy logical relationship. Given the sample data set and the definition of the fuzzy sets, all fuzzy logical relationships between two consecutive data can be created. To forecast the time series, the fuzzy logical relationship matrix must be created in this step from these fuzzy logical relationships.
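A hedged sketch of the counting idea behind Step 3: given a series of fuzzified labels, each matrix entry counts how often fuzzy set A_i is followed by A_j. The function name `flr_matrix`, the `lag` parameter (standing in for the order of the relationship), and the label series are assumptions for illustration. Only the first principal fuzzy set of each datum is used, so this does not cover the higher hierarchies.

```python
def flr_matrix(labels, k, lag=1):
    """Count relationships A_i -> A_j between labels[t - lag] and labels[t].

    labels are 1-based fuzzy-set indices; k is the number of fuzzy sets.
    """
    matrix = [[0] * k for _ in range(k)]
    for t in range(lag, len(labels)):
        left, right = labels[t - lag], labels[t]
        matrix[left - 1][right - 1] += 1
    return matrix


# Hypothetical fuzzified series with four fuzzy sets.
labels = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4]
first_order = flr_matrix(labels, k=4, lag=1)
second_lag = flr_matrix(labels, k=4, lag=2)
for row in first_order:
    print(row)
```

Each row of the matrix plays the role of a fuzzy logical relationship group, since it collects all right-hand sides that share the same left-hand side.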
Then, for a given , there are  forecasts for time . The conclusive forecasting value for time  can be obtained by the following formula: where   ( = 1, 2, . . ., ) is the adjustment parameter for the th forecast; the parameter can also be obtained by minimizing the RMSE or another evaluation criterion on the training data set.
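The combination formula itself is not legible here, so the sketch below only illustrates the general idea of merging several forecasts with adjustment parameters. It assumes a normalized weighted average, which may differ from the paper's actual formula. The function name `combine_forecasts` and the numbers are hypothetical.

```python
def combine_forecasts(forecasts, weights):
    """Assumed combination rule: weighted average with normalized weights."""
    if len(forecasts) != len(weights):
        raise ValueError("forecasts and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * f for w, f in zip(weights, forecasts)) / total


# Two hypothetical forecasts, the first weighted three times as heavily.
combined = combine_forecasts([15500, 14500], [3, 1])
print(combined)
```

In practice the weights would be tuned on the training data, for example by searching for the values that minimize the RMSE.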

Computation of GTS(𝑀, 𝑁) on Forecasting Enrollments.
Since most conventional models have been demonstrated on the historical enrolments of the University of Alabama, this section presents the stepwise procedure of the proposed method for forecasting this time series with  = 3 and  = 2. The historical enrolments of the University of Alabama from 1971 to 1992 are shown in Table 1.
According to the second principle listed in Section 3.1,  1 (1976) is then  3 , that is, 15500, which is the midpoint of u_3.

Data Description.
To demonstrate the effectiveness of the proposed models, a sufficient amount of data is needed. Here, the enrolments and the SSECI are used as the illustrative data sets for the empirical analysis.
The two time series were chosen as the subjects of our experiment for the following reasons. There are two reasons for choosing the enrolment data of the University of Alabama. First, most fuzzy time series studies have used this well-known time series in their experiments, so many studies are available for reference. Second, it is simple and makes it easy to display the process of the proposed model. Since time series models have been used for stock price forecasting for many years, the daily SSECI covering the period from 1997 to 2006 is adopted for further experiments.
For the first data set, the seven intervals have a fixed length of 1000. All of the forecasting results, from order one to order ten, are compared with those of the conventional fuzzy time series models based on fuzzy logical relationships.
To discuss the effects of the parameters M and N, the order number and the hierarchies of fuzzy logical relationships, on the forecasting results, models of order one to ten (time-lag periods) are run on the second data set with ten different interval lengths, that is, 30, 60, 90, . . ., 300. The forecasting results obtained by all of the high-order models are compared in terms of the three evaluation criteria reviewed in the following subsection.

Criteria of Evaluation.
In statistics, the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are three typical ways to quantify the difference between the values implied by an estimator and the actual values of the quantity being estimated. The mean squared error (MSE) is a risk function that measures the average of the squared differences. For an unbiased estimator, the MSE is the variance, and the RMSE is the square root of the variance, known as the standard error. Furthermore, RMSE is preferable to MSE because it has the same scale as the forecasts. Thus, we take RMSE as the first measure of the size of an average error. As an average of the absolute percentage errors, MAPE serves as a further criterion for comparing forecasting results in this paper. Comparisons of the accuracy of the proposed models with other models are made on the basis of these three criteria.
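For reference, the three criteria can be computed as in the following short Python sketch, which uses the standard textbook definitions. The function names and the sample numbers are illustrative and are not taken from the experiments.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values nonzero)."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only.
actual = [100, 200, 400]
forecast = [110, 190, 380]
print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

RMSE and MAE are expressed in the units of the data, whereas MAPE is scale-free, which makes it convenient when comparing the enrolment and stock index experiments.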

Performance Evaluation.
In Table 4, this study compares the RMSE, MAE, and MAPE of the proposed method and the counterparts [28,29,31] on the enrolment experiment. Table 4 shows that the proposed method obtains smaller RMSE, MAE, and MAPE than Hwang's model [28], and also smaller than Chen's 2011 model [31] in most cases. One shortcoming is that the proposed model does not always achieve a higher average forecasting accuracy rate than the 2002 model [29]; nevertheless, its results improve as the order increases. In summary, the results suggest that the proposed model obtains better forecasts as the orders and hierarchies increase. The three counterparts do not show this trend. Moreover, we also apply the proposed method to forecasting the close price of the Shanghai stock index of 2003 with L = 60. The comparison of the three criteria is listed in Table 5. From this table, we can see that the proposed GTS(M, N) gives the best forecasts among the compared models when M > 5 and N > 2, and performs better than Hwang's and Chen's models [28,29] in all cases, so the shortcoming shown in Table 4 disappears. On the whole, there is a "law" that forecasting errors are reduced when M or N is increased in the proposed model. However, this trend is not evident for the three counterparts in Tables 4 and 5. In Table 4, the three evaluation criteria of Hwang's and Chen's models [28,31] are decreasing while those of model [29] are increasing. In contrast, in Table 5 the three evaluation criteria of Hwang's and Chen's models are increasing while those of Chen's model [29] are decreasing. The use of different data sets seems to be the most plausible explanation for this contradiction. From these discussions, we conclude that the proposed model's performance is more reasonable and robust than that of the three counterparts. Tables 4 and 5 show that GTS(M, 3), GTS(M, 4), and GTS(M, 5) have the same results, although a larger M or N more easily yields smaller forecasting errors. In fact, this trend is affected by the definitions of the fuzzy sets and the membership function. As an in-depth analysis, Figure 1 shows three triangular membership functions. The RMSEs of the forecasts of 2003 by GTS(M, N) with L = 30 and the three membership functions are listed in Table 6. The table shows that the more intervals are involved in the membership function, the more the forecasting accuracy differs. This is more obvious when  is increasing. Another conclusion is that the forecasts of the third membership function are not always better than those of the first.
To further investigate the relation between the parameters  and the length of interval, the proposed model has been applied to forecasting stock index close prices covering the ten years from 1997 to 2006 with  = 1, 2, . . ., 50 and L = 30, 60, . . ., 300. Since Tables 4 and 5 confirm that MAE and MAPE behave similarly to RMSE, we list only the RMSE comparison for these experiments. Some examples of actual values and forecasts of 2003 are depicted in Figures 2 and 3. Figure 2 shows that GTS(3, 1) performs better than GTS(1, 1) for the same length of intervals, that is, L = 30. Figure 3 illustrates that GTS(1, 1) gives better forecasts when the lengths of intervals are small. Further properties are shown in Figures 4 and 5, which depict the mean forecasting errors over the ten years. Figure 4 shows the relation between the RMSEs and the lengths of intervals for different orders, and Figure 5 shows the relation between the RMSEs and the orders for different lengths of intervals.
From Figure 4, it is clear that the longer the interval, the larger the RMSE, and that higher-order models perform better than lower-order ones. This conclusion is also confirmed by Figure 5, which conveys another important message: shorter interval lengths lead to more robust forecasts. Overall, these conclusions are important when applying the proposed model to other data sets or areas.

Conclusion
After discussing the high-order fuzzy time series models and presenting the definition and operation for the generalized fuzzy logical relationship, we have proposed a novel high-order fuzzy time series model based on the new relationship. The work is driven by three main reasons. Firstly, advanced fuzzy time series models call for a generalization of the fuzzy logical relationship. Secondly, the aim is to abstract the relationship matrices from the time series and find the patterns of time series fluctuations based on understandable fuzzy rules. The last is to enable the fuzzy time series model to explain more complex relationships.
Using the enrolment of the University of Alabama and the close price of the Shanghai Stock Exchange Composite Index as data sets for evaluating the models, the experimental results give two conclusions: (1) the performance of GTS(M, N) is more reasonable than that of the three conventional fuzzy time series models proposed earlier by Hwang et al. [28] and Chen and Chen [29,31]; (2) the number of orders and the hierarchies of the principal fuzzy logical relationship affect the forecasting result slightly. The higher the order, the better the forecasting results; the more hierarchies of principal fuzzy logical relationships, the smaller the forecast error, although the error does not decrease indefinitely.
For future research, some suggestions are offered to improve on this paper. The relation between the principal fuzzy relationships and the conventional fuzzy relationships needs to be discussed further. For example, what effect is produced by changing the definition of the membership function and the operations of the principal fuzzy logical relationship, and how large is this effect? Since the proposed model is based on a generalized fuzzy logical relationship, it is also worth studying how to improve the model by hybridizing it with advanced algorithms.

Figure 1: Three membership functions for forecasting SHI of 2003.

Figure 2: Actual values and forecasts of 2003 with the same L.

Figure 3: Actual values and forecasts of 2003 with different L.

Figure 4: Comparison of RMSEs with different lengths of intervals.

Figure 5: Comparison of RMSEs with different orders of the model.

Table 1: Membership degrees of enrollment with respect to fuzzy sets.

Table 3: Forecasts of enrollment year actual data.

Table 4: Comparison of forecasting enrollment.
The bold data means the minimum error of the models with the same order.

Table 6: Effects of membership function.