A Hybrid Fuzzy Time Series Approach Based on Fuzzy Clustering and Artificial Neural Network with Single Multiplicative Neuron Model

Particularly in recent years, artificial intelligence optimization techniques have been used to make fuzzy time series approaches more systematic and to improve forecasting performance. Besides, some fuzzy clustering methods and artificial neural networks with different structures are used in the fuzzification of observations and the determination of fuzzy relationships, respectively. In approaches that consider the membership values, either the membership values are determined subjectively or the fuzzy outputs of the system are obtained by assuming that there is a relation between membership values in the identification of the relation. This necessitates a defuzzification step and increases the model error. In this study, membership values were obtained more systematically by using the Gustafson-Kessel fuzzy clustering technique. The use of an artificial neural network with single multiplicative neuron model in the identification of the fuzzy relation eliminated the architecture selection problem as well as the need for a defuzzification step, because the target values are constituted from real observations of the time series. The training of the artificial neural network with single multiplicative neuron model, which is used in the identification of the fuzzy relation, is carried out with particle swarm optimization. The proposed method is applied to various time series and the results are compared with those of previous studies to demonstrate its performance.


Introduction
Nowadays, making predictions about the future is vitally important for planning and strategy formulation. This can be realized by accurate and realistic analysis of the information and data that have accumulated from past to present. Different approaches, namely, stochastic and nonstochastic approaches, have been proposed in the literature for the analysis of time series. The use of nonstochastic models, such as the fuzzy time series approach, has become widespread. In some cases, expressing the observations of a time series by linguistic values or fuzzy sets is more realistic. Such time series are called fuzzy time series, and they should be analysed with fuzzy time series methods rather than traditional ones. In recent years, to analyse nonlinear time series such as 1/f noise time series, Li et al. [1], Li et al. [2], and Li and Zhao [3] presented different approaches which are expressed as stochastic models. In addition, Li et al. [4] stated that a sufficient condition for a 1/f noise type time series to be predictable is that the variance of its prediction errors exists, and described some challenges in the prediction of 1/f noise type time series. The main advantage of fuzzy time series approaches is that they do not need the assumptions that stochastic models do. In particular, since fuzzy time series methods need neither a linear model assumption nor a probability distribution assumption, they can be used effectively to analyse the nonlinear time series that are frequently encountered in real-world problems.
The concept of fuzzy time series was first introduced by Song and Chissom [5] based on the fuzzy set theory proposed by Zadeh [6]. Fuzzy time series can be evaluated under two main headings: time-variant and time-invariant. Song and Chissom [5] reported that the internal relations of a fuzzy time series are supposed to change over time in time-variant fuzzy time series but not in time-invariant ones.

Mathematical Problems in Engineering
Song and Chissom [7] proposed an algorithm for the solution of time-invariant fuzzy time series, which are the subject of almost all the studies in the literature. As the subject of this study is time-invariant fuzzy time series, in the remainder of the paper the term "fuzzy time series" will be used instead of "time-invariant fuzzy time series." As in fuzzy inference systems, fuzzy time series forecasting models consist of three steps, namely fuzzification, identification of the fuzzy relation, and defuzzification, each of which influences the forecasting performance of the method. Many researchers have carried out studies using different approaches on these three steps.
Until recently, the fuzzification step relied on partitioning the universe of discourse. Song and Chissom [5,7,8] and Chen [9,10] fixed the interval lengths arbitrarily, whereas Huarng [11] used average-based and distribution-based methods and Egrioglu et al. [12] used optimization-based methods. In addition, for the analysis of time series containing a trend, a ratio-based length of intervals was proposed by Huarng and Yu [13]. Furthermore, Yolcu et al. [14] proposed a new approach that uses single-variable constrained optimization to determine the ratio for the length of intervals, which change in time, in the partition of the universe of discourse. More recently, Kuo et al. [15,16], Davari et al. [17], Park et al. [18], Hsu et al. [19], and Egrioglu et al. [20] used particle swarm optimization, whereas Chen and Chung [21] and Lee et al. [22,23] proposed methods using genetic algorithms for the determination of the changing length of intervals.
Although subjective judgments about the intervals are avoided in these studies using optimization techniques, membership values are still determined subjectively and not all membership values are considered. The problem of subjectively determined membership values may be eliminated by using fuzzy clustering techniques. In this regard, Cheng et al. [24], Li et al. [25], Aladag et al. [26], Alpaslan et al. [27], Egrioglu et al. [12,28], and Alpaslan and Cagcag [29] eliminated this problem by using the fuzzy C-means (FCM) or Gustafson-Kessel fuzzy clustering techniques.
Identification of the fuzzy relation is the step in which the appropriate model is determined. Therefore, this step plays the most important role in forecasting performance. In this stage, Song and Chissom [5,7,8] used a fuzzy relation matrix and represented the fuzzy logic relations with only one matrix. Sullivan and Woodall [30] used transition matrices based on Markov chains instead of a fuzzy logic relation matrix. Chen [9] proposed a simpler approach using fuzzy logic group relationship tables, arguing that matrix calculations are based on complex processes. The approach proposed by Chen [9] is the most commonly used approach in the literature. Huarng and Yu [31] proposed a first-order fuzzy time series approach which uses feedforward neural networks (FFANN) in this step. Aladag et al. [32] developed the approach proposed by Huarng and Yu [31] and proposed a high-order fuzzy time series forecasting model which uses FFANN in the determination of fuzzy relations. In all of these approaches, when determining the fuzzy relations representing the internal relation of the fuzzy time series, only the fuzzy set having the highest membership value was considered and the membership values themselves were ignored. Although Yu and Huarng [33] proposed an approach which considers the membership values, their approach determines the membership values subjectively. Alpaslan et al. [27] and Yolcu et al. [34] used the FCM technique instead of determining the membership values subjectively. The use of ANN in the identification of fuzzy relations has advantages as well as disadvantages. Determining the number of units in the hidden layer (the architecture structure) and the excessive number of parameters used during the analysis are the most prominent disadvantages. Although Aladag [35] eliminated this problem by using an artificial neural network with single multiplicative neuron model (SMNM-ANN) in the determination of fuzzy relations, membership values were not considered. Moreover, as the system output of these approaches consists of fuzzy set numbers or membership values, a defuzzification step is necessary. This may be a factor that increases the model error. An approach that does not require a defuzzification step would eliminate the forecasting error that may occur in this step and improve the performance of the method.
Almost all approaches proposed in the literature focus on the autoregressive (AR) model; in other words, these approaches assume that the time series is affected only by its own lagged variables. However, there are also approaches which include the autoregressive moving average (ARMA) model, such as the method proposed by Egrioglu et al. [20], and the seasonal autoregressive integrated moving average (SARIMA) model, such as the methods proposed by Egrioglu et al. [36], Uslu et al. [37], Aladag et al. [38], and Alpaslan et al. [27].
The proposed method uses the Gustafson-Kessel fuzzy clustering technique in the fuzzification step, so membership values are obtained more systematically. The use of SMNM-ANN in the identification of fuzzy relations eliminates the architecture selection problem and the need for a defuzzification step, because the target values are constituted from real observations of the time series. The training of the SMNM-ANN used in the determination of fuzzy relations is carried out with particle swarm optimization. The proposed method comprises a first-order fuzzy time series model and can be referred to as an AR model. The main differences of the proposed method from previous studies are that it needs neither a defuzzification stage nor the identification of an ANN architecture.
The rest of this paper is organized as follows. In Section 2, the basic concepts of fuzzy time series are briefly reviewed. In Section 3, PSO, the Gustafson-Kessel fuzzy clustering technique, and SMNM-ANN are briefly presented under the main heading of related methods. In Section 4, we introduce the new hybrid fuzzy time series method. In Section 5, we apply the proposed method to different time series and compare its forecasts with those of existing methods. In the last section, the conclusions are discussed.

Fuzzy Time Series
Fuzzy time series were first introduced by Song and Chissom [5]. The definitions of fuzzy time series and of time-variant and time-invariant fuzzy time series follow Song and Chissom [5].
Song and Chissom [7] first introduced an algorithm based on the first-order model for forecasting time-invariant $F(t)$. In Song and Chissom's work [7], the fuzzy relationship matrix $R(t, t-1) = R$ is obtained by many matrix operations. The fuzzy forecasts are obtained based on max-min composition as follows:

$$F(t) = F(t-1) \circ R.$$

The dimension of the matrix $R$ depends on the number of fuzzy sets, which equals the number of partitions of the universe of discourse. If we want to use more fuzzy sets, more matrix operations are needed to obtain $R$.
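The max-min composition can be illustrated with a short sketch. The function name, the 3 x 3 relation matrix, and the membership vector below are hypothetical and chosen only for illustration; they are not taken from Song and Chissom [7].

```python
def max_min_composition(f_prev, relation):
    """Compute F(t) = F(t-1) o R with the max-min operator.

    f_prev   : list of c membership degrees describing F(t-1)
    relation : c x c fuzzy relation matrix R, given as a list of rows
    """
    n_cols = len(relation[0])
    return [
        # For each output set j: take the min along each path, then the max.
        max(min(f_prev[i], relation[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]


# Hypothetical example with three fuzzy sets (illustrative numbers only).
R = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
F_prev = [0.2, 1.0, 0.4]
F_next = max_min_composition(F_prev, R)
print(F_next)  # [0.5, 1.0, 0.5]
```

The size of the nested loop grows with the square of the number of fuzzy sets, which is one way to see why a larger partition makes the matrix-based approach more expensive.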

Particle Swarm Optimization (PSO).
Particle swarm optimization, which is a population-based heuristic algorithm, was first proposed by Eberhart and Kennedy [39]. The distinguishing feature of this heuristic algorithm is that it simultaneously examines different points in different regions of the solution space to find the global optimum solution. Because of this feature, local optimum traps can be avoided.
In the literature, it was shown that using time-varying parameters can increase the convergence speed of the algorithm. Ma et al. [40] employed a time-varying acceleration coefficient in the standard particle swarm optimization method. In another study, Shi and Eberhart [41] used a time-varying inertia weight. In the modified particle swarm optimization, these time-varying constituents are used together. This is the only difference between the standard and modified particle swarm optimization methods.
Algorithm 4. The modified particle swarm optimization.
Step 1. The positions of each $k$th ($k = 1, 2, \ldots, pn$) particle are randomly determined and kept in a vector $X_k$ given as follows:

$$X_k = \{x_{k,1}, x_{k,2}, \ldots, x_{k,d}\},$$

where $x_{k,j}$ ($j = 1, 2, \ldots, d$) represents the $j$th position of the $k$th particle. $pn$ and $d$ represent the numbers of particles in the swarm and positions, respectively.
Step 2. Velocities are randomly determined and stored in a vector $V_k$ as follows:

$$V_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,d}\}.$$

Step 3. According to the evaluation function, the $Pbest_k$ and $Gbest$ particles given in (4), respectively, are determined:

$$Pbest_k = (p_{k,1}, p_{k,2}, \ldots, p_{k,d}), \qquad Gbest = (p_{g,1}, p_{g,2}, \ldots, p_{g,d}), \tag{4}$$

where $Pbest_k$ is a vector which stores the positions corresponding to the $k$th particle's best individual performance and $Gbest$ represents the best particle, which has the best evaluation function value found so far.
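Step 4. Let $c_1$ and $c_2$ represent the cognitive and social coefficients, respectively, and let $w$ be the inertia parameter. Let $(c_{1i}, c_{1f})$, $(c_{2i}, c_{2f})$, and $(w_1, w_2)$ be the intervals which include the possible values for $c_1$, $c_2$, and $w$, respectively. At each iteration, these parameters are calculated by using the following formulas:

$$c_1 = (c_{1f} - c_{1i}) \frac{t}{maxt} + c_{1i},$$

$$c_2 = (c_{2f} - c_{2i}) \frac{t}{maxt} + c_{2i},$$

$$w = (w_2 - w_1) \frac{maxt - t}{maxt} + w_1,$$

where $maxt$ and $t$ represent the maximum iteration number and the current iteration number, respectively.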
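Step 5. The values of the velocities and positions are updated by using the following formulas:

$$v_{k,j}^{t+1} = w\, v_{k,j}^{t} + c_1\, \mathrm{rand}_1 \left(p_{k,j} - x_{k,j}^{t}\right) + c_2\, \mathrm{rand}_2 \left(p_{g,j} - x_{k,j}^{t}\right),$$

$$x_{k,j}^{t+1} = x_{k,j}^{t} + v_{k,j}^{t+1},$$

where $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random values from the interval $[0, 1]$.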
Step 6. Steps 3 to 5 are repeated until a predetermined maximum iteration number ($maxt$) is reached.
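The six steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear schedules for the coefficients, the velocity clamping to `[-v_max, v_max]`, the default parameter values, and the `sphere` test function are all assumptions made for the example.

```python
import random


def modified_pso(evaluate, dim, n_particles=20, max_iter=100,
                 c1_range=(2.0, 3.0), c2_range=(2.0, 3.0),
                 w_range=(0.4, 0.9), v_max=1.0, seed=0):
    """Minimise `evaluate` with time-varying c1, c2 and inertia weight w."""
    rng = random.Random(seed)
    # Steps 1-2: random initial positions and velocities.
    pos = [[rng.uniform(0.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)]
           for _ in range(n_particles)]
    # Step 3: personal bests and the global best.
    pbest = [p[:] for p in pos]
    pbest_val = [evaluate(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(1, max_iter + 1):
        # Step 4: linear time-varying coefficients (assumed schedule).
        c1 = (c1_range[1] - c1_range[0]) * t / max_iter + c1_range[0]
        c2 = (c2_range[1] - c2_range[0]) * t / max_iter + c2_range[0]
        w = (w_range[1] - w_range[0]) * (max_iter - t) / max_iter + w_range[0]
        # Step 5: velocity and position updates.
        for k in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[k][j]
                     + c1 * r1 * (pbest[k][j] - pos[k][j])
                     + c2 * r2 * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-v_max, min(v_max, v))  # assumed clamping
                pos[k][j] += vel[k][j]
        # Step 3 again: refresh personal and global bests.
        for k in range(n_particles):
            val = evaluate(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
            if pbest_val[k] < gbest_val:
                gbest, gbest_val = pbest[k][:], pbest_val[k]
    return gbest, gbest_val


def sphere(x):
    """Hypothetical evaluation function used only for this demonstration."""
    return sum(v * v for v in x)


best, best_val = modified_pso(sphere, dim=4)
print(best_val)
```

Fixing the seed makes the run reproducible, which is convenient when comparing parameter settings; a real study would normally repeat the search with several seeds.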

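The Gustafson-Kessel Fuzzy Clustering Technique.
The Gustafson-Kessel fuzzy clustering algorithm was first proposed by Gustafson and Kessel [42]. Let $\Sigma_i$ be the covariance matrix of the $i$th cluster, $v_i$ the center of the $i$th cluster, $u_{ij}$ the membership degree of the $j$th observation $x_j$ in the $i$th cluster, and $\beta$ the fuzziness index. For the $i$th cluster, its associated Mahalanobis distance is defined as

$$d^2(x_j, v_i) = (x_j - v_i)^T \left[\det(\Sigma_i)^{1/p}\, \Sigma_i^{-1}\right] (x_j - v_i),$$

where $p$ is the dimension of the data. The covariance matrices are computed as follows:

$$\Sigma_i = \frac{\sum_{j=1}^{n} u_{ij}^{\beta}\, (x_j - v_i)(x_j - v_i)^T}{\sum_{j=1}^{n} u_{ij}^{\beta}}.$$

The objective function is defined as

$$J(X, V, \Sigma, U) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{\beta}\, d^2(x_j, v_i).$$

The objective function $J(X, V, \Sigma, U)$ is then minimized under the following constraints:

$$0 \le u_{ij} \le 1, \qquad \sum_{i=1}^{c} u_{ij} = 1, \qquad 0 < \sum_{j=1}^{n} u_{ij} < n.$$

In this minimization problem, the center $v_i$ and the membership degrees $u_{ij}$ are updated according to the expressions given below:

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{\beta}\, x_j}{\sum_{j=1}^{n} u_{ij}^{\beta}},$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{d(x_j, v_i)}{d(x_j, v_k)} \right)^{2/(\beta - 1)}}. \tag{12}$$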
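One pass of the membership and center updates can be sketched as below. The sketch assumes a univariate series and the standard volume-normalised Gustafson-Kessel norm; under that assumption the norm matrix reduces to 1 in one dimension, so the squared distance is simply $(x_j - v_i)^2$. The function names, the starting centers, and the choice $\beta = 2$ are hypothetical.

```python
def update_memberships(sq_dist, beta=2.0):
    """Membership update; sq_dist[i][j] is the squared distance of
    observation j to cluster i."""
    c, n = len(sq_dist), len(sq_dist[0])
    # Exponent 1/(beta-1) on squared distances equals 2/(beta-1) on distances.
    power = 1.0 / (beta - 1.0)
    u = [[0.0] * n for _ in range(c)]
    for j in range(n):
        # An observation that coincides with a center gets full membership.
        hits = [i for i in range(c) if sq_dist[i][j] == 0.0]
        if hits:
            for i in hits:
                u[i][j] = 1.0 / len(hits)
            continue
        for i in range(c):
            u[i][j] = 1.0 / sum(
                (sq_dist[i][j] / sq_dist[k][j]) ** power for k in range(c)
            )
    return u


def update_centers(observations, u, beta=2.0):
    """Center update: membership-weighted mean of the observations."""
    centers = []
    for row in u:
        weights = [m ** beta for m in row]
        total = sum(w * x for w, x in zip(weights, observations))
        centers.append(total / sum(weights))
    return centers


# The eight observations of the worked example in the text; the starting
# centers are hypothetical.
observations = [20, 30, 40, 30, 20, 50, 60, 80]
centers = [25.0, 50.0, 80.0]
sq_dist = [[(x - v) ** 2 for x in observations] for v in centers]
u = update_memberships(sq_dist)
new_centers = update_centers(observations, u)
print(new_centers)
```

In a full run these two updates would alternate, together with the covariance update in the multivariate case, until the memberships stop changing.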

Single Multiplicative Neuron Model.
In the neurons of feedforward neural networks, the input signal is computed with an addition function. Yadav et al. [43] proposed a single multiplicative neuron model, in which the input signal of the neuron is computed with a multiplication function. Yadav et al. [43] showed that the single multiplicative neuron model gives better forecasting performance for time series forecasting. Zhao and Yang [44] recommended the use of PSO instead of the backpropagation learning algorithm proposed by Yadav et al. [43] in the training of the single multiplicative neuron model. The structure of the single multiplicative neuron model for 5 inputs is given in Figure 1. This model has a single neuron and, unlike in a feedforward neural network, the signals coming into the neuron are multiplied. The function $\Omega(x, \theta)$ is the product of the weighted inputs. The multiplicative neuron model with five inputs ($x_i$, $i = 1, 2, \ldots, 5$) given in Figure 1 has 10 parameters. Of these, five are the weights corresponding to the inputs ($w_i$, $i = 1, 2, \ldots, 5$) and five are the biases ($b_i$, $i = 1, 2, \ldots, 5$). Suppose that the activation function is taken as the logistic function given below:

$$f(x) = \frac{1}{1 + e^{-x}}.$$

In this case, the net value of the neuron is obtained as follows:

$$\mathrm{net} = \Omega(x, \theta) = \prod_{i=1}^{5} (w_i x_i + b_i).$$

Thus, as the net value passes through the activation function, the output of the neuron is obtained as $y = f(\mathrm{net})$. The fitness function used during the training of the multiplicative neuron model with PSO can be taken as the sum of squared differences between the target values and the output values over all learning samples:

$$E = \sum_{p=1}^{n} (d_p - y_p)^2,$$

where $d_p$ and $y_p$ represent the target value and the output of the network corresponding to the $p$th learning sample.
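The forward pass of the neuron and the sum-of-squares criterion can be sketched as follows. The function names are hypothetical, and the numerically stable form of the logistic function is an implementation choice made for the example rather than something stated in the source.

```python
import math


def logistic(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smn_output(x, weights, biases):
    """Single multiplicative neuron: product of (w_i * x_i + b_i),
    passed through the activation function."""
    net = 1.0
    for xi, wi, bi in zip(x, weights, biases):
        net *= wi * xi + bi
    return logistic(net)


def sum_of_squares(samples, targets, weights, biases):
    """Sum of squared differences between targets and neuron outputs."""
    return sum(
        (d - smn_output(x, weights, biases)) ** 2
        for x, d in zip(samples, targets)
    )


# Hypothetical five-input example (illustrative numbers only).
x = [0.1, 0.2, 0.3, 0.4, 0.5]
w = [1.0, 1.0, 1.0, 1.0, 1.0]
b = [0.5, 0.5, 0.5, 0.5, 0.5]
y = smn_output(x, w, b)
print(y)
```

Because the neuron has exactly two parameters per input, the number of quantities to tune is fixed once the number of inputs is known, which is the property the text refers to as removing the architecture selection problem.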

Proposed Method
In fuzzy time series approaches, each stage plays a decisive role in the forecasting performance of the method. Many studies on these steps have been conducted in the literature.
Alongside more systematic approaches in the fuzzification step, the flexible and superior calculation abilities of ANN have recently been widely used in the identification of the fuzzy relation. These studies have advantages, but also disadvantages such as the determination of the number of units in the hidden layer (the architecture structure), the subjective identification of membership values, and the excessive number of parameters used during the analysis. This study aims to propose a model which is free from all these problems. In the proposed method, membership values are obtained more systematically by using the Gustafson-Kessel fuzzy clustering technique in the fuzzification step. The use of SMNM-ANN in the identification of the fuzzy relation eliminates the architecture selection problem and the need for a defuzzification step, because the target values are constituted from real observations of the time series; thus, the forecasting performance of the method is improved. The training of the artificial neural network with single multiplicative neuron model, which is used in the identification of fuzzy relations, is carried out with particle swarm optimization. The main advantages of the proposed method can be summarized as follows.
(i) With the use of a fuzzy clustering method in the fuzzification step, subjective judgments are no longer needed.
The algorithm of the proposed method is given below in steps.
Step 1. The Gustafson-Kessel algorithm is applied to the crisp time series with $c$ fuzzy sets, where $2 \le c \le n$ and $n$ is the number of observations. The centers of the fuzzy sets and the membership degrees of every observation with respect to these centers are obtained. Finally, the ordered fuzzy sets $L_r$, $r = 1, 2, \ldots, c$, are obtained according to the ascending order of the centers, which are denoted by $v_r$, $r = 1, 2, \ldots, c$.
For better understanding, consider a time series with 8 observations: 20, 30, 40, 30, 20, 50, 60, and 80. Let $c$, the number of fuzzy sets, be 3. When the Gustafson-Kessel method is applied to these data, the centers of the fuzzy sets and the membership degrees of each observation, which denote the degree to which that observation belongs to the related fuzzy set, are as given in Table 1. According to Table 1, the degree to which the first observation ($t = 1$) belongs to the second fuzzy set ($L_2$) is $\mu_{L_2}(X(1)) = 0.0293$.
Step 2. Define the fuzzy relationship with SMNM-ANN.
The number of inputs of the SMNM-ANN used for determining fuzzy relationships is equal to the number of fuzzy sets ($c$). The architecture of the network is shown in Figure 2.
In Figure 2, $\mu_{L_i}(X(t-1))$ denotes the degree to which the observation $X(t-1)$ of the time series belongs to the $i$th fuzzy set. The target values of the SMNM-ANN are the real observations of the time series at time $t$, while the inputs of the network are the membership degrees of the observation at time $t-1$ in each of the $c$ fuzzy sets.
For example, suppose that we consider the time series given in Table 1.When we defined the architectural structure as given in Figure 2, the input and the targets of ANN would be as in Table 2.
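The pairing of lagged membership degrees with current observations can be sketched as below. The helper name is hypothetical, and the membership rows are made-up placeholders, since the values of Table 1 are not reproduced here; only the first four observations of the example series are used.

```python
def build_training_pairs(memberships, series):
    """Pair the membership degrees of X(t-1) with the target X(t).

    memberships : one row per observation, each row holding c degrees
    series      : the crisp observations X(1), ..., X(n)
    """
    if len(memberships) != len(series):
        raise ValueError("one membership row is needed per observation")
    inputs = [list(row) for row in memberships[:-1]]  # rows for t-1
    targets = list(series[1:])                        # observations at t
    return inputs, targets


# First four observations of the example series, with hypothetical
# membership degrees for c = 3 fuzzy sets (each row sums to 1).
series = [20, 30, 40, 30]
memberships = [
    [0.95, 0.03, 0.02],
    [0.90, 0.08, 0.02],
    [0.30, 0.65, 0.05],
    [0.90, 0.08, 0.02],
]
inputs, targets = build_training_pairs(memberships, series)
print(inputs[0], targets[0])
```

With $n$ observations this yields $n - 1$ learning samples, because the first observation has no predecessor and the last one is never used as an input.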
The function $\Omega$ is the product of the weighted inputs and is obtained by (16), where $f$ is the activation function and $\hat{X}(t)$ is the output of the model. The output of the model is calculated as in (17):

$$\Omega = \prod_{i=1}^{c} \left(w_i\, \mu_{L_i}(X(t-1)) + b_i\right), \tag{16}$$

$$\hat{X}(t) = f(\Omega). \tag{17}$$

In the case where the number of fuzzy sets defined for the fuzzification process is $c$, there are $2 \times c$ variables to be optimized by PSO. The positions of these variables for a particle can be shown as in Figure 3, where $w_i$, $i = 1, 2, \ldots, c$, and $b_i$, $i = 1, 2, \ldots, c$, are the weights and biases of the SMNM-ANN, respectively.
The training of the SMNM-ANN given in Figure 2 is carried out via PSO with the following substeps.

Step 2.2. The starting positions of the variables to be optimized by PSO are randomly generated. The positions and velocities of each $k$th ($k = 1, 2, \ldots, pn$) particle are randomly determined and kept in vectors $X_k$ and $V_k$ given as follows:

$$X_k = \{x_{k,1}, x_{k,2}, \ldots, x_{k,d}\}, \qquad V_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,d}\},$$

where $x_{k,j}$ ($j = 1, 2, \ldots, 2 \times c$) represents the $j$th position of the $k$th particle, corresponding to the weights and biases of the SMNM-ANN. $pn$ and $d = 2 \times c$ represent the number of particles in the swarm and the number of positions, respectively. The initial positions and velocities of each particle in the swarm are randomly generated from the uniform distributions on $(0, 1)$ and $(-vm, vm)$, respectively.
Step 2.3. The evaluation function values for each particle are computed. The root mean square error (RMSE) given below is used as the evaluation function:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left(X(t) - \hat{X}(t)\right)^2},$$

where $n$ represents the number of learning samples for the SMNM-ANN, and $X(t)$ and $\hat{X}(t)$ are the real observation and the forecast of the time series at time $t$, respectively.
Step 2.4. $Pbest_k$ ($k = 1, 2, \ldots, pn$) and $Gbest$ are determined according to the evaluation function values calculated in the previous step. $Pbest_k$ is a vector that stores the positions corresponding to the $k$th particle's best individual performance, and $Gbest$ is the best particle, which has the best evaluation function value found so far:

$$Pbest_k = (p_{k,1}, p_{k,2}, \ldots, p_{k,d}), \qquad Gbest = (p_{g,1}, p_{g,2}, \ldots, p_{g,d}).$$
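Step 2.5. New values of the positions and velocities are calculated for each particle by using the following formulas:

$$v_{k,j}^{t+1} = w\, v_{k,j}^{t} + c_1\, \mathrm{rand}_1 \left(p_{k,j} - x_{k,j}^{t}\right) + c_2\, \mathrm{rand}_2 \left(p_{g,j} - x_{k,j}^{t}\right),$$

$$x_{k,j}^{t+1} = x_{k,j}^{t} + v_{k,j}^{t+1},$$

where $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are randomly generated from the uniform distribution on $(0, 1)$.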
Steps 2.1-2.5 are repeated until the maximum number of iterations is reached. Finally, the elements of $Gbest$ are taken as the optimal solution.
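The evaluation of a single particle in the substeps above can be sketched as follows. The layout of the particle (first the $c$ weights, then the $c$ biases) follows the description of Figure 3; the function names are hypothetical, and the sketch assumes a logistic activation with targets already scaled to the interval (0, 1), which the source does not state.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def _logistic(z):
    """Overflow-safe logistic function (assumed activation)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def particle_fitness(particle, inputs, targets):
    """Decode a particle into c weights followed by c biases, run the
    multiplicative neuron on every input row, and return the RMSE."""
    c = len(particle) // 2
    weights, biases = particle[:c], particle[c:]
    forecasts = []
    for memberships in inputs:
        omega = 1.0
        for m, w, b in zip(memberships, weights, biases):
            omega *= w * m + b
        forecasts.append(_logistic(omega))
    return rmse(targets, forecasts)


# Hypothetical data: two fuzzy sets, two learning samples scaled to (0, 1).
inputs = [[0.3, 0.7], [0.8, 0.2]]
targets = [0.55, 0.60]
particle = [0.5, 0.5, 0.1, 0.1]  # w_1, w_2, b_1, b_2
print(particle_fitness(particle, inputs, targets))
```

A function of this shape is what a PSO routine would call once per particle and per iteration, so its cost is proportional to the number of learning samples times the number of fuzzy sets.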

Applications
The proposed method was applied to five different time series, namely, the Taiwan stock index (TAIEX) in the years 2000, 2001, 2002, 2003, and 2004. In the analysis of TAIEX, we used the observations of the last three months as the out-of-sample observations (test data). Therefore, we carried out five different analyses to evaluate the performance of the proposed method.
In the implementation of the proposed method, a new time series constituted from the first-order differences of the original time series, rather than the original time series itself, was used, as in Yu and Huarng's study [33]. The creation of the new time series can be summarized as follows.
First, the differences between every two consecutive observations at $t$ and $t-1$ are obtained:

$$d(t) = X(t) - X(t-1).$$

The differences may turn out to be negative. To ensure that all the universes of discourse are positive, we add different positive constants to the differences for different years:

$$d'(t) = d(t) + \mathrm{const}.$$

For the year 2004, the minimum of all the differences is −455.17. Hence, 500 is considered to be appropriate as the constant for the year 2004:

$$d'(t) = d(t) + 500.$$

Moreover, the outputs of the SMNM-ANN are the forecasts of the next difference. For example, when the forecasted difference between 10/4 and 10/5 is obtained as ... $(2, 3)$, $(w_1, w_2) = (0.4, 0.9)$, and $vm = 10$. The RMSE criterion was used to evaluate the results obtained from the analyses and to compare them with those of the other methods in the literature.
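The differencing step can be sketched as follows. The function names and the three index values are hypothetical; the values were chosen only so that the second difference equals the minimum of −455.17 quoted for 2004. The back-transformation in `forecast_level` is an assumption about how a forecasted difference is turned into a forecast of the index, which the text implies but does not spell out.

```python
def to_shifted_differences(series, constant):
    """First-order differences shifted by a positive constant so that
    every value of the new series is positive."""
    return [series[t] - series[t - 1] + constant
            for t in range(1, len(series))]


def forecast_level(previous_observation, forecast_shifted_difference, constant):
    """Assumed back-transformation: remove the constant and add the
    previous observation to recover a forecast of the original series."""
    return previous_observation + forecast_shifted_difference - constant


# Hypothetical index values (not actual TAIEX observations).
series = [5900.0, 5945.5, 5490.33]
shifted = to_shifted_differences(series, 500.0)
recovered = forecast_level(series[0], shifted[0], 500.0)
print(shifted, recovered)
```

Any constant larger than the absolute value of the most negative difference keeps the shifted series positive; the choice of 500 in the text is one such value for 2004.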
The optimal results are obtained from nine, thirteen, six, seven, and five fuzzy sets for TAIEX data of years 2000, 2001, 2002, 2003, and 2004, respectively.Prediction error for the optimal results obtained from the proposed method as well as prediction error of other fuzzy time series methods is presented in Table 3.
Considering Table 3, it can be concluded that forecasting performances of the proposed method for all TAIEX data are better than those found in the literature with respect to RMSE criterion.

Conclusions and Discussion
It is of vital importance to make predictions about the future in terms of planning and strategy formulation. This can be realized by accurate and realistic analysis of information and data that have emerged from past to present. Expressing observations of time series with linguistic and fuzzy clusters and analyzing these types of time series via fuzzy time series methods rather than conventional ones would provide more realistic approaches and more accurate outcomes.
Many studies aiming to make fuzzy time series approaches more systematic have been introduced in the literature. To this end, some fuzzy clustering methods and artificial neural networks with different structures are used in the fuzzification of observations and the determination of fuzzy relationships, respectively. Considering membership values, especially in the identification of fuzzy relations, seems to be a factor that improves the forecasting performance of the method. In approaches considering the membership values, either the membership values are determined subjectively or the fuzzy outputs of the system are obtained by assuming that there is a relation between membership values in the identification of the relation. This necessitates a defuzzification step and increases the model error. The study aimed to overcome all these problems. For this purpose, membership values were obtained more systematically by using the Gustafson-Kessel fuzzy clustering technique in the fuzzification step. In the identification of fuzzy relations, problems such as architecture selection were eliminated by using the artificial neural network with single multiplicative neuron model (SMNM-ANN), and the defuzzification step is no longer needed because the target values are constituted from the real values of the time series. The training of the artificial neural network with single multiplicative neuron model is carried out with particle swarm optimization. The main differences of the proposed method from previous studies are that it needs neither defuzzification nor the identification of the architecture of an artificial neural network.
In conclusion, considering the advantages and the superior forecasting performance of the method demonstrated on different time series, it can be argued that the proposed method is applicable and contributes to the fuzzy time series literature. In future studies, the proposed method can be extended to a high-order structure. Moreover, a feedback mechanism, like the moving average terms in ARMA, can be added to the model.

Figure 1: The structure of single multiplicative neuron model.

Figure 3: The structure of a particle.

Table 1: An example of fuzzification.

Table 2: An example of determining the fuzzy relation.

Table 3: Performance evaluation of methods for RMSE criteria.
Method                       2000     2001     2002     2003     2004
Chang (2010) [46]            129.42   113.33   66.82    53.51    60.48
Chen and Chen (2011) [47]    123.62   115.33   71.01    58.06    57.73
Chen et al. (2012) [48]      119.98   114.47   67.17    52.49    52.27
The Proposed Method           99.19    98.53   59.34    41.25    44.15