An ARMA Type Fuzzy Time Series Forecasting Method Based on Particle Swarm Optimization



Introduction
Fuzzy time series, first proposed by Song and Chissom [1], can be divided into two subclasses: time variant and time invariant. A fuzzy time series method generally comprises three stages: fuzzification, determination of fuzzy relations, and defuzzification. In the fuzzification stage, the observations of the time series are fuzzified. Fuzzy relations between the observations are defined in the second stage. Finally, the calculated fuzzy forecasts are defuzzified in the defuzzification stage. Improving one or more of these stages improves the performance of the method. Therefore, new fuzzy time series approaches that contribute to these stages have been proposed in the literature.
Establishing fuzzy relations plays an important role in the forecasting performance of the method. In this phase, Song and Chissom [1] utilized a fuzzy relationship matrix, and Sullivan and Woodall [26] used transition matrices based on Markov chains instead of a fuzzy relationship matrix. Chen [4] suggested an approach in which fuzzy logic group relationship tables are employed to define fuzzy relations, and Cheng et al. [24], Huarng [5], Huarng and Yu [11], Yu [27], and Egrioglu et al. [21] also used fuzzy logic group relationship tables. Huarng and Yu [28] and Aladag et al. [29] preferred to utilize feed forward artificial neural networks in this stage.
Aladag et al. [30] used a different type of artificial neural network, namely, Elman recurrent neural networks. In the determination of fuzzy relations, Yu and Huarng [31] and Yolcu et al. [32] proposed different approaches in which feed forward artificial neural networks using membership values are employed. Egrioglu [33] utilized genetic algorithms to define fuzzy relations. Moreover, many soft computing techniques have been used for forecasting in the literature; Yang et al. [34], Yang et al. [35], and Yang et al. [36] are some examples.
In all of the studies mentioned previously, the fuzzy time series forecasting models have an autoregressive (AR) structure. However, many real life fuzzy time series can contain both autoregressive and moving average (MA) structures. Using forecasting models that have only an AR structure for these time series can produce insufficient results. To analyze such time series, an ARMA type fuzzy time series forecasting model that includes both AR and MA structures is needed. In this study, a novel fuzzy time series forecasting approach based on an ARMA type fuzzy time series forecasting model is proposed to increase forecasting accuracy. Some fuzzy time series approaches in which fuzzy logic group relationship tables are employed disregard fuzzy set theory, since only the elements of fuzzy sets whose membership value is 1 are taken into account in the fuzzification phase. Therefore, information loss occurs, and the explanatory power of the model is expected to decrease. In order to overcome these problems, the particle swarm optimization method is employed in the proposed approach to establish fuzzy relations. In addition, the computational cost of the proposed approach is reasonable, since complex matrix operations are not needed when the particle swarm optimization method is used. Another advantage of the proposed method is that it uses the fuzzy c-means clustering method in the fuzzification phase. In the fuzzification phase of fuzzy time series approaches, there are some common problems such as the decision of the number of intervals, the arbitrary determination of interval length, and the arbitrary choice of degrees of membership. Since the fuzzy c-means clustering technique is employed in the proposed approach, it works in a systematic way.
In real life, some time series have long-range dependence. Long-range-dependent time series are forecasted by using special linear models. ARFIMA models, which are used to forecast long-range-dependent time series, were first introduced by Granger and Joyeux [37]. General properties of ARFIMA models were given by Hosking [38,39] and Beran [40]. The first studies were concerned with the estimation of the fractional differencing parameter in fractional white noise processes. The R/S statistic was proposed by Hurst [41]. Egrioglu and Gunay [42], Li [43], and Li and Zhao [44] are important papers about fractional processes. In this study, the time series used in the application are examined by using the R/S hypothesis test to determine long-range dependence. The performance of the proposed method is examined with respect to long-range dependence.
The next section presents basic definitions of fuzzy time series and the fuzzy c-means clustering method. In Section 3, the modified particle swarm optimization algorithm is given. Section 4 introduces the proposed approach based on the ARMA type fuzzy time series forecasting model and particle swarm optimization. The implementation is given in Section 5. Finally, the last section concludes the paper.
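As a rough illustration of the quantity behind the R/S test, the following minimal sketch computes the classical rescaled range of a series: the range of the cumulative deviations from the mean, divided by the standard deviation. The function name `rescaled_range` is hypothetical, and the sketch covers only the statistic itself, not the hypothesis test or critical values used in the application.

```python
import math

def rescaled_range(x):
    """Classical R/S statistic of a sequence x (illustrative sketch only)."""
    n = len(x)
    mean = sum(x) / n
    cum = 0.0
    z_min = 0.0  # the cumulative deviation starts at 0
    z_max = 0.0
    for xi in x:
        cum += xi - mean              # running sum of deviations from the mean
        z_min = min(z_min, cum)
        z_max = max(z_max, cum)
    # Population standard deviation; a constant series (s == 0) is not handled here.
    s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)
    return (z_max - z_min) / s

example_rs = rescaled_range([1, 2, 3, 4, 5])
```

For a long-range-dependent series, this statistic tends to grow faster with the sample size than it does for a short-range-dependent one, which is what the test exploits.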

Fuzzy Time Series Basic Definitions and Fuzzy 𝐶-Means Clustering
The fuzzy time series was first introduced by Song and Chissom [1]. Song and Chissom [2] introduced the first algorithm based on the first-order model for forecasting a time invariant fuzzy time series $F(t)$. In Song and Chissom [2], the fuzzy relationship matrix $R(t, t-1) = R$ is obtained by many matrix operations. The fuzzy forecasts are obtained based on max-min composition as follows:

$$F(t) = F(t-1) \circ R. \tag{1}$$

The dimension of the $R$ matrix depends on the number of fuzzy sets. The number of fuzzy sets equals the number of intervals into which the universe of discourse is partitioned. The more fuzzy sets are used, the more matrix operations are needed to obtain the $R$ matrix. When the number of fuzzy sets is high, using the method proposed by Song and Chissom [2] considerably increases the computational cost.
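To make the max-min composition concrete, the sketch below computes the membership vector of F(t) from the membership vector of F(t-1) and a relation matrix R, using (F ∘ R)_j = max_i min(f_i, R_ij). The function name and the 3 × 3 example values are hypothetical and chosen only for illustration.

```python
def max_min_composition(f_prev, R):
    """Return the membership vector of F(t) = F(t-1) o R (max-min composition)."""
    n_cols = len(R[0])
    return [
        max(min(f_prev[i], R[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]

# Hypothetical memberships of F(t-1) in three fuzzy sets and a hypothetical R.
f_prev = [0.2, 1.0, 0.5]
R = [
    [1.0, 0.3, 0.0],
    [0.4, 1.0, 0.6],
    [0.0, 0.5, 1.0],
]
forecast = max_min_composition(f_prev, R)  # [0.4, 1.0, 0.6]
```

Each forecast entry needs one pass over a column of R, so the cost of a single forecast grows with the square of the number of fuzzy sets.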
For fuzzification, the method proposed by Song and Chissom [1] partitions the universe of discourse. However, there are several problems related to this partition: the determination of the number of intervals, the arbitrary choice of interval length, and the arbitrary choice of membership degrees. In order to deal with these problems, Cheng et al. [19] and Li et al. [20] used the fuzzy c-means clustering method for fuzzification. The fuzzy c-means clustering method was first introduced by Bezdek [45]. This clustering method is the most widely used one. In this method, fuzzy clustering is conducted by minimizing the least squared errors within groups. Let $u_{ij}$ be the membership values, $v_i$ the center of cluster, $n$ the number of observations, and $c$ the number of clusters. Then, the objective function, which is minimized in fuzzy clustering, is

$$J_\beta(X, V, U) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{\beta}\, d^2(x_j, v_i), \tag{2}$$

where $\beta$ is a constant ($\beta > 1$) called the fuzzy index and $d(x_j, v_i)$ is a similarity measure between an observation and the center of the corresponding fuzzy cluster. The objective function $J_\beta$ is minimized subject to the following constraints:

$$0 \le u_{ij} \le 1 \ \ \forall i, j, \qquad 0 < \sum_{j=1}^{n} u_{ij} \le n \ \ \forall i, \qquad \sum_{i=1}^{c} u_{ij} = 1 \ \ \forall j. \tag{3}$$

In the fuzzy c-means clustering method, an iterative algorithm is used to solve this minimization problem. In each iteration, the values of $v_i$ and $u_{ij}$ are updated by using the formulas given in (4) and (5), respectively:

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{\beta}\, x_j}{\sum_{j=1}^{n} u_{ij}^{\beta}}, \tag{4}$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d(x_j, v_i) / d(x_j, v_k) \right)^{2/(\beta-1)}}. \tag{5}$$
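The alternating updates of cluster centers and memberships can be sketched for a univariate series as follows. This is a minimal illustration, not the authors' implementation: it assumes the absolute difference as the distance, a fuzzy index of 2, random initial memberships, and a simple stopping tolerance; the function name and all default values are hypothetical.

```python
import random

def fuzzy_c_means(x, c, beta=2.0, n_iter=100, tol=1e-8, seed=0):
    """Fuzzy c-means for 1-D data; returns (centers, memberships), centers ascending."""
    rng = random.Random(seed)
    n = len(x)
    # Random initial memberships; each observation's memberships sum to 1.
    u = [[rng.random() + 1e-6 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= s
    centers = [0.0] * c
    for _ in range(n_iter):
        # Center update: weighted mean with weights u_ij ** beta.
        for i in range(c):
            w = [u[i][j] ** beta for j in range(n)]
            centers[i] = sum(wj * xj for wj, xj in zip(w, x)) / sum(w)
        # Membership update from relative distances to all centers.
        max_change = 0.0
        for j in range(n):
            d = [abs(x[j] - centers[i]) for i in range(c)]
            if min(d) == 0.0:
                # Observation coincides with a center: give it full membership there.
                zeros = [i for i in range(c) if d[i] == 0.0]
                new = [1.0 / len(zeros) if i in zeros else 0.0 for i in range(c)]
            else:
                p = 2.0 / (beta - 1.0)
                new = [1.0 / sum((d[i] / d[k]) ** p for k in range(c))
                       for i in range(c)]
            for i in range(c):
                max_change = max(max_change, abs(new[i] - u[i][j]))
                u[i][j] = new[i]
        if max_change < tol:
            break
    # Order the fuzzy sets by ascending center.
    order = sorted(range(c), key=lambda i: centers[i])
    return [centers[i] for i in order], [u[i] for i in order]

# Hypothetical data with two well-separated groups.
data = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
centers, memberships = fuzzy_c_means(data, c=2)
```

Because the memberships are kept for every observation and every cluster, no interval length or membership degree has to be chosen by hand.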

The Particle Swarm Optimization
Particle swarm optimization, which is a population-based heuristic algorithm, was first proposed by Kennedy and Eberhart [46]. The distinguishing feature of this heuristic algorithm is that it simultaneously examines different points in different regions of the solution space to find the global optimum solution. Local optimum traps can be avoided because of this feature. In a few fuzzy time series studies, the particle swarm optimization method has been exploited in the fuzzification phase. While the particle swarm optimization method was employed by Davari et al. [12] for fuzzification in the first-order fuzzy time series forecasting model, Kuo et al. [14] utilized the method in high-order models. In the fuzzification phase, Kuo et al. [15] utilized the particle swarm optimization method in both the first-order and the high-order models. Park et al. [16] used the same method for fuzzification in a two-factor high-order model. However, the particle swarm optimization method has never been used to establish fuzzy logic relations in the literature. In the time variant fuzzy time series forecasting method proposed in this study, the modified particle swarm optimization, whose algorithm is given below, is exploited. The modified particle swarm optimization algorithm has a time varying inertia weight as in Shi and Eberhart [47]. In a similar way, this algorithm also has time varying acceleration coefficients as in Ma et al. [48].
Algorithm 4. The algorithm of the modified particle swarm optimization.
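Step 1. The positions of each $k$th ($k = 1, 2, \ldots, pn$) particle are randomly determined and kept in a vector $X_k$ given as follows:

$$X_k = \{x_{k,1}, x_{k,2}, \ldots, x_{k,d}\}, \tag{6}$$

where $x_{k,i}$ ($i = 1, 2, \ldots, d$) represents the $i$th position of the $k$th particle. $pn$ and $d$ represent the number of particles in a swarm and the number of positions, respectively.
Step 2. Velocities are randomly determined and stored in a vector $V_k$ given as follows:

$$V_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,d}\}. \tag{7}$$

Step 3. According to the evaluation function, the Pbest and Gbest particles given in (8) and (9), respectively, are determined:

$$\mathrm{Pbest}_k = (p_{k,1}, p_{k,2}, \ldots, p_{k,d}), \tag{8}$$

$$\mathrm{Gbest} = (p_{g,1}, p_{g,2}, \ldots, p_{g,d}), \tag{9}$$

where $\mathrm{Pbest}_k$ is a vector that stores the positions corresponding to the $k$th particle's best individual performance and Gbest represents the best particle, which has the best evaluation function value, found so far.
Step 4. Let $c_1$ and $c_2$ represent the cognitive and social coefficients, respectively, and let $w$ be the inertia parameter. Let $(c_{1i}, c_{1f})$, $(c_{2i}, c_{2f})$, and $(w_1, w_2)$ be the intervals which include the possible values for $c_1$, $c_2$, and $w$, respectively. At each iteration, these parameters are calculated by using the formulas given in (10):

$$c_1 = (c_{1f} - c_{1i}) \frac{t}{\mathrm{maxt}} + c_{1i}, \qquad c_2 = (c_{2f} - c_{2i}) \frac{t}{\mathrm{maxt}} + c_{2i}, \qquad w = (w_2 - w_1) \frac{\mathrm{maxt} - t}{\mathrm{maxt}} + w_1, \tag{10}$$

where maxt and $t$ represent the maximum iteration number and the current iteration number, respectively.
Step 5. The values of velocities and positions are updated by using the formulas given in (11) and (12), respectively:

$$v_{k,i}^{t+1} = w\, v_{k,i}^{t} + c_1\, \mathrm{rand}_1\, (p_{k,i} - x_{k,i}^{t}) + c_2\, \mathrm{rand}_2\, (p_{g,i} - x_{k,i}^{t}), \tag{11}$$

$$x_{k,i}^{t+1} = x_{k,i}^{t} + v_{k,i}^{t+1}, \tag{12}$$

where $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random values from the interval $[0, 1]$.
Step 6. Steps 3 to 5 are repeated until a predetermined maximum iteration number (maxt) is reached.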
We would like to note that the aim of this study is not to propose a new particle swarm optimization algorithm. In the literature, it was shown that using some time varying parameters can increase the convergence speed of the algorithm (Shi and Eberhart [47]; Ma et al. [48]). Therefore, we used both the time varying acceleration coefficients (Ma et al. [48]) and the time varying inertia weight (Shi and Eberhart [47]) in the standard particle swarm optimization, as mentioned previously. This modified particle swarm optimization method was then utilized to calculate membership values in the fuzzy relationship matrix, as addressed in the next section. It is not claimed in this paper that a better particle swarm optimization algorithm has been developed.
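The steps of the modified particle swarm optimization can be sketched as below. Several details are assumptions made for illustration rather than statements of the authors' implementation: the acceleration coefficients are taken to move linearly from the lower to the upper end of their intervals, the inertia weight is taken to decrease linearly from the upper to the lower end of its interval, positions are clamped to a box (here [0, 1], the natural range of membership values), and the evaluation function is minimized. The function name, argument names, and the test objective are hypothetical; the default intervals, swarm size, and iteration count are the values reported in the application section.

```python
import random

def modified_pso(evaluate, dim, pn=30, maxt=200,
                 c1_range=(2.0, 3.0), c2_range=(2.0, 3.0), w_range=(1.0, 2.0),
                 bounds=(0.0, 1.0), seed=0):
    """Minimize `evaluate` over a box using PSO with time-varying c1, c2 and w."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Steps 1-2: random positions and velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pn)]
    v = [[rng.uniform(lo - hi, hi - lo) for _ in range(dim)] for _ in range(pn)]
    # Step 3: personal bests and global best.
    pbest = [p[:] for p in x]
    pbest_val = [evaluate(p) for p in x]
    g = min(range(pn), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(1, maxt + 1):
        # Step 4: linear schedules (assumed form of the time-varying parameters).
        c1 = (c1_range[1] - c1_range[0]) * t / maxt + c1_range[0]
        c2 = (c2_range[1] - c2_range[0]) * t / maxt + c2_range[0]
        w = (w_range[1] - w_range[0]) * (maxt - t) / maxt + w_range[0]
        for k in range(pn):
            # Step 5: velocity and position updates.
            for i in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[k][i] = (w * v[k][i]
                           + c1 * r1 * (pbest[k][i] - x[k][i])
                           + c2 * r2 * (gbest[i] - x[k][i]))
                # Clamping to the box is an assumption, not taken from the paper.
                x[k][i] = min(hi, max(lo, x[k][i] + v[k][i]))
            val = evaluate(x[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = x[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[k][:], val
    return gbest, gbest_val

# Hypothetical objective: squared distance to the point (0.5, 0.5).
def sphere(p):
    return sum((pi - 0.5) ** 2 for pi in p)

best, best_val = modified_pso(sphere, dim=2, maxt=50)
```

In the proposed forecasting approach, `evaluate` would be the forecasting error obtained from the relation matrices encoded by a particle.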

The Proposed Approach
The fuzzy time series forecasting methods in the literature have only an AR structure that includes lagged fuzzy variables. However, it is more appropriate to use a model which includes both AR and MA structures in order to forecast real life time series that contain both structures. A new model which includes both AR and MA terms is defined for the first time in the literature in Definition 5.
Definition 5. Let fuzzy time series $F(t)$ be caused by lagged fuzzy time series $F(t-1)$ and lagged absolute fuzzy error series $\varepsilon(t-1)$; then the relationship can be expressed as

$$F(t-1),\ \varepsilon(t-1) \rightarrow F(t). \tag{13}$$

The model given in (13) is called an ARMA(1,1) fuzzy time series forecasting model. The definition of fuzzy relations for this model is presented in Definition 6.
In this study, a new fuzzy time series forecasting approach is suggested for the ARMA(1,1) fuzzy time series forecasting model in (13). The proposed approach provides the following advantages.
(i) Since the model given in (13) is employed in the proposed approach, the method is better suited to forecasting real life time series that contain both AR and MA structures.
(ii) Since the fuzzy c-means clustering method is utilized for fuzzification in the proposed approach, the number of intervals, the interval length, and the degrees of membership are not arbitrarily determined in the fuzzification phase, so the proposed approach is a systematic method.
(iii) The methods in the literature that exploit fuzzy logic group relationship tables disregard fuzzy set theory, since only some elements of fuzzy sets are taken into account in the fuzzification phase. Thus, information loss occurs, and the explanatory power of the model is expected to decrease. However, in the proposed method, the particle swarm optimization method is used to calculate membership values in the fuzzy relationship matrix, so information loss and a decrease in the explanatory power of the model are prevented.
(iv) In the proposed method, instead of using the centroid method, a method which considers all membership degrees is employed. Therefore, information loss that occurs in the defuzzification stage is avoided.
(v) Since the proposed approach provides the advantages given previously, it is expected that the suggested method has high forecasting accuracy.
The algorithm of the proposed method is given below step by step.
Step 1. Fuzzify the time series by using fuzzy c-means clustering, and define fuzzy sets for the absolute values of errors.
Let $c$ be the number of fuzzy sets, such that $2 \le c \le n$. The fuzzy c-means clustering algorithm with $c$ fuzzy sets is applied to the time series, which consists of crisp values. After this application, the center of each fuzzy set is determined. Then, the membership degrees of each observation, which denote the degree to which that observation belongs to each fuzzy set, are calculated with respect to the obtained centers of the fuzzy sets. Finally, ordered fuzzy sets $L_r$ ($r = 1, 2, \ldots, c$) are obtained according to the ascending ordered centers, which are denoted by $v_r$ ($r = 1, 2, \ldots, c$).
Fuzzy sets for the absolute values of errors are defined in accordance with a predetermined length of interval. Subintervals that contain the absolute values of errors are generated. The number of subintervals is taken as $c$, which equals the number of clusters of the time series. The upper bound of the last subinterval is always open. While the first subinterval contains the smallest absolute errors, the subsequent subintervals contain comparatively greater absolute errors. Generally, the subintervals can be defined by

$$u_r = [(r-1)\,l,\ r\,l), \quad r = 1, 2, \ldots, c-1, \qquad u_c = [(c-1)\,l,\ \infty),$$

where $l$ represents the length of interval. In accordance with these subintervals, one fuzzy set for absolute errors is defined for each subinterval.
(Figure: the structure of a particle.)
Step 2. Determine the parameters of particle swarm optimization. These parameters are the possible intervals for the cognitive coefficient $(c_{1i}, c_{1f})$, the social coefficient $(c_{2i}, c_{2f})$, and the inertia weight $(w_1, w_2)$, together with the swarm size ($pn$) and the maximum iteration number (maxt). The evaluation function is the RMSE criterion, which is computed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (x_t - \hat{x}_t)^2},$$

where $x_t$ is the crisp time series, $\hat{x}_t$ is the defuzzified forecast, and $n$ is the number of forecasts.
Step 3. Generate a random initial population.
Step 4. Calculate the evaluation function values of all particles in the current swarm.
The method for calculating RMSE for any particle is given in Steps 4.1, 4.2, and 4.3.
Step 4.1. The $R_1$ and $R_2$ fuzzy relation matrices are constructed from the positions of a particle. Each consecutive set of $c$ positions forms a row of the $R_1$ and $R_2$ matrices.
Step 4.2. Fuzzy forecasts are calculated by using (14). After the $R_1$ and $R_2$ relation matrices are estimated, fuzzy forecasts can be obtained by using fuzzy errors and fuzzy observations. When the first forecast is calculated, the error is assumed to be zero, since this assumption is also made in conventional time series analysis.
For instance, some observations of a time series $x_t$, the corresponding membership values of these observations, and the center values ($v_r$, $r = 1, 2, 3, 4, 5$) for each fuzzy set are given in Table 1.
Given the relation matrices $R_1$ and $R_2$, the fuzzy forecast for 1972 is calculated by using (14). After this procedure, fuzzy forecasts are defuzzified by using the following rules.
(i) If the memberships of fuzzy forecast have only one maximum, then select the center of this cluster as the defuzzified forecasted value.
(ii) If memberships of fuzzy forecast have two or more consecutive maximums, then select arithmetic mean of centers of corresponding clusters as the defuzzified forecasted value.
(iii) Otherwise, standardize the fuzzy output, and use the center of the fuzzy sets as the forecasted value.
For example, the defuzzified forecast for 1972 is obtained as follows. Since the memberships of the fuzzy forecast have only one maximum, the defuzzified forecast value is taken as the center value ($v_1 = 13000$) of the first fuzzy set, which has the maximum membership value. Then, the absolute error value for 1972 is calculated. The absolute error value is fuzzified by using the fuzzy set which has the maximum membership value for the interval that includes this absolute error value. For instance, for the time series given in Table 1, when the interval length of absolute error is taken as 50, the 5 subintervals are as follows:

$$u_1 = [0, 50),\quad u_2 = [50, 100),\quad u_3 = [100, 150),\quad u_4 = [150, 200),\quad u_5 = [200, \infty).$$

Since the absolute error value 55 is in the second interval $u_2$, the fuzzified absolute error for 1972 is the second fuzzy set for absolute errors. Then, the forecast for 1973 can be calculated by using (14) with this fuzzified error. In the same way, forecasts for the other years can be obtained.
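The defuzzification rules and the fuzzification of the absolute error can be sketched as follows. The function names are hypothetical, and rule (iii) is implemented under one possible reading (memberships normalized to sum to one and used as weights on the cluster centers), which is an assumption. Only the center 13000 and the interval length 50 come from the example above; the remaining centers and membership values are made up for illustration.

```python
def defuzzify(memberships, centers, tol=1e-12):
    """Defuzzify a fuzzy forecast with rules (i)-(iii)."""
    m_max = max(memberships)
    idx = [i for i, m in enumerate(memberships) if abs(m - m_max) <= tol]
    if len(idx) == 1:
        # Rule (i): a single maximum, take the center of that cluster.
        return centers[idx[0]]
    if idx[-1] - idx[0] == len(idx) - 1:
        # Rule (ii): consecutive maxima, take the mean of their centers.
        return sum(centers[i] for i in idx) / len(idx)
    # Rule (iii), assumed reading: standardized memberships weight the centers.
    total = sum(memberships)
    return sum(m * v for m, v in zip(memberships, centers)) / total

def error_interval_index(abs_error, length, c):
    """1-based index of the subinterval containing abs_error; the last one is open-ended."""
    return min(int(abs_error // length) + 1, c)

# Hypothetical centers; only the first value (13000) appears in the text.
centers = [13000, 14000, 15000, 16000, 17000]
single_max = defuzzify([1.0, 0.4, 0.2, 0.0, 0.0], centers)      # rule (i)
consecutive = defuzzify([0.2, 1.0, 1.0, 0.1, 0.0], centers)     # rule (ii)
interval = error_interval_index(55, length=50, c=5)             # second subinterval
```

With these helpers, the absolute error of 55 falls into the second subinterval, matching the worked example.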
Step 5. Update the cognitive coefficient $c_1$, the social coefficient $c_2$, and the inertia parameter $w$ at each iteration by using the formulas in (10).
Step 6. New positions of the particles are calculated by using the formulas given in (11) and (12).
Step 7. Repeat Steps 4 to 6 until the maximum iteration bound (maxt) is reached.
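Two small pieces of the algorithm above lend themselves to a short sketch: turning a particle into the two relation matrices (Step 4.1) and computing the RMSE evaluation function. The sketch assumes that a particle holds 2c² positions and that the first c rows belong to the first matrix and the next c rows to the second; this ordering, like the function names, is an assumption made for illustration.

```python
import math

def decode_particle(position, c):
    """Split a particle of length 2*c*c into two c x c relation matrices (row by row)."""
    if len(position) != 2 * c * c:
        raise ValueError("a particle must hold 2 * c * c positions")
    rows = [position[k:k + c] for k in range(0, 2 * c * c, c)]
    return rows[:c], rows[c:]   # assumed order: first matrix, then second matrix

def rmse(actual, forecast):
    """Root mean squared error between crisp observations and defuzzified forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

# Hypothetical particle for c = 2 and hypothetical forecasts.
R1, R2 = decode_particle([0.1, 0.9, 1.0, 0.0, 0.5, 0.5, 0.2, 0.8], c=2)
error = rmse([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])   # every forecast is off by 1
```

A particle's evaluation value is then the RMSE of the defuzzified forecasts produced with its decoded matrices.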

Application of the Proposed Method
First, the proposed method was applied to the Alabama University enrollment data (1971-1992), which is a well-known data set in the literature. The enrollment data is presented in Table 2.
The algorithm of the proposed method was coded in MATLAB version 7.9. In the application of the proposed method, we used seven fuzzy sets ($c = 7$) in Step 1, since the same number has been employed in other studies available in the literature (Cheng et al., 2008). The centers of clusters and the membership values obtained from the FCM algorithm in Step 1 are given in Tables 3 and 4, respectively. In the fuzzification process of the absolute error, if the interval length is too small, the errors accumulate in the last fuzzy set. If it is too big, the errors accumulate in the first fuzzy set. Therefore, the length of interval for absolute error is chosen as 300, and 7 fuzzy sets for absolute errors are defined accordingly. In the literature, there are no general rules to determine the parameter values of particle swarm optimization. In most applications, parameter values for this method have generally been specified intuitively, depending on the data. Therefore, the parameters of particle swarm optimization used in this study were determined intuitively, as in other studies available in the literature (Ma et al., 2006). The parameters of the modified particle swarm optimization were determined by trial and error: different values of the parameters were tried. The best parameters of the modified particle swarm optimization were determined as follows: $(c_{1i}, c_{1f}) = (2, 3)$, $(c_{2i}, c_{2f}) = (2, 3)$, $(w_1, w_2) = (1, 2)$, $pn = 30$, and maxt $= 200$. The optimal $R_1$ and $R_2$ matrices obtained from the proposed method are given below. The fuzzy and defuzzified forecasts of the proposed method are given in Table 5:
.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.4540 0.0000 0.7525 0.0000 0.0458 0.0000 0.0000 0.0000 0.3374 1.0000 1.0000 1.0000 1.0000 0.7000 1.0000 1.0000 0.0000 1.0000 1.0000 0.8283 0.0000 0.0000 0.4271 0.0000 1.0000 0.3627 0.0000 0.0000 1.0000 0.0000 1.0000 1.0000 1.0000 0.1443 1.0000 0.4579 0.0000 0.0000 1.0000 0.0000 1.0000 0.4002 0.0000 The proposed method 85089 0000
1.0000 0.1475 0.1273 0.5824 1.0000 1.0000 1.0000 0.2359 1.0000 0.6483 0.4169 0.2619 1.0000 0.7084 0.3526 1.0000 1.0000 0.0000 1.0000 1.0000 0.5122 1.0000 0.7212 0.0000 1.0000 1.0000 0.8676 0.0000 1.0000 1.0000 0.0000 0.1144 0.0000 0.0000 0.0000 0.0000 0.2108 0.0000 0.9019 0.0000 1.0000 0.3198 0.0000 0.1446 0.0000 0.0000 0.0000 0.6737
The mean square error (MSE) value of the proposed method is given in Table 6. For comparison, the forecasting results of some other well-known fuzzy time series forecasting approaches are also presented in Table 6. When Table 6 is examined, it is seen that the proposed method gives the most accurate forecasts in terms of the MSE criterion.
Second, to examine the forecasting performance of the proposed approach on a test set, it was also applied to four different data sets. The first data set is index 100 in the stocks and bonds exchange market of İstanbul (ISBEMI), with daily observations between May 20, 2008 and September 29, 2008. This data set is called ISBEMI Set 1. The second data set is index 100 in the stocks and bonds exchange market of İstanbul (ISBEMI), with daily observations between October 3, 2008 and December 31, 2008. The second data set is called ISBEMI Set 2. The third one is ISBEMI Set 3, and its daily observations are from October 1, 2009 to December 31, 2009. ISBEMI Set 1, ISBEMI Set 2, and ISBEMI Set 3 have 95, 59, and 63 observations, respectively. For comparison, these time series were also forecasted by using other well-known fuzzy time series methods proposed by Song and Chissom [2], Chen [4], Chen [6], Huarng [5], Huarng and Yu [11], and Aladag et al. [29]. In the implementation, the last 15 observations of ISBEMI Set 1, the last 7 observations of ISBEMI Set 2, and both the last 7 and the last 15 observations of ISBEMI Set 3 were used as test sets.
It is a well-known fact that there are no general rules to determine parameters such as the length of interval or the number of fuzzy sets for some of the methods used in the implementation. Thus, the related parameters were determined by trial and error, as in other studies available in the literature. To find the best length of interval for the methods introduced in Chen [4], Chen [6], and Aladag et al. [29], the values between 200 and 2500 were examined with an increment of 100. The interval lengths in the methods given in Huarng [5] and Huarng and Yu [11] were determined while the methods were running, because of the nature of these methods. The number of fuzzy sets was varied between 5 and 25 for the proposed approach and for Song and Chissom [2]. When the methods proposed by Chen [6] and Aladag et al. [29] were employed, model orders 2, 3, 4, and 5 were examined, since the time series have daily observations. Finally, the number of neurons in the hidden layer was changed from 1 to 10 when the approach proposed by Aladag et al. [29] was applied. The forecasts obtained from the case with the best result for the test data and the error criteria related to those forecasts are presented in Tables 7 and 8 for ISBEMI Sets 1 and 2, respectively.
For Table 7, the best results were obtained when (i) the number of fuzzy sets was 12 for the Song-Chissom method [2]; (ii) the interval length was 1200 for the Chen method [4]; (iii) the interval length was 800 for the Huarng distribution-based method [5]; (iv) the interval length was 200 for the Huarng average-based method [5]; (v) the ratio sample percentile was 0.5 for the Huarng and Yu ratio-based method [11]; (vi) the number of fuzzy sets was five for the Cheng et al. method [19]; (vii) the number of fuzzy sets was 11 and the number of neurons in the hidden layer was five for Yolcu et al. [32]; (viii) the number of fuzzy sets was 5 in the proposed method.
For Table 8, the best results were obtained when (i) the number of fuzzy sets was 12 for the method of Song and Chissom [2]; (ii) the length of interval was 1900 for the method of Chen [4]; (iii) the length of interval was 2200 and the model order was 2 for the method of Chen [6]; (iv) the length of interval was 800 for the Huarng distribution-based method [5]; (v) the length of interval was 200 for the Huarng average-based method [5]; (vi) the ratio sample percentile was 0.5 for the Huarng and Yu ratio-based method [11]; (vii) the length of interval was 1000, the model order was 2, and the number of neurons in the hidden layer was 7 for the method proposed by Aladag et al. [29]; (viii) the number of fuzzy sets was 5 in the proposed method.
For Table 9, the best results were obtained when (i) the Song-Chissom method [2] was applied with nine fuzzy sets; (ii) the interval length was 1300 for the Chen method [4]; (iii) the interval length was 800 for the Huarng distribution-based method [5]; (iv) the interval length was 200 for the Huarng average-based method [5]; (v) the ratio sample percentile was 0.50 for the Huarng and Yu ratio-based method [11]; (vi) the number of fuzzy sets was 13 and the number of neurons in the hidden layer was seven for Yolcu et al. [32]; (vii) the number of fuzzy sets was 5 in the proposed method.
For Table 10, the best results were obtained when (i) the Song-Chissom method [2] was applied with nine fuzzy sets; (ii) the interval length was 1500 for the Chen method [4]; (iii) the interval length was 800 for the Huarng distribution-based method [5]; (iv) the interval length was 200 for the Huarng average-based method [5]; (v) the ratio sample percentile was 0.50 for the Huarng and Yu ratio-based method [11]; (vi) the number of fuzzy sets was seven and the number of neurons in the hidden layer was three for Yolcu et al. [32]; (vii) the number of fuzzy sets was five in the proposed method.
When Tables 6-10 are examined, it is obvious that the most accurate forecasts are obtained for all data sets when the proposed forecasting approach is used.

Conclusion and Discussion
In this study, an ARMA type fuzzy time series forecasting model is introduced for the first time in the literature. In addition, to analyze this model, a novel approach based on fuzzy c-means and particle swarm optimization methods is suggested. To show the forecasting performance of the proposed approach, it is applied to the well-known enrollment data of Alabama University, and the obtained forecasting results are compared to those produced by other fuzzy time series forecasting methods available in the literature. In addition, to evaluate the forecasting performance of the proposed approach on a test set, two different time series are forecasted by utilizing the proposed method and other approaches available in the literature. As a result of the comparison, it is observed that the proposed approach gives the most accurate forecasts.
In the literature, the proposed method is the first method based on a fuzzy time series forecasting model which contains both AR and MA terms. In linear stochastic ARMA models, the AR(∞) model is equivalent to the MA(1) model. But this result is only valid for linear ARMA models. There is no such kind of result for fuzzy time series models, artificial neural network models, and nonlinear stochastic models. Because of this, adding MA terms to a fuzzy time series forecasting model is really important. Also, in the proposed method, the particle swarm optimization method is used to calculate membership values in the fuzzy relationship matrix, so information loss and a decrease in the explanatory power of the model are prevented. Besides, since the fuzzy c-means clustering method is employed in the fuzzification process, the number of intervals, the interval length, and the degrees of membership are not arbitrarily determined, so the proposed approach works in a systematic way. In addition, information loss which occurs in the defuzzification stage is avoided by using a method which takes all membership degrees into account instead of using the centroid method. Therefore, the proposed method has high forecasting accuracy. Moreover, long-range dependence was tested for all time series by using R/S tests, and the results are given in the appendix. All of the time series are short-range dependent. In future studies, the performance of the proposed method can be investigated for long-range-dependent time series. However, no linear model is used in the proposed fuzzy time series method. Fuzzy time series methods can be used to forecast nonlinear time series. In fuzzy time series methods, relations are employed instead of functions.
The proposed forecasting approach is suggested to forecast a first-order model including both AR and MA terms. In future studies, high-order models can be defined. Then, it is possible to extend the proposed approach to forecast high-order models.
The fundamental definitions of fuzzy time series, including time variant and time invariant fuzzy time series, are presented in what follows.
Definition 1. Let $Y(t)$ ($t = \ldots, 0, 1, 2, \ldots$), a subset of real numbers, be the universe of discourse on which fuzzy sets $f_j(t)$ are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \ldots$, then $F(t)$ is called a fuzzy time series defined on $Y(t)$.

Table 1: An example time series $x_t$.

Table 2: The enrollment data.

Table 3: The centers of clusters obtained from FCM algorithm in Step 1.

Table 4: The memberships of the observations obtained from FCM algorithm in Step 1.

Table 5: The forecasts produced by the proposed methods.

Table 6: The obtained results for the enrollment data.

Table 7: The obtained results for ISBEMI Set 1.

Table 8: The obtained results for ISBEMI Set 2.

Table 9: The obtained results (last 7 observations are test set) for ISBEMI Set 3.

Table 10: The obtained results (last 15 observations are test set) for ISBEMI Set 3.

Table 11: Results of R/S test.