Average-Based Fuzzy Time Series Markov Chain Based on Frequency Density Partitioning

Fuzzy time series (FTS) is a forecasting method that uses the concept of fuzzy logic and was first introduced by Song and Chissom. The fuzzy time series (FTS) Markov chain uses the Markov chain in defuzzification. The determination of the interval length in the fuzzy time series plays an important role in forming the fuzzy logical relationship (FLR), and this FLR is used to determine the forecasting value. One method that can be used to determine the interval length is the average-based method. However, several studies use partitioning based on frequency density to obtain an interval length that gives better forecasting accuracy. This study combines the fuzzy time series Markov chain, the average-based fuzzy time series, and the fuzzy time series based on frequency density partitioning into an average-based fuzzy time series Markov chain based on frequency density partitioning, which redivides the intervals of the average-based fuzzy time series Markov chain method according to frequency density. The method is applied to forecasting the Indonesian Islamic stock index (ISSI) for the selected period. The accuracy, measured by the mean square error (MSE) and the mean absolute percentage error (MAPE), shows that the average-based fuzzy time series Markov chain based on frequency density partitioning has a high level of forecasting accuracy.


Introduction
A time series is defined as a collection of observations made sequentially over time. Usually, observations in a time series are not independent; that is, they are correlated. Thus, the order of the observations becomes important. As a result, statistical procedures and techniques based on the assumption of independence are no longer valid, and different methods and approaches are needed. Time series analysis aims for description, explanation, prediction, and control [1].
Forecasting predicts a variable's values based on the known values of that variable or related variables [2]. The rationale for time series forecasting is that current observations depend on previous observations. Therefore, many forecasting methods use time series data, including fuzzy time series, smoothing, averaging, and moving averages.
The forecasting method using the fuzzy logic concept, hereinafter known as fuzzy time series, was first proposed by Song and Chissom in 1993. Song and Chissom used time-invariant and time-variant methods in forecasting. Since then, several fuzzy time series (FTS) methods have been developed, including Chen [3,4], Chen and Hsu [5], weighted [6], backpropagation [7], multiple-attribute [8], percentage change [9], and the Markov chain [10].
The Markov chain is a stochastic process in which future events depend only on the present state and not on past conditions. The Markov chain is defined by a transition probability matrix that contains the information governing the system's movement from one state to another [10]. In fuzzy time series, the Markov chain is used in the defuzzification stage [10]. Defuzzification is the calculation step of fuzzy time series forecasting based on a fuzzy logical relationship group (FLRG). In an FLRG, there is a relationship between the current and next states. The current state is the value that will be calculated as the predicted value, and the next state is the data used to get the value in the current state. Therefore, the relationship between the current state and the next state in the FLRG is considered a conditional process in line with the basic principles of the Markov chain method [10]. The Markov chain is used in several fields; for example, Indriyani and Pratiwi, et al. [11] use the Markov chain to model the measles spread pattern. The Markov chain method is also used by Prasetya and Ferdian [12] in scheduling Oerlikon machine maintenance to optimize maintenance costs and time. Their results show that calculations with the Markov chain produce more optimal time and costs.
Ruey-Chyn Tsaur combined the fuzzy time series method with the Markov chain concept to predict the exchange rate of the Taiwan currency against the dollar. The resulting method is known as the fuzzy time series Markov chain. The results show that the fuzzy time series Markov chain has a better level of accuracy than the fuzzy time series [10]. Many studies have used the FTS-Markov chain, including Dinatha et al. [13], who used it to predict export profits. In addition, Hidayah and Sugiman [14] and Mangkunegara and Yerizon [15] both use the FTS-Markov chain to forecast exchange rates. Their research [13][14][15] produced small error values, but the determination of the FLR still has shortcomings and will produce different forecasting results because the interval length is determined according to the perception of each researcher.
In the fuzzy time series method, there is no definite formula for the interval length; the intervals are formed at the researcher's discretion [16], even though the interval length strongly influences the formation of the fuzzy logical relationship (FLR) and therefore the forecasting results [17].
One method that can be used to determine the interval length is the average-based model introduced by Xihao and Yimin [17]. The average-based fuzzy time series uses an average-based method to determine the interval length. Xihao and Yimin [17] also show that the average-based fuzzy time series method produces better forecasts than Chen's fuzzy time series method. Wuryanto and Puspita [18] and Kumar N. and Kumar H. [19] also used the average-based FTS model to predict the development of confirmed cases of COVID-19. Their research [18,19] has small error values, but the interval length is still less than optimal because every interval has the same length regardless of how many observations fall in it.
Furthermore, Chen and Hsu [5] developed Chen's fuzzy time series by repartitioning based on frequency density to predict the enrollments at the University of Alabama. Chen and Hsu redivided the intervals before the fuzzification process. Their research shows that, after the intervals are redivided, the fuzzy time series achieves better accuracy than other existing fuzzy time series methods. Irawanto et al. [20] used frequency density-based partitioning for stock index forecasting. Wulandari et al. [21] used frequency density partitioning to forecast petroleum production, which resulted in a small error value.
Based on the description above, this study examines the average-based fuzzy time series Markov chain based on frequency density partitioning (FDP). To the best of the researchers' knowledge from the literature reviewed above, no research has redivided intervals based on frequency density in the average-based fuzzy time series Markov chain method. In addition, frequency density partitioning produces subintervals based on empirical analysis, and these subintervals bring the fuzzy numbers closer to the crisp values. This idea is illustrated in the flow chart in Figure 1.
This method is applied to forecast the Indonesian Islamic stock index (ISSI). Furthermore, to assess the accuracy of the method, the mean square error (MSE) and the mean absolute percentage error (MAPE) are used. For comparison, Chen's fuzzy time series is used.

Basic Knowledge
2.1. Fuzzy Time Series. The definition of fuzzy time series was first introduced by Song and Chissom [16]. Let $U$ be the universe of discourse, with $U = \{u_1, u_2, \ldots, u_n\}$. A fuzzy set $A_i$ on $U$ is defined as

$$A_i = \frac{f_{A_i}(u_1)}{u_1} + \frac{f_{A_i}(u_2)}{u_2} + \cdots + \frac{f_{A_i}(u_n)}{u_n},$$

where $f_{A_i}$ is the membership function of the fuzzy set $A_i$, $u_k$ is an element of the fuzzy set $A_i$, and $f_{A_i}(u_k)$ shows the degree of membership of $u_k$ in $A_i$, where $k = 1, 2, 3, \ldots, n$.
Definition 2. If $F(t)$ is caused by $F(t-1)$, then the relation in the first-order model of $F(t)$ can be stated as follows [3]:

$$F(t) = F(t-1) \circ R(t, t-1),$$

where "$\circ$" is the Max-Min composition operator and $R(t, t-1)$ is a relation matrix that describes the fuzzy relationship between $F(t-1)$ and $F(t)$.

2.2. Average-Based Algorithm.
The average-based algorithm sets the interval length at the initial stage of fuzzy time series forecasting. The steps of the average-based algorithm are as follows [17]:
The Markov chain analysis is a method that studies the past behavior of variables to estimate their behavior in the future. Conceptually, the Markov chain can be described by taking $\{X_n, n = 0, 1, 2, \ldots\}$ to be a stochastic process that takes on a finite or countable number of possible values. The set of possible values of the process is denoted by the set of nonnegative integers $\{0, 1, 2, \ldots\}$.
If $X_n = i$, the process is said to be in state $i$ at time $n$. Suppose that whenever the process is in state $i$, there is a fixed probability $P_{ij}$ that it will next be in state $j$. Thus, it can be written as follows:

$$P\{X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0\} = P_{ij}$$

for all states $i_0, i_1, \ldots, i_{n-1}, i, j$ and all $n \ge 0$. This process is called the Markov chain.
The above equation states that, in a Markov chain, the conditional distribution of the future state $X_{n+1}$, given the past states $X_0, X_1, \ldots, X_{n-1}$ and the current state $X_n$, does not depend on the past states but only on the current state.
The value of $P_{ij}$ represents the probability of a transition from $i$ to $j$. Because probabilities are nonnegative and the process must move to some state,

$$P_{ij} \ge 0, \quad i, j \ge 0; \qquad \sum_{j=0}^{\infty} P_{ij} = 1, \quad i = 0, 1, \ldots$$

Let $P$ be the matrix of transition probabilities $P_{ij}$; then, it can be denoted in the following equation [10]:

$$P = \begin{bmatrix} P_{00} & P_{01} & P_{02} & \cdots \\ P_{10} & P_{11} & P_{12} & \cdots \\ \vdots & \vdots & \vdots & \\ P_{i0} & P_{i1} & P_{i2} & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix}$$

Journal of Applied Mathematics
In the fuzzy time series, the Markov chain enters through a transition probability matrix, which is used as the basis for the forecasting calculations. The probability of moving from the current state to the next state is obtained from the FLRG. State transition probabilities are written as follows [10]:

$$P_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, n,$$

where $P_{ij}$ is the one-step transition probability from state $A_i$ to state $A_j$, $M_{ij}$ is the number of one-step transitions from state $A_i$ to state $A_j$, and $M_i$ is the amount of data included in state $A_i$.
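As a minimal sketch of the ratio $M_{ij}/M_i$, the transition probabilities can be estimated by counting consecutive pairs in a fuzzified series. The function name `transition_matrix` and the toy state sequence are illustrative only. One assumption is made explicit in the code: $M_i$ is counted over outgoing transitions, so the final observation (which has no successor) is excluded and every non-empty row sums to 1.

```python
from collections import Counter

def transition_matrix(states, n):
    """Estimate P[i][j] = M_ij / M_i from fuzzified states.

    `states` holds 1-based fuzzy-set indices (A_1 ... A_n) in time order.
    ASSUMPTION: M_i counts outgoing transitions only, so the last
    observation does not contribute to any row.
    """
    m_ij = Counter(zip(states[:-1], states[1:]))  # transitions A_i -> A_j
    m_i = Counter(states[:-1])                    # transitions leaving A_i
    P = [[0.0] * n for _ in range(n)]
    for (i, j), count in m_ij.items():
        P[i - 1][j - 1] = count / m_i[i]
    return P

# Toy sequence (not the ISSI data): A1, A2, A2, A3, A2, A1
P = transition_matrix([1, 2, 2, 3, 2, 1], n=3)
for row in P:
    print(row)
```

A state that is never left keeps an all-zero row, which corresponds to an empty FLRG.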
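The probability matrix $P$ of all states has dimension $n \times n$, with $n$ being the number of fuzzy sets, and can be written as follows [10]:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1n} \\ P_{21} & P_{22} & \cdots & P_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{n1} & P_{n2} & \cdots & P_{nn} \end{bmatrix}$$

Forecast accuracy is measured with the mean square error (MSE) and the mean absolute percentage error (MAPE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left(Y_t - F'_t\right)^2, \qquad \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{\left|Y_t - F'_t\right|}{Y_t} \times 100\%,$$

where $Y_t$ is the actual data for period $t$, $F'_t$ is the forecast value for period $t$, and $n$ is the number of forecast data points.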
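The two accuracy measures used in this study, the mean square error and the mean absolute percentage error, can be sketched in a few lines. The function names and the sample numbers are illustrative and are not taken from the ISSI data; the MAPE is returned in percent.

```python
def mse(actual, forecast):
    """Mean square error: average of squared forecast errors."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs(y - f) / abs(y) for y, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Illustrative values only
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mse(actual, forecast), mape(actual, forecast))
```

Both functions assume that `actual` and `forecast` are aligned period by period and have the same length.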

Data Collection.
The data used in this study are the weekly Indonesian Sharia Stock Price Index (ISSI) data for the period June 2019-May 2021, obtained from the http://yahoo.finance.com site. The forecasting results are then validated using the MSE and MAPE values and compared with Chen's fuzzy time series.
where $u_i$ ($i = 1, 2, \ldots, n$) is an element of the universe of discourse ($U$)
(g) Fuzzification of historical data
(h) Define FLR
(i) Define FLRG
(j) Determine the transition probability with the formula:

$$P_{ij} = \frac{M_{ij}}{M_i}$$

(k) Calculate the initial forecast value using the probability matrix with the following rules:
Rule 1: if the FLRG of $A_i$ is one to one (e.g., $A_i \to A_k$, where $P_{ik} = 1$ and $P_{ij} = 0$, $j \ne k$), the forecast $F(t)$ is $m_k$, the middle value of $u_k$, with the following equation:

$$F(t) = m_k P_{ik} = m_k$$

Rule 2: if the FLRG of $A_j$ is one to many (e.g., $A_j \to A_1, A_2, \ldots, A_n$, $j = 1, 2, \ldots, n$), when $Y(t-1)$ at time $t-1$ is included in state $A_j$, then the forecast $F(t)$ is

$$F(t) = m_1 P_{j1} + m_2 P_{j2} + \cdots + m_{j-1} P_{j(j-1)} + Y(t-1) P_{jj} + m_{j+1} P_{j(j+1)} + \cdots + m_n P_{jn},$$

where $m_1, m_2, \ldots, m_{j-1}, m_{j+1}, \ldots, m_n$ are the middle values of $u_1, u_2, \ldots, u_{j-1}, u_{j+1}, \ldots, u_n$ and $Y(t-1)$ is the value of state $A_j$ at time $t-1$
Rule 3: if the FLRG of $A_i$ is empty ($A_i \to \emptyset$), the forecast value $F(t)$ is $m_i$, the middle value of $u_i$, with the following equation:

$$F(t) = m_i$$

(l) Adjust the trend of the forecasting values with the following rules:
(i) If state $A_i$ communicates with $A_i$, starting from state $A_i$ at time $t-1$, expressed as $F(t-1) = A_i$, and undergoing an increasing transition to state $A_j$ at time $t$, where $i < j$, then the adjustment value is

$$D_{t1} = \frac{l}{2},$$

where $l$ is the basis interval
(ii) If state $A_i$ communicates with $A_i$, starting from state $A_i$ at time $t-1$, expressed as $F(t-1) = A_i$, and experiencing a decreasing transition to state $A_j$ at time $t$, where $i > j$, the adjustment value is

$$D_{t1} = -\frac{l}{2}$$

(iii) If state $A_i$ at time $t-1$ is expressed as $F(t-1) = A_i$ and undergoes a jump-forward transition to state $A_{i+s}$ at time $t$, where $1 \le s \le n - i$, then the adjustment value is

$$D_{t2} = \frac{l}{2}\, s,$$

where $s$ is the number of forward jumps
(iv) If state $A_i$ at time $t-1$ is expressed as $F(t-1) = A_i$ and undergoes a jump-backward transition to state $A_{i-v}$ at time $t$, where $1 \le v \le i$, then the adjustment value is

$$D_{t2} = -\frac{l}{2}\, v,$$

where $v$ is the number of backward jumps
(m) Determine the final forecast value based on the adjustment of the trend of the forecasting value.
If the FLRG of $A_i$ is one to many and state $A_{i+1}$ can be accessed from state $A_i$, where state $A_i$ is related to $A_i$, then the forecasting result becomes $F'(t) = F(t) + D_{t1} + D_{t2} = F(t) + (l/2) + (l/2)$. If the FLRG of $A_i$ is one to many and state $A_{i+1}$ can be accessed from $A_i$, where state $A_i$ is not related to $A_i$, then the forecasting value becomes $F'(t) = F(t) + D_{t2} = F(t) + (l/2)$. If the FLRG of $A_i$ is one to many and state $A_{i-2}$ can be accessed from state $A_i$, where $A_i$ is not related to $A_i$, then the forecasting result is $F'(t) = F(t) - D_{t2} = F(t) - (l/2) \times 2 = F(t) - l$. If $v$ is a jump step, the general form of the forecast is

$$F'(t) = F(t) \pm D_{t1} \pm D_{t2} = F(t) \pm \frac{l}{2} \pm \frac{l}{2}\, v$$
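The forecasting rules and the trend adjustment above can be sketched as two small functions. This is an illustrative reading of the rules, not the authors' implementation: the names `initial_forecast` and `trend_adjustment`, the flag `self_related` (whether $A_i$ appears in its own FLRG), and the toy numbers are all assumptions. The last lines reproduce the paper's own worked adjustment for 16 June 2019 ($A_{38} \to A_{39}$, $l = 0.5$, $F = 180.17$).

```python
def initial_forecast(j, P, mid, y_prev):
    """Initial forecast F(t) when Y(t-1) = y_prev lies in state A_j (1-based).

    P is the transition matrix, mid[k-1] is the middle value m_k of u_k.
    """
    row = P[j - 1]
    reachable = [k for k, p in enumerate(row, start=1) if p > 0]
    if not reachable:                 # Rule 3: empty FLRG
        return mid[j - 1]
    if len(reachable) == 1:           # Rule 1: one-to-one FLRG
        return mid[reachable[0] - 1]
    # Rule 2: one-to-many; the A_j term uses the observed value Y(t-1)
    return sum(p * (y_prev if k == j else mid[k - 1])
               for k, p in enumerate(row, start=1))

def trend_adjustment(i, j, l, self_related):
    """D_t1 + D_t2 for a transition A_i -> A_j with basis interval l."""
    step = j - i
    d_t1 = 0.0
    if self_related and step != 0:    # rules (i) and (ii)
        d_t1 = l / 2 if step > 0 else -l / 2
    d_t2 = (l / 2) * step             # rules (iii) and (iv): +(l/2)s or -(l/2)v
    return d_t1 + d_t2

# Toy one-to-many example: three states with middle values 10, 20, 30
P = [[0.0, 1.0, 0.0], [0.5, 0.25, 0.25], [0.0, 0.0, 0.0]]
mid = [10.0, 20.0, 30.0]
f_toy = initial_forecast(2, P, mid, y_prev=22.0)

# Worked example from the text: A38 -> A39, l = 0.5, F = 180.17
adjusted = 180.17 + trend_adjustment(38, 39, 0.5, self_related=False)
print(f_toy, round(adjusted, 2))
```

Treating a single reachable state as Rule 1 before applying the weighted sum keeps a pure self-loop ($A_j \to A_j$) consistent with the one-to-one rule.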

Results and Discussion
The average-based fuzzy time series Markov chain based on frequency density partitioning was tested on Indonesian Sharia Stock Price Index (ISSI) forecasting to see whether this method could optimize the intervals used in the FLR and produce more accurate forecasts.
In forecasting with the average-based fuzzy time series Markov chain based on frequency density partitioning, the first step is to collect the ISSI historical data, which consist of 104 observations. The data are used to determine the universe of discourse. Next, the universe of discourse is divided into several subsets using the average-based method as follows:
As an example, the calculation for the first data point is as follows:
The same calculation is used for the second data point and so on; the total of the absolute differences is 304.62. The average absolute difference is then calculated as $\sum |D_{n+1} - D_n| / \text{number of differences} = 304.62/103$, which gives 2.96.
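The average-based calculation above can be sketched as follows, using the totals reported in the text (304.62 over 103 differences). The basis ranges in `basis_for` are an assumption for illustration, since the contents of Table 1 are not reproduced here; they should be replaced with the values from that table. All function names are hypothetical.

```python
def average_based_length(series):
    """Half of the mean absolute difference between consecutive observations."""
    lags = [abs(b - a) for a, b in zip(series, series[1:])]
    return (sum(lags) / len(lags)) / 2

def basis_for(length):
    """ASSUMPTION: hypothetical basis ranges standing in for Table 1."""
    if length <= 1.0:
        return 0.1
    if length <= 10:
        return 1.0
    if length <= 100:
        return 10.0
    return 100.0

def round_to_basis(length):
    """Round the interval length to the nearest multiple of its basis."""
    base = basis_for(length)
    return round(round(length / base) * base, 10)

# Totals reported in the text: sum of absolute differences and their count
half_average = (304.62 / 103) / 2
interval_length = round_to_basis(half_average)
print(round(half_average, 2), interval_length)
```

Python's built-in `round` rounds exact halves to the nearest even integer, which matters only when the length sits exactly midway between two multiples of the basis.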
Table 2 shows the middle value of each interval that has been repartitioned based on frequency density. This middle value is used to calculate the initial forecast value in the fuzzy time series. The following is an example of calculating the middle value $m_1$ of $u_1$:
Next, the fuzzy sets are defined; 53 fuzzy sets can be formed from the universe of discourse. Based on equation (8), the fuzzy sets formed are as follows:
The next step is to perform fuzzification; the fuzzification results are presented in Table 3.
Table 3 shows the results of the ISSI weekly data fuzzification; fuzzification converts crisp values into fuzzy values. For example, the observation for June 9, 2019 ($t = 1$), is 182.76, which falls in the interval $u_{60,2} = [182.5, 183]$. The interval $u_{60,2}$ has a membership degree of 1 in the fuzzy set $A_{38}$, so the fuzzified value for June 9, 2019, is $A_{38}$.
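A minimal sketch of the fuzzification step: each crisp value is assigned to the fuzzy set whose interval contains it. The edge list below is a toy partition around the worked example and is not the paper's 53-set partition, so the returned index (2) is local to this toy list rather than $A_{38}$. The name `fuzzify` and the half-open interval convention are assumptions.

```python
from bisect import bisect_right

def fuzzify(value, edges):
    """Return the 1-based index k of the interval [edges[k-1], edges[k]) holding value.

    `edges` must be sorted ascending; values at or beyond the ends are
    clamped to the first or last interval.
    """
    k = bisect_right(edges, value)
    return min(max(k, 1), len(edges) - 1)

# Toy partition: [182, 182.5), [182.5, 183), [183, 184]
edges = [182.0, 182.5, 183.0, 184.0]
state = fuzzify(182.76, edges)
print(state)
```

With sorted edges, the lookup is a binary search, so fuzzifying the whole series costs little even after the intervals have been subdivided.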
After fuzzification is obtained, the next step is to determine FLR and FLRG, presented in Tables 4 and 5.
Table 4 shows that the first data point is fuzzified to $A_{38}$ and the second to $A_{39}$, so the FLR is $A_{38} \to A_{39}$. The FLR plays an important role because it is used to determine the forecasting values.

Based on Table 5, all FLR formed in Table 4 are grouped into FLRG by their current state. For example, in the FLRG of $A_3$, where $A_3$ is the current state, the relationships are $A_3 \to A_4$ and $A_3 \to A_9$. These 2 FLR are grouped into 1 FLRG, namely, $A_3 \to A_4, A_9$.
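The grouping of FLR into FLRG can be sketched as below. The function names and the short state sequence are illustrative; the sequence is chosen only so that it reproduces the $A_3 \to A_4, A_9$ group mentioned in the text, and it is not taken from Table 4.

```python
def build_flr(states):
    """Pair each fuzzified state with its successor: (current, next)."""
    return list(zip(states, states[1:]))

def build_flrg(flr):
    """Group FLR by current state, keeping each next state once, in order seen."""
    groups = {}
    for current, nxt in flr:
        targets = groups.setdefault(current, [])
        if nxt not in targets:
            targets.append(nxt)
    return groups

# Toy sequence: A3, A4, A3, A9
flr = build_flr([3, 4, 3, 9])
flrg = build_flrg(flr)
print(flr)
print(flrg)
```

The FLRG keeps only distinct next states; repeated transitions still matter for the transition probabilities, which are counted from the FLR list itself.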
Table 6 shows the initial forecasting values for the period 16 June 2019 to 30 May 2021.These initial forecasting values were obtained from the defuzzification results of the FLRG group.
The next step is to adjust the forecasting trend. For example, for June 16, 2019, the next state is $A_{39}$ and the current state is $A_{38}$, so the adjustment is calculated with the jump-forward adjustment rule (iii), equation (19): $D_{t2} = (l/2)s = (0.5/2) \times 1 = 0.25$. For the remaining data, the adjustment values follow equations (18), (19), and (20).
After the adjustment value is obtained, the final forecast value is determined. The adjusted forecast value follows the rules in equation (21). For example, the adjusted forecast value for the second period is $F'_2 = F_2 \pm D_{t2} = 180.17 + 0.25 = 180.42$; proceeding in the same way gives the summary of the final forecasting results shown in Table 7.
Table 7 shows the final forecast values after adjustment. Each final forecast value is the sum of the initial forecast value and the adjustment value. After adjustment, the forecast values are closer to the actual data. The comparison of the final forecast values with the actual data can be seen in Figure 2.
Figure 2 compares the actual data with the forecasts of the average-based fuzzy time series Markov chain based on frequency density partitioning. The blue line shows the actual data, and the orange line shows the forecasts. Although the forecast values are not identical to the actual data, they follow almost the same pattern as the actual data.
The last step is to calculate the forecast accuracy using MSE and MAPE, formulas (9) and (10), respectively. For the average-based fuzzy time series Markov chain based on FDP, the MSE is 5.76 and the MAPE is 1.04%. This indicates good forecasting performance, as also seen in Figure 2, where the forecast values are close to the actual values.
The good accuracy of the fuzzy time series Markov chain is obtained because frequency density partitioning produces subintervals, so that the fuzzy values can be close to the crisp values. It is also supported by the use of the average-based method to determine the interval length.

Conclusions
Forecasting using the average-based fuzzy time series Markov chain based on the frequency density partition (FDP) has a good accuracy value; this can be seen from the MSE and MAPE values of 5.76 and 1.04%, respectively.This is a good accuracy value because the Markov chain fuzzy time series uses an average-based method to determine the length of the interval so that the length of the interval used is not just the perception of the researcher.The length of this interval is then partitioned based on frequency density to obtain a more optimal interval length.Determination of the length of the interval on the FTS-Markov chain plays an important role in forming a fuzzy logical relationship (FLR), and this FLR is used to determine the forecast value.
(a) Determine the absolute difference (lag) between data $n+1$ and data $n$ with the formula:

$$\mathrm{lag} = \left|\mathrm{Data}_{n+1} - \mathrm{Data}_n\right|. \quad (3)$$

(b) Determine the length of the interval:

$$\text{Length of interval} = \frac{\text{total lag}}{\text{number of data}} \div 2. \quad (4)$$

(c) Based on the interval length obtained from step (b), determine the basis value of the interval length according to Table 1.

Figure 1: Flow chart idea of average-based fuzzy time series Markov chain based on modified frequency density partitioning.

3.2. Forecasting Method. This study uses an average-based fuzzy time series Markov chain based on frequency density partitioning. The difference between this study and previous research lies in the use of frequency density partitioning in the average-based fuzzy time series Markov chain, which brings the fuzzy values closer to the crisp values so that the forecasts are more accurate. The steps are as follows [5, 10, 17, 22]:
(a) Define the universe of discourse
(b) Divide the universe of discourse into intervals using the average-based method
(c) Distribute all research data into the intervals
(d) Determine the frequency density
(e) Redivide the intervals based on frequency density
(f) Define the fuzzy sets. Let $A_1, A_2, \ldots, A_n$ be fuzzy sets that have linguistic values of a linguistic variable; the definition of the fuzzy sets $A_1, A_2, \ldots, A_n$ in the universe of discourse $U$ is as follows:

For the ISSI data, the first steps are as follows:
(a) Determine the smallest value ($D_{\min}$) and the greatest value ($D_{\max}$): $D_{\min} = 123.78$ and $D_{\max} = 192.86$; with $D_1 = 0.78$ and $D_2 = 0.14$, the universe of discourse is $U = [123.78 - 0.78, 192.86 + 0.14] = [123, 193]$
(b) Calculate the absolute difference between the data $D_n$ and $D_{n+1}$ using the equation $\mathrm{lag}\, D_n = |D_{n+1} - D_n|$;

(d) The length of the interval is then rounded according to the interval basis table (Table 1).
2.3. Frequency Density Partition. Chen and Hsu [5] developed a fuzzy time series method that redivides intervals based on frequency density. In their research, after the universe of discourse is partitioned into $n$ intervals of equal length, the intervals are subpartitioned based on frequency density by the following rule:
(a) the interval with the highest frequency is divided into 4 subintervals
(b) the interval with the second highest frequency is divided into 3 subintervals
(c) the interval with the third highest frequency is divided into 2 subintervals
2.4. Markov Chain. Markov first introduced the Markov chain in 1906.
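The frequency density rule of Section 2.3 (4, 3, and 2 subintervals for the three most populated intervals) can be sketched as follows. This is a simplified illustration: the function name, the tie-breaking by lower index, and the toy data are assumptions, and all other intervals are left unchanged.

```python
def frequency_density_partition(edges, data):
    """Split the three most populated intervals into 4, 3, and 2 equal parts.

    `edges` are the sorted boundaries of the equal-length intervals.
    ASSUMPTION: ties in frequency are broken in favor of the lower index.
    """
    n = len(edges) - 1
    counts = [0] * n
    for x in data:
        for k in range(n):
            is_last = k == n - 1
            if edges[k] <= x < edges[k + 1] or (is_last and x == edges[k + 1]):
                counts[k] += 1
                break
    ranked = sorted(range(n), key=lambda k: counts[k], reverse=True)
    pieces = {k: 1 for k in range(n)}
    for k, parts in zip(ranked, (4, 3, 2)):
        pieces[k] = parts
    new_edges = [edges[0]]
    for k in range(n):
        lo, hi = edges[k], edges[k + 1]
        for p in range(1, pieces[k] + 1):
            new_edges.append(lo + (hi - lo) * p / pieces[k])
    return new_edges

# Toy data: interval frequencies are 4, 3, 2, 1
edges = [0, 4, 8, 12, 16]
data = [1, 1, 1, 1, 5, 5, 5, 9, 9, 13]
new_edges = frequency_density_partition(edges, data)
print(new_edges)
```

The middle values used in forecasting are then the midpoints of consecutive entries in `new_edges`.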

Table 1: Basis mapping table.
The probability matrix of all of the FLRG has dimensions $n \times n$, where $n$ is the number of fuzzy sets, and can be written as $P = [P_{ij}]_{n \times n}$.