Forecasting the tendencies of a time series is a challenging task that leads to a better understanding of the series. This paper presents a hybrid model that combines support vector regression (SVR) with the Autoregressive Integrated Moving Average (ARIMA) model, formulated by a hybrid methodology. The proposed model is convenient for practical use. The research focuses on modeling the tendencies of time series for Thailand’s south insurgency. Empirical results using the monthly numbers of deaths, injuries, and incidents for Thailand’s south insurgency indicate that the proposed hybrid model yields a better estimated model than either the classical time series model or support vector regression alone. Forecast accuracy is measured by mean square error.

Time series modeling and forecasting are challenging tasks for describing the dynamic phenomena and pattern behavior of a time series. In recent years, accurately describing the trends of Thailand’s south insurgency has received growing attention, and many research papers have studied the unrest in southern Thailand. According to the database of Deep South Watch [

In this study, we aim to identify patterns and trends in Thailand’s south insurgency and to evaluate the accuracy of models for fitting and forecasting. To this end, we use traditional time series models such as the Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA), and Autoregressive Integrated Moving Average (ARIMA) models, which are also called Box-Jenkins models.

In general, the time series data of Thailand’s south insurgency can be categorized as nonstationary time series by using the Box-Jenkins methodology. An estimated model of these data can then be obtained by support vector regression (SVR). We aim to combine ARIMA and SVR into an adequate estimated model for forecasting the time series of Thailand’s south insurgency.

This paper is organized as follows. Section

Three basic methods for forecasting time series are the naïve model, the exponential smoothing model, and the ARIMA model. The first two are formulated in relation to a random walk. This section reviews the ARIMA model.

An autoregressive model of order

The Moving Average model of order

Autoregressive Moving Average model abbreviated as ARMA(

According to the original Box-Jenkins methodology, an integrated process is the stationary process obtained by differencing a nonstationary process. The stationary ARMA(

Plots of the autocorrelation function (acf) and partial autocorrelation function (pacf) are the main tools for identifying the parameters of AR, MA, ARMA, and ARIMA models. AR(

Opposite to AR(

A mixed process ARMA(

The identification procedure described in this section is important for diagnosing the models in our study.
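The identification step can be illustrated with a small numerical sketch. The code below computes the sample acf and, through the Durbin-Levinson recursion, the sample pacf of a simulated AR(1) series. The function names `sample_acf` and `sample_pacf`, the coefficient 0.7, and the simulated data are assumptions chosen for illustration only; they are not taken from this study.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.array([r[1]])          # coefficients of the order-1 fit
    pacf[1] = r[1]
    for k in range(2, max_lag + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        # Update the order-k coefficients from the order-(k-1) ones.
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

# Simulated AR(1) series (hypothetical data, not the insurgency series).
rng = np.random.default_rng(0)
n, phi = 2000, 0.7
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

acf = sample_acf(x, 10)
pacf = sample_pacf(x, 10)
limit = 1.96 / np.sqrt(n)                # approximate 95% significance limits
significant_pacf_lags = [k for k in range(1, 11) if abs(pacf[k]) > limit]
```

For an AR(1) process, the acf is expected to die down gradually while the pacf shows a large spike at lag 1 and stays small afterwards, which is the kind of pattern used for identification in this paper.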

In recent years, the forecasting models used in the literature can be classified into three categories: statistical models, artificial intelligence (AI) models, and hybrid models.

Statistical models are time series models, including the naïve model, AR model, MA model, ARIMA model, exponential smoothing, and the generalized autoregressive conditional heteroskedasticity (GARCH) volatility model. They use time series analysis to identify the pattern of a series and predict future values from that pattern.

ARIMA model is known as Box-Jenkins model [

AI models are the second kind of time series forecasting model, in particular artificial neural networks (ANNs), genetic algorithms (GAs), and support vector machines (SVMs). AI models can capture nonlinear patterns and improve forecast performance.

Many studies introduce a hybrid model in order to capture both the linear and nonlinear characteristics of a time series. Wang et al. [

A hybrid model is a combination of models formulated with a mixed methodology. Many studies suggest that a time series consists of linear

An estimated model of (

Zhang [

Let the dot product space

The regression problem is to find the best approximate model

The regression problem is classified as linear or nonlinear type. For the linear regression model, the best approximate model

Generally, in order to describe nonlinear relationship between input and output, the SVR allied

Performing SVR to fit linear regression

The following two propositions relate to the formulation of an estimated model. These propositions are modified from [

Given a regression training set

The constant

Given a regression training set

The parameter

The optimal regression model is obtained by substituting

The optimal regression model is

In this section, we formulate the proposed model. We begin with hybrid models, which combine several models in order to reduce the risk of using an inappropriate model, obtain more accurate results than a single model, and improve overall forecasting performance.
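As a rough illustration of how a linear model and SVR can be combined, the sketch below follows the common residual-based hybrid idea from the literature: a linear part is fitted first, and SVR then models what the linear part leaves unexplained. This is not necessarily the formulation derived in this paper. The linear part here is an AR(1) model on the first difference fitted by least squares (standing in for a full ARIMA estimation), and the helper names, the SVR settings, and the synthetic series are all assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import SVR

def fit_ar1_on_diff(y):
    """Least-squares AR(1) fit on the first difference; returns fits of y[2:]."""
    y = np.asarray(y, dtype=float)
    z = np.diff(y)                                   # z[i] = y[i+1] - y[i]
    X = np.column_stack([np.ones(len(z) - 1), z[:-1]])
    c, phi = np.linalg.lstsq(X, z[1:], rcond=None)[0]
    y_lin = y[1:-1] + c + phi * z[:-1]               # one-step-ahead fit of y[2:]
    return c, phi, y_lin

def lag_matrix(e, p):
    """Rows [e_{t-1}, ..., e_{t-p}] with targets e_t, for t = p..len(e)-1."""
    e = np.asarray(e, dtype=float)
    X = np.column_stack([e[p - j - 1:len(e) - j - 1] for j in range(p)])
    return X, e[p:]

def fit_hybrid(y, p=2, C=10.0, epsilon=0.1):
    """Linear fit plus an SVR correction learned from lagged linear residuals."""
    y = np.asarray(y, dtype=float)
    _, _, y_lin = fit_ar1_on_diff(y)
    resid = y[2:] - y_lin                            # part the linear model misses
    X, target = lag_matrix(resid, p)
    svr = SVR(kernel="rbf", C=C, epsilon=epsilon, gamma="scale").fit(X, target)
    y_hyb = y_lin[p:] + svr.predict(X)
    return y[2 + p:], y_lin[p:], y_hyb               # aligned actual, linear, hybrid

# Synthetic monthly series of length 40 (hypothetical, not the DSW data).
rng = np.random.default_rng(0)
t = np.arange(40)
y = 50.0 - 0.5 * t + 8.0 * np.sin(t / 2.0) + rng.normal(0.0, 2.0, size=40)

actual, y_lin, y_hyb = fit_hybrid(y)
mse_linear = float(np.mean((actual - y_lin) ** 2))
mse_hybrid = float(np.mean((actual - y_hyb) ** 2))
```

Both errors here are in-sample, so they only indicate how closely each model fits the training data; a forecasting comparison would need a held-out portion of the series.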

Assume that

Consider a time set

The under-study time series

In this research, we study the unrest in the four southern provinces of Thailand, namely Pattani, Yala, Narathiwat, and parts of Songkla. We consider the monthly numbers of deaths, injuries, and incidents in these provinces. At the time of this research, the latest data were available from Deep South Watch (DSW) [

Figure

Number of unrest incidents in the four southern provinces of Thailand (Pattani, Yala, Narathiwat, and Songkla) from 2005 to 2015.

The data series of our study consists of 40 months of deaths, injuries, and incidents in the four southern provinces of Thailand from September 2012 to December 2015.

Figure

Monthly number of deaths, injuries, and incidents for unrest in the four southern provinces of Thailand.

From Figure

The monthly numbers of injuries and incidents are apparently stationary. A candidate model for these two data series can be determined by plotting the acf and pacf. However, the monthly number of deaths exhibits a linear trend in the mean, with a clear downward slope.
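The effect of first differencing on a series with a linear trend in the mean can be checked with a short sketch. The series below is synthetic (a hypothetical downward trend plus noise over 40 months), not the deaths series from this study, and the slope comparison is only an informal check, not a formal stationarity test.

```python
import numpy as np

def difference(y, d=1):
    """Apply d rounds of first differencing: (1 - B)^d y_t."""
    return np.diff(np.asarray(y, dtype=float), n=d)

# Hypothetical series with a downward linear trend (not the DSW data).
rng = np.random.default_rng(1)
t = np.arange(40)
y = 60.0 - 0.8 * t + rng.normal(0.0, 3.0, size=40)

dy = difference(y)                       # 39 values: y[1]-y[0], ..., y[39]-y[38]

# Least-squares slope against time, before and after differencing.
slope_level = np.polyfit(t, y, 1)[0]
slope_diff = np.polyfit(t[1:], dy, 1)[0]
```

After differencing, the fitted slope should be close to zero while the mean of `dy` stays near the original trend slope, which is why the acf and pacf are examined on the differenced series.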

Figure

(a) Monthly number of deaths is plotted against its first differenced series, acf (b) and pacf (c) plots for the first difference in monthly number of deaths.

(a) Monthly number of injuries is plotted against its first differenced series, acf (b) and pacf (c) plots for the first difference in monthly number of injuries.

(a) Monthly number of incidents is plotted against its first differenced series, acf (b) and pacf (c) plots for its first difference in monthly number of incidents.

Plotting of the first differenced series (Figures

The acf for the first difference in monthly number of deaths tends to die down quickly, whereas the pacf shows a spike only up to lag 1, where a spike is regarded as significant when it falls outside the limits. This suggests that the first difference in monthly number of deaths can be modeled as an AR

Similarly, the first differenced series of injuries and incidents can be modeled as an AR

Table

Mean square error for fitting and forecasting the series.

Time series | ARIMA | SVR | Hybrid |
---|---|---|---|
Deaths series | 9.4383 | 7.0882 | 0.7922 |
Injuries series | 28.0352 | 20.4161 | 0.9921 |
Incident series | 41.8077 | 31.7669 | 1.469 |
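To make the comparison in the table concrete, the following sketch defines mean square error and computes the relative reduction in MSE of the hybrid model with respect to the other two models. The MSE values are copied from the table; the function names and the dictionary layout are illustrative assumptions.

```python
def mean_square_error(actual, predicted):
    """MSE = (1/n) * sum of (actual_i - predicted_i)^2."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def relative_reduction(baseline_mse, model_mse):
    """Fraction by which model_mse is lower than baseline_mse."""
    return 1.0 - model_mse / baseline_mse

# MSE values as reported in the table above.
table_mse = {
    "Deaths":    {"ARIMA": 9.4383,  "SVR": 7.0882,  "Hybrid": 0.7922},
    "Injuries":  {"ARIMA": 28.0352, "SVR": 20.4161, "Hybrid": 0.9921},
    "Incidents": {"ARIMA": 41.8077, "SVR": 31.7669, "Hybrid": 1.469},
}

reductions = {
    series: {
        "vs ARIMA": relative_reduction(row["ARIMA"], row["Hybrid"]),
        "vs SVR": relative_reduction(row["SVR"], row["Hybrid"]),
    }
    for series, row in table_mse.items()
}

for series, row in reductions.items():
    print(f"{series}: {row['vs ARIMA']:.1%} lower than ARIMA, "
          f"{row['vs SVR']:.1%} lower than SVR")
```

For all three series the reported hybrid MSE is lower than both the ARIMA and the SVR values.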

The convergence of the mean square error, calculated between the monthly numbers and an estimated model, is plotted with 2,500, 5,000

Fitting performance for monthly number of deaths (a) and injuries (b) with ARIMA

Setting

Predictive performance of SVR-ARIMA

The actual, fitted, and forecasted series by hybrid model for series of deaths.

In the same way, for monthly number of injuries, setting

Predictive performance of SVR-ARIMA

The actual, fitted, and forecasted series by hybrid model for series of injuries.

For monthly number of incidents, set

Predictive performance of SVR-ARIMA

The actual, fitted, and forecasted series by hybrid model for number of incidents.

The hybrid SVR-ARIMA model has been investigated in this study to formulate a time series model for the monthly numbers of Thailand’s south insurgency. In particular, we consider the first difference in monthly number of deaths, injuries, and incidents in Pattani, Yala, Narathiwat, and Songkla provinces over the 40 months from September 2012 to December 2015. According to the hybrid methodology, the SVR-ARIMA(

The test results of the estimated model are obtained from the proposed hybrid model and compared with the estimated model of the AR(

The authors declare that there are no conflicts of interest regarding the publication of this paper.

The authors gratefully acknowledge the Deep South Coordination Center (DSCC) and Deep South Watch (DSW) for providing the data. This research was supported by grant funds from the Centre of Excellence in Mathematics, the Commission on Higher Education, Thailand.