Short-Term Air Quality Prediction Based on Fractional Grey Linear Regression and Support Vector Machine

To predict daily air pollutant concentrations, a fractional multivariable model is established. A hybrid of the grey multivariable regression model with fractional order accumulation (FGM(0, m)) and the support vector regression (SVR) model is used to predict the air pollutants (PM10, PM2.5, and NO2) from December 31, 2018, to January 3, 2019, in Shijiazhuang and Chongqing. The absolute percentage errors (APEs) are used to determine the weights of FGM(0, m) and SVR. Meanwhile, the Holt-Winters model is used to predict the air pollutants for the same locations and period. A mean absolute percentage error (MAPE) of 0% to 20% indicates that a model has good fitting and prediction accuracy. The prediction MAPE of the hybrid model is less than 20%. Except for the PM2.5 concentration prediction in Shijiazhuang (13.7%), the MAPE between the forecast and actual values of the three air pollutants in Shijiazhuang and Chongqing is less than 10%.


Introduction
According to the statistical data in China [1], the 338 cities had good air quality (meeting the air quality standard) on an average of 79.3% of days, an increase of 1.3 percentage points compared with 2017. The proportion of days with heavy pollution was 2.2%, a decrease of 0.3 percentage points compared with 2017. The PM2.5 concentration was 39 μg/m3, a decrease of 9.3% compared with 2017. The concentration of PM10 was 71 μg/m3, a decrease of 5.3% compared with 2017. The air quality in China improved in 2018 on the whole, but only 121 of the 338 cities met the air quality standards, as shown in Table 1. When the concentrations of the air pollutants (PM10, PM2.5, and NO2) meet the standard, the air quality is regarded as good. Otherwise, the air quality is regarded as poor. The 24-hour air pollutant standard implemented since 2016 in China is shown in Table 1 [2]. In addition, there were 822 days of severe pollution, 20 more than in 2017. This indicates that the governance of air quality is still a problem that cannot be ignored.
In recent years, air quality has attracted more and more attention, and more and more research studies have been done on it. The impact of foreign direct investment and of research and development on China's industrial CO2 emission reduction has been studied and its trend has been predicted [3]. A seasonal stacked autoencoder model combining seasonal analysis and deep feature learning was proposed for forecasting the hourly PM2.5 concentration in Beijing [4]. An integrated short-duration memory neural network was proposed for the prediction of hourly PM2.5 concentration in Beijing [5]. The trend of the observational PM10 concentrations in Shimla city, India, was analyzed [6]. The predictive models can be divided into two categories: single models and hybrid models. Some scholars used a single model to study air quality. Multigene genetic programming was used to predict the concentrations of PM10 [7]. The grey Markov model was used to predict the concentration of air pollutants in Pingdingshan [8]. A single dependent variable partial least squares regression was used to predict the real-time PM2.5 concentration in Beijing [9]. The grey Holt-Winters model was used to predict the air quality indexes of Shijiazhuang and Handan [10]. A microscale land use regression model was used to predict NO2 concentrations in a heavily trafficked suburban area in Auckland, NZ [11]. The seasonal grey one-variable model with fractional order accumulation was used to predict the air quality indexes of Xingtai and Handan [12]. The optimized particle swarm was used to predict the concentration of air pollutants in Aburrá Valley, Colombia [13]. The empirical mode decomposition based on the multifractal detrended fluctuation analysis method was used to study the daily PM2.5 concentration in Hong Kong [14]. A land use regression model was used to estimate annual and seasonal PM1, PM2.5, and PM10 concentrations [15].
Many scholars combined two different models to study air quality. A hybrid of regression models and feedforward backpropagation models with principal component analysis was used to predict the daily PM10 concentrations [16]. The mixed model of information gain and least absolute shrinkage was used to predict the air quality index [17]. A linear model and an artificial neural network statistical model were developed and validated to forecast the short-term hourly PM10 concentrations in the city of Brescia (Italy) [18]. The mixed air quality assessment model was designed and applied to analyze the pollution sources of PM2.5 [19]. A mixed forecasting model of the daily air quality index considering air pollution factors in Beijing and Guilin was proposed [20]. A hybrid particle swarm optimization-support vector machine model based on a clustering algorithm was used to forecast the short-term atmospheric pollutant concentration in Beijing [21]. A hybrid multiresolution multiobjective ensemble model was used to forecast the daily PM2.5 concentrations [22]. The hybrid model of an autoencoder with bidirectional long short-term memory neural networks was used to predict the PM2.5 concentration [23].
In recent years, more and more hybrid models have been used by scholars for prediction. However, few of them combine an artificial intelligence algorithm with a statistical algorithm, and few scholars have used such a hybrid model to predict air pollutants. The M4 competition showed that the hybrid models with good prediction performance combine an artificial intelligence algorithm with a statistical algorithm. In order to improve the prediction accuracy, a hybrid of the grey multivariable regression model with fractional order accumulation [24] and the support vector regression model [25] (FGM(0, m)-SVR) is proposed in this paper to predict air pollutants (PM2.5, PM10, and NO2).
This paper is divided into five parts. In Section 2, the situation in Shijiazhuang and Chongqing is introduced. In Section 3, the hybrid model is introduced. In Sections 4 and 5, the calculation process and the results for Shijiazhuang and Chongqing are shown, respectively. Through the analysis of the calculation results, some suggestions for the air quality in Shijiazhuang and Chongqing are given in Section 6. Meanwhile, the conclusions are summarized in Section 6.

Location in Shijiazhuang and Chongqing.
Shijiazhuang is the capital city of Hebei Province. It is located in the North China Plain, adjacent to Beijing and Tianjin in the north, Bohai in the east, Taihang Mountain in the west, and the economic zone in the south. Shijiazhuang is the gateway to the capital, 273 kilometers away from Beijing. It is located between latitude 37°27′∼38°47′ and longitude 113°30′∼115°20′ (as shown in Figure 1). Shijiazhuang is one of the most polluted cities in China.
According to the statistics in 2018, Shijiazhuang ranked 168th among 169 cities with poor air quality in China.
Chongqing is a center of economy, finance, scientific and technological innovation, shipping, and commercial logistics in the upper basin of the Yangtze River. It is located in the inland southwest of China, bordering Hubei and Hunan in the east, Guizhou in the south, Sichuan in the west, and Shaanxi in the north. Chongqing is located between longitude 105°17′∼110°11′ and latitude 28°10′∼32°13′ (as shown in Figure 1). Chongqing is also a heavily polluted city. Compared with previous years, the air quality in Chongqing has improved significantly in 2019. But it is still a long way from the goal set by the Chongqing Ecology and Environment Bureau of keeping the number of days with good air quality in 2019 above 300. Chongqing has been known as "the city of fog," and its air quality ranks behind other cities in China.

Air Quality in Shijiazhuang and Chongqing.
The number of days with good air quality (up to the air quality standard) shown in Table 2 is obtained from the websites of the Shijiazhuang Environmental Protection Bureau (http://www.sjzhb.gov.cn) and the Chongqing Environmental Protection Bureau (http://www.cepb.gov.cn/), respectively. The days with good air quality in Shijiazhuang and Chongqing from 2014 to 2018 can be seen clearly in Figure 2.
As shown in Figure 2, despite the increasing governance efforts of the government, the number of days with bad air quality had been increasing since 2015. According to the statistical data of the Hebei Province Environment Protection Hall in 2018, the air quality of Shijiazhuang is the worst in Hebei Province. In addition, according to the statistics of China Environment Network, Shijiazhuang had the worst air quality among the 11 cities when ranked by the air quality composite index. However, an air quality target has been proposed in the "Three-year Action Plan for Shijiazhuang City to Win the Blue Sky Defense War (2018-2020)": the days with good air quality in Shijiazhuang will exceed 176 in 2019. The plan also states that Shijiazhuang will complete the targets of "The 13th Five-year for Economic and Social Development of the People's obligatory" for air environmental quality by 2020, that the emissions of the main atmospheric pollutants will be cut, and that the city will strive to move out of the last 10 of 169 cities in China in the air quality ranking. Therefore, how to predict the air quality effectively is particularly important. The number of days with heavy pollution in Chongqing is increasing year by year. The number of days with good air quality in Chongqing reached 316 in 2018. At the same time, the arrangements for the environmental protection work of 2019 were made in the Chongqing environmental protection teleconference on January 21, 2019: ensuring that the number of days with good air quality remains above 300 and that the average annual concentration of fine particulate matter is kept within 40 micrograms per cubic meter. From 2016 to 2018, the number of days with good air quality in Chongqing was 301, 303, and 316, respectively, which is inseparable from the government's governance measures. If the air pollutants can be predicted more accurately, air pollution control will be more effective.

The Construction of Model
Through accumulated generating operators, the FGM(0, m) model can transform the data from nonlinear to linear. Likewise, the SVR model can map nonlinear data into a space in which they can be treated linearly. Data from the same period of different years show historical parallelism. Processing the data through FGM(0, m) and SVR is expected to give better prediction accuracy, so the FGM(0, m) model and the SVR model are used to forecast the air pollutants in this paper. In China, a large number of people choose to travel around New Year's Day, and their travel is affected by the environment, so it is particularly important to predict the air pollutants effectively. Meanwhile, taking the air pollutant PM10 in Shijiazhuang as an example, the Holt-Winters model is used to calculate the fitting and predictive values. The basis of the Holt-Winters model is that a time series with a linear trend, seasonal change, and random fluctuation can be decomposed, and the components are combined with the exponential smoothing method to establish a forecasting model. In recent years, the Holt-Winters model has often been used to predict seasonal data, and it was used to predict the air pollutants in Shijiazhuang and Handan [10].
The Holt-Winters model was used to predict the air pollutant concentration (PM10) in Shijiazhuang, and the results are contrasted with those of the hybrid model in this paper.
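To make the decomposition idea behind the Holt-Winters baseline concrete, the following is a minimal sketch of the classical additive variant in plain Python. The seasonal period `m`, the smoothing constants, and the simple initialization are assumptions chosen for illustration; the text does not state which variant or settings were used in the comparison.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast (illustrative sketch).

    y: observed series covering at least two seasons of length m.
    alpha, beta, gamma: smoothing constants for level, trend, and season.
    Returns `horizon` forecasts following the last observation.
    """
    # Simple initialization from the first two seasons (an assumed choice).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]

    for t in range(m, len(y)):
        prev_level = level
        s = season[t % m]  # seasonal term from one period earlier
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s

    n = len(y)
    return [level + h * trend + season[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Toy series with a period of 2 and no trend (made-up numbers).
forecast = holt_winters_additive([10.0, 20.0, 10.0, 20.0, 10.0, 20.0],
                                 m=2, alpha=0.3, beta=0.1, gamma=0.2, horizon=2)
```

On a perfectly periodic series such as the toy input, the level stays at the seasonal mean and the forecast repeats the pattern, which is a quick sanity check on the update equations.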

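The Model of FGM(0, m).
The GM(0, m) model has been widely used in recent years. It was used to analyze the influence factors for the construction of the model in vocational high school [26] and to analyze the consumer experience of small- and medium-sized enterprises in the creative living industry [27]. In this paper, the hybrid model of FGM(0, m) and SVR is applied to predict air pollutants. The FGM(0, m) model is introduced as follows:
$$X_1^{(0)} = \left\{ x_1^{(0)}(1), x_1^{(0)}(2), \ldots, x_1^{(0)}(n) \right\}$$
is the system characteristics sequence, and
$$X_j^{(0)} = \left\{ x_j^{(0)}(1), x_j^{(0)}(2), \ldots, x_j^{(0)}(n) \right\}, \quad j = 2, 3, \ldots, m,$$
are the sequences of related factors. Then,
$$x_j^{(p/q)}(k) = \sum_{i=1}^{k} C_{k-i+p/q-1}^{k-i}\, x_j^{(0)}(i), \quad k = 1, 2, \ldots, n,$$
is the p/q order accumulation generate operator, where $C_{p/q-1}^{0} = 1$ and $C_{k-i+p/q-1}^{k-i} = \frac{(k-i+p/q-1)(k-i+p/q-2)\cdots(p/q+1)(p/q)}{(k-i)!}$, and
$$X_j^{(p/q)} = \left\{ x_j^{(p/q)}(1), x_j^{(p/q)}(2), \ldots, x_j^{(p/q)}(n) \right\}$$
is the p/q order accumulation sequence. Therefore, the equation of the FGM(0, m) model is given by
$$x_1^{(p/q)}(k) = \sum_{j=2}^{m} b_j\, x_j^{(p/q)}(k) + a,$$
where $a, b_2, b_3, \ldots, b_m$ are model parameters.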
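The FGM(0, m) procedure can be sketched in a few lines of plain Python: accumulate every series with a fractional order, estimate the parameters by least squares on the accumulated data, and undo the accumulation to return to the original scale. This is an illustrative sketch, not the authors' implementation; the function names, the least-squares solver via normal equations, and the toy data are assumptions, and the order `r` stands for p/q.

```python
def frac_accumulate(x, r):
    """r-order accumulation: out[k] = sum_i C(k-i+r-1, k-i) * x[i]."""
    n = len(x)
    coef = [1.0]  # coef[j] = C(j+r-1, j) = r(r+1)...(r+j-1) / j!
    for j in range(1, n):
        coef.append(coef[-1] * (r + j - 1) / j)
    return [sum(coef[k - i] * x[i] for i in range(k + 1)) for k in range(n)]


def solve_linear(A, y):
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def fit_fgm0m(x1, factors, r):
    """Least-squares estimate of [b_2, ..., b_m, a] on accumulated series."""
    y = frac_accumulate(x1, r)
    cols = [frac_accumulate(f, r) for f in factors] + [[1.0] * len(x1)]
    p, n = len(cols), len(y)
    BtB = [[sum(cols[i][k] * cols[j][k] for k in range(n)) for j in range(p)]
           for i in range(p)]
    Bty = [sum(cols[i][k] * y[k] for k in range(n)) for i in range(p)]
    return solve_linear(BtB, Bty)  # normal equations (B^T B) z = B^T y


def predict_fgm0m(params, factors, r):
    """Evaluate the model on accumulated factors, then invert the accumulation."""
    cols = [frac_accumulate(f, r) for f in factors]
    n = len(factors[0])
    acc = [sum(b * c[k] for b, c in zip(params[:-1], cols)) + params[-1]
           for k in range(n)]
    return frac_accumulate(acc, -r)  # accumulation of order -r undoes order r


# Toy data (made up): the dependent series is an exact linear mix of two factors.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x1 = [2.0 * u + 0.5 * v for u, v in zip(x2, x3)]
params = fit_fgm0m(x1, [x2, x3], r=0.5)
fitted = predict_fgm0m(params, [x2, x3], r=0.5)
```

Because the accumulation is a linear operator, a series that is an exact linear combination of the factors is recovered with a near-zero intercept, which makes the toy case a convenient check. With order 1 the operator reduces to the ordinary cumulative sum of the classical grey model.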

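The Model of SVR.
The SVR model uses kernel functions to map data that are not linearly separable into a high-dimensional space, in which a linear model can be fitted. In this paper, the SVR model is used to learn the historical data, and predictions are made according to the learning results. For the training sample set
$$\left\{ (x_i, y_i) \right\}_{i=1}^{n},$$
$x_i$ represents the input data and $y_i$ represents the output data of SVR. The function model of SVR is given by
$$x_1^{(0)} = w^{T} \varphi(x) + b.$$
Among them, $x_1^{(0)}$ represents the predicted value, $\varphi(\cdot)$ represents a nonlinear mapping function, $w$ represents the weight vector, and $b$ is the bias. $w$ and $b$ are obtained from the following formula:
$$\min_{w,\, b,\, \zeta^{+},\, \zeta^{-}} \ \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \zeta_i^{+} + \zeta_i^{-} \right)$$
$$\text{s.t.} \quad y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \zeta_i^{+}, \quad w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \zeta_i^{-}, \quad \zeta_i^{+}, \zeta_i^{-} \ge 0,$$
where $C$ is the penalty coefficient and $\varepsilon$ is the maximum error of the insensitive loss function. $\zeta_i^{+}$ and $\zeta_i^{-}$ represent the relaxation coefficients, and $n$ represents the sample size of the input data. The weight vector $w$ can be expressed as follows:
$$w = \sum_{i=1}^{n} \left( \beta_i^{*} - \beta_i \right) \varphi(x_i),$$
where $\beta_i^{*}$ and $\beta_i$ represent the Lagrange coefficients. The mathematical model equation of SVR is given by
$$x_1^{(0)} = \sum_{i=1}^{n} \left( \beta_i^{*} - \beta_i \right) K(x_i, x) + b,$$
where $K(\cdot)$ represents the kernel function, which calculates the inner product of two input vectors in the high-dimensional feature space. The sigmoid kernel function is used in this paper.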
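As a small illustration of the SVR prediction function described above, the sketch below evaluates the kernel expansion with a sigmoid kernel and the insensitive loss in plain Python. It does not train a model: the support vectors, dual coefficients, and bias are taken as given inputs. The kernel form tanh(g·⟨u, v⟩ + coef0) follows the common LIBSVM convention; treating the paper's parameter g as that kernel coefficient is an assumption.

```python
import math


def sigmoid_kernel(u, v, g, coef0=0.0):
    """Sigmoid kernel K(u, v) = tanh(g * <u, v> + coef0) (LIBSVM convention)."""
    return math.tanh(g * sum(a * b for a, b in zip(u, v)) + coef0)


def svr_predict(x, support_vectors, dual_coefs, b, g):
    """Kernel expansion f(x) = sum_i d_i * K(x_i, x) + b.

    dual_coefs[i] plays the role of the difference of the two Lagrange
    coefficients for support vector i.
    """
    return sum(d * sigmoid_kernel(sv, x, g)
               for sv, d in zip(support_vectors, dual_coefs)) + b


def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors smaller than eps cost nothing; larger errors cost linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical numbers, for illustration only.
value = svr_predict([0.4, 0.6], [[1.0, 2.0], [0.5, 0.1]], [0.7, -0.2], b=0.25, g=0.1)
```

In practice the dual coefficients come from solving the constrained optimization problem, for which an established SVR library is the usual choice; the functions here only show how a fitted model turns inputs into a prediction.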

The Basis of Model Weights.
Through the statistical interpretation of multiple prediction models, a robust short-term forecast of wind power generation under uncertainty was proposed [28], together with a new method to determine the weights of a hybrid model: the root mean square errors (RMSEs) of the models are used to determine their weights. The RMSE serves as a performance index of the accuracy of the forecasting models because it is sensitive to large prediction errors. In this paper, the APEs are used to determine the weights according to the same principle. The weighted results are then summed up to give the final result. The calculation process is as follows:
$$\mathrm{APE}_{\mathrm{FGM}(0,m)}(k) = \frac{\left| x_{\mathrm{FGM}(0,m)}(k) - x_1^{(0)}(k) \right|}{x_1^{(0)}(k)} \times 100\%, \qquad \mathrm{APE}_{\mathrm{SVR}}(k) = \frac{\left| x_{\mathrm{SVR}}(k) - x_1^{(0)}(k) \right|}{x_1^{(0)}(k)} \times 100\%,$$
where $x_{\mathrm{FGM}(0,m)}$ is the forecasting value of the FGM(0, m) model, $x_{\mathrm{SVR}}$ is the forecasting result of the SVR model, and $x_1^{(0)}$ is the actual value. The two calculation results are weighted according to their APEs, and then the weighted results are summed up. The APE is calculated between the final results and the actual values:
$$\mathrm{APE}(k) = \frac{\left| \hat{x}(k) - x_1^{(0)}(k) \right|}{x_1^{(0)}(k)} \times 100\%,$$
where $\hat{x}(k)$ denotes the final result of the hybrid model. The MAPE from December 20 to 30 in 2018 is taken as the fitting error. The MAPE from December 31, 2018, to January 3, 2019, is taken as the predictive error. The process of modeling is shown in Figure 3.
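The error measures and the weighting step can be illustrated with a short Python sketch. The APE and MAPE functions follow the standard definitions. The combination rule shown here, in which each model's weight is proportional to the other model's APE so that the more accurate model counts more, is an assumption made for illustration; the exact weighting formula is not recoverable from the text.

```python
def ape(pred, actual):
    """Absolute percentage error of one prediction, in percent."""
    return abs(pred - actual) / abs(actual) * 100.0


def mape(preds, actuals):
    """Mean absolute percentage error over a period, in percent."""
    return sum(ape(p, a) for p, a in zip(preds, actuals)) / len(actuals)


def combine(fgm, svr, actual):
    """Inverse-APE weighting of two forecasts (assumed rule, see lead-in)."""
    out = []
    for f, s, a in zip(fgm, svr, actual):
        e_f, e_s = ape(f, a), ape(s, a)
        # If both models are exact, any weight works; use equal weights.
        w_f = 0.5 if e_f + e_s == 0 else e_s / (e_f + e_s)
        out.append(w_f * f + (1.0 - w_f) * s)
    return out


# Made-up values: one model overshoots, the other undershoots or overshoots less.
hybrid = combine([110.0, 120.0], [95.0, 110.0], [100.0, 100.0])
hybrid_mape = mape(hybrid, [100.0, 100.0])
```

Under this assumed rule, two forecasts that err on opposite sides of the actual value combine to the actual value itself, while two forecasts on the same side give a result between them that lies closer to the better one. The rule needs the actual values, so it applies directly to the fitting period only.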

The Calculation Process and Results in Shijiazhuang
In order to verify the accuracy of the hybrid model, three air pollutants in Shijiazhuang are used (PM10, PM2.5, and NO2). Air quality has attracted more and more attention from the government in recent years. Meteorological conditions (favourable or unfavourable to pollutant dispersion) affect air quality, and because the meteorological indicators in the same period of different years are similar, they generally affect air quality in a similar way. The governance measures and efforts of the government in the same region also differ little from year to year, which helps explain why the air quality in the same region is similar. The same-period data (December 20 to January 3) from 2014 to 2017 are used as the independent variables ($X_j^{(0)}$), and the data from December 20, 2018, to January 3, 2019, are used as the dependent variable ($X_1^{(0)}$). The data from December 20 to December 30, 2018, are fitted, and the data from December 31, 2018, to January 3, 2019, are forecasted. The calculation process of PM10 in Shijiazhuang is taken as an example, and the original PM10 data for Shijiazhuang are shown in Table 3.
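The data layout described above can be checked with a few lines of Python: each year contributes a 15-day window from December 20 to January 3, of which the first 11 days are fitted and the last 4 are forecasted. The helper names below are hypothetical and serve only to make the window arithmetic explicit.

```python
from datetime import date, timedelta


def same_period_window(year):
    """Daily dates from December 20 of `year` to January 3 of the next year."""
    start = date(year, 12, 20)
    end = date(year + 1, 1, 3)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def split_fit_forecast(days):
    """Fit on dates up to December 30; forecast the remaining dates."""
    cutoff = date(days[0].year, 12, 30)
    fit = [d for d in days if d <= cutoff]
    forecast = [d for d in days if d > cutoff]
    return fit, forecast


# Windows for the factor years and for the dependent year, as described in the text.
factor_windows = {y: same_period_window(y) for y in range(2014, 2018)}
fit_days, forecast_days = split_fit_forecast(same_period_window(2018))
```

Aligning the windows by calendar position, rather than by absolute date, is what lets the earlier years act as related-factor sequences for the 2018 series.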

Figure 3: The process of modeling. Step 1: the fitting and predicted values are calculated by using FGM(0, m) and SVR, respectively; the SVR parameters c and g are optimized before the fitting and predictive values are recorded. Step 2: the results of the two models are given different weights. Step 3: the weighted results are summed up. Step 4: the APE between the results of the hybrid model and the real values is calculated.

Thus, the estimation equation of FGM(0, 4) is obtained, and the results are given in Table 4.

Calculation Result of PM10 by Using SVR Model.
In this paper, the firefly optimization algorithm and the support vector regression model are implemented by using a toolkit in Matlab. The two SVR parameters C and g are optimized by selecting the values that minimize the MAPE between the actual values and the fitting values. The optimal results are C = 3.2103 and g = 0.023268. The sigmoid kernel function is used for learning. The optimal calculation results are obtained after the data are processed 50 times.
SVR is used to calculate the fitting values from December 20, 2018, to December 30, 2018, and to forecast the values from December 31, 2018, to January 3, 2019. The data from December 20 to December 30 in 2014-2017 were used as the training set. Meanwhile, the data from December 31 to January 3 of the following year in 2014-2017 were used as the testing set. The MAPE was used as the standard to measure the fitting effect. Taking the concentration of PM10 in Shijiazhuang as an example, the MAPE between the fitting values and the original data is 0.48%. It can be seen that SVR has a good fitting effect and can be used for the forecasting of air pollutants. The calculation results are shown in Table 5.
The fitting and prediction MAPEs of FGM(0, m) are 38.4% and 17.7%, respectively. Those of SVR are 13.2% and 26.6%, respectively. However, those of the hybrid model are 12.3% and 4.4%, respectively. Although the fitting accuracy is improved only slightly, the predictive accuracy is improved clearly, so the prediction effect of the hybrid model is better. The fitting and prediction accuracies of the hybrid model are 22.1% and 37.1% higher than those of the Holt-Winters model, respectively. This indicates that the fitting and prediction accuracies of the hybrid model are significantly higher than those of the Holt-Winters model.

Calculation Results of PM2.5 by Using the Hybrid Model.
It can be seen from Table 7 that the fitting and prediction MAPEs are 14.1% and 13.7%, respectively. Both MAPEs of the hybrid model are small, which indicates that the fitting accuracy and prediction accuracy are high.

Calculation Results of NO2 by Using the Hybrid Model.
According to the calculation results in Table 8, the fitting and prediction MAPEs of the hybrid model for NO2 are 12.4% and 2.3%, respectively. The MAPE of the hybrid model is small, which indicates that both the fitting and prediction accuracies of the hybrid model are high.

Calculation Results of the Air Pollutants in Chongqing
Similarly, three pollutants of Chongqing (PM10, PM2.5, and NO2) are predicted by using the hybrid model. The results for PM10 are shown in Table 9.
The fitting and forecasting MAPEs are 22.6% and 1.9%, respectively. The MAPE of the forecasting values is small, which indicates that the prediction accuracy of the hybrid model is high.

Calculation Result of PM2.5 by Using Hybrid Model.
It can be seen from Table 10 that, for the hybrid model of the FGM(0, m) model and the SVR model, the fitting and forecasting MAPEs are 9.5% and 3.4%, respectively. The MAPEs of the fitting and forecasting values are small, and it can be seen that the precision of the hybrid model has been significantly improved.

Calculation Result of NO2 by Using Hybrid Model.
The hybrid model of the FGM(0, m) model and the SVR model is used to calculate the fitting values from December 20 to December 30, 2018, and the forecasting values from December 31, 2018, to January 3, 2019, respectively. As shown in Table 11, the fitting and prediction MAPEs are 4.7% and 8.4%, respectively. These errors are small, which indicates that the fitting and prediction accuracies of the hybrid model are high. The error levels of fitting and prediction are shown in Table 12.
According to the criteria mentioned above, the prediction MAPEs of two air pollutants (PM10 and NO2) in Shijiazhuang are 4.4% and 2.3%, respectively. The prediction MAPEs of three air pollutants (PM10, PM2.5, and NO2) in Chongqing are 1.9%, 3.4%, and 8.4%, respectively. Although the prediction MAPE of PM2.5 in Shijiazhuang is 13.7%, it still falls within the range regarded as good (below 20%). This indicates that the prediction accuracy of the hybrid model is relatively high.
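The summary above can be verified against the stated criterion with a short Python check. The MAPE values are the prediction errors reported in this paper; the 20% threshold is the "good accuracy" bound quoted in the abstract, and the variable names are illustrative.

```python
# Prediction MAPEs (percent) of the hybrid model, as reported in the text.
reported_mape = {
    ("Shijiazhuang", "PM10"): 4.4,
    ("Shijiazhuang", "PM2.5"): 13.7,
    ("Shijiazhuang", "NO2"): 2.3,
    ("Chongqing", "PM10"): 1.9,
    ("Chongqing", "PM2.5"): 3.4,
    ("Chongqing", "NO2"): 8.4,
}

GOOD_THRESHOLD = 20.0  # MAPE below 20% is regarded as good accuracy

# Every city-pollutant pair should fall in the good range.
all_good = all(v < GOOD_THRESHOLD for v in reported_mape.values())

# Pairs whose MAPE is not below 10%.
above_10 = [key for key, v in reported_mape.items() if v >= 10.0]
```

The check confirms that all six prediction MAPEs are below 20% and that PM2.5 in Shijiazhuang is the only case that is not below 10%.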

Conclusions and Suggestions
6.1. Suggestion for Shijiazhuang. In 2018, Shijiazhuang ranked 168th among 169 key cities with poor air quality in China, but this year, it plans to move out of the last 10. The plan represents an effort to improve the atmosphere. Combined with the specific situation of Shijiazhuang, suggestions are given from the following aspects: (1) Firstly, as a city with a relatively dense population, Shijiazhuang should actively reduce the burden on the city and guide the transfer of population to surrounding counties and cities. This will not only reduce the pressure on the city but also reduce the domestic, traffic, and environmental pollution of Shijiazhuang. (2) The transformation of the industrial structure should be accelerated, and the shift of urban development from heavy to light industry should be promoted. In addition, "reducing the weight" of the city should be taken seriously. The government should actively guide the transformation from high-emission industries to emerging industries and environmental protection industries.
6.2. Suggestion for Chongqing. Chongqing is a city with developed tourism and a developed economy, and the annual pollution brought by tourism cannot be underestimated. Therefore, it is necessary to reduce the content of the pollutants in the air. Suggestions are given from the following aspects: (1) Firstly, the main sources of pollution, such as dust, coal, and industrial pollution, should be cleaned up, because they increase the pollutant content in the air and eventually lead to the decline of the air quality. (2) Secondly, because of the developed tourism in Chongqing, the traffic pollution caused by the huge population flow cannot be ignored. In order to reduce air pollution, citizens and visitors should be encouraged to take public transportation instead of private cars.

Conclusions
The M4 competition in 2019 has shown that hybrid models are of great research significance and practicability. The hybrid models with good prediction effect in M4 combine an artificial intelligence algorithm with a statistical algorithm. It is shown that the forecasting effect of the hybrid model (FGM(0, m)-SVR) is better. The hybrid model is used to predict three air pollutants (PM10, PM2.5, and NO2) in Shijiazhuang and Chongqing, and it is shown that the prediction accuracy of the hybrid model is significantly higher than that of the single models. The hybrid model can also be used to predict other air pollutants in other cities.

Data Availability
The data used in this study can be accessed via the websites of the Shijiazhuang Environmental Protection Bureau (http://www.sjzhb.gov.cn) and the Chongqing Environmental Protection Bureau (http://www.cepb.gov.cn/).

Conflicts of Interest
The authors declare that they have no conflicts of interest.