Forecasting Volatility with Time-Varying Coefficient Regressions

We extend the heterogeneous autoregressive (HAR)-type models by explicitly considering the time variation of coefficients in a Bayesian framework, and we comprehensively compare the performance of these time-varying coefficient models with that of constant coefficient models in forecasting the volatility of the Shanghai Stock Exchange Composite Index (SSEC). The empirical results suggest that time-varying coefficient models generate more accurate out-of-sample forecasts than the corresponding constant coefficient models. By capturing and studying the time series of the time-varying coefficients of the predictors, we find that the coefficients (predictive abilities) of heterogeneous volatilities are negatively correlated and that the leverage effect is insignificant or inverted during certain periods. Portfolio exercises also demonstrate the superiority of time-varying coefficient models.


Introduction
Volatility is the key input variable for risk assessment, asset pricing, and portfolio allocation models. The early classic models are GARCH-type [1][2][3] and stochastic volatility (SV) [4] models; because high-frequency data were unavailable, these models are based on daily or weekly returns. The omission of informative intraday data makes these parametric volatility models less attractive. Volatility is an unobservable variable, and Andersen and Bollerslev [5] and Andersen et al. [6] suggested the sum of intraday squared returns as a proxy of volatility and named it realized volatility (RV). It makes volatility observable ex post, and the accuracy of RV is much higher than that of proxies based on low-frequency data, such as squared returns and the intradaily range.
Corsi [7] built the heterogeneous autoregressive (HAR) model according to the heterogeneous market hypothesis. It is an AR(22) model that is restricted by economic interpretations. The model has 3 regressors, lagged daily, weekly, and monthly RV, whose coefficients measure the impacts of short-term, medium-term, and long-term investors, and it can be easily estimated using the ordinary least squares (OLS) method. In spite of its simplicity, the HAR model not only successfully captures the main features of volatility, such as long memory, multiscaling behavior, and fat tails, but also produces smaller forecast errors than the GARCH-type and SV models.
Some researchers extend the basic HAR-RV model by adding components [8][9][10][11][12], such as negative returns, jump variation, signed semivariances, and overnight returns, and these extended models improve volatility forecasting in different respects. The availability of various modeling approaches for volatility forecasting leads to model uncertainty for both researchers and practitioners, and model averaging approaches (i.e., (trimmed) mean combination, the discount mean square prediction error (DMSPE) combining method, the triangular weighting (TW) method, and Bayesian model averaging) have been shown empirically to be more effective in volatility forecasting than a single model in spot, futures, energy, and commodity markets [13][14][15][16].
It is well documented that leverage effects, volatility persistence, mean reversion, structural breaks, etc. are typical features of the volatility process of asset prices [17]. Structural breaks in volatility imply that the coefficients of volatility models vary over time. In general, not only structural breaks but also noisy proxies, nonlinearity, and model specification errors may result in time-varying coefficients. Some researchers studied the effect of neglected coefficient changes on the persistence and level of volatility [18][19][20]. They all suggest that it is reasonable to treat the coefficients as time varying. Granger [10] proves that any nonlinear functional form can be replaced by a model that is linear in variables but has time-varying coefficients. To our knowledge, few studies relate the time-varying coefficient methodology to the HAR-type models.
Our study makes 3 contributions to the field of volatility prediction. First, previous studies always use rolling or recursive window regressions to implement the out-of-sample volatility forecast, which treats the coefficients of the models as constant within each window. In contrast, we assume that the predictive abilities of the predictors are time varying and build time-varying coefficient (TVC) HAR-type models. The predetermined variables represent a kind of model uncertainty. We address this issue with the Bayesian model averaging approach. Inspired by the work of Raftery et al. [21], we introduce a forgetting factor into the state-space model; the forgetting factor not only lets the coefficients evolve at a reasonable speed by reducing the impact of obsolete data but also simplifies the calculation of posterior distributions. Although Markov regime switching (MRS) admits time-varying coefficients, this method is ad hoc, neither systematic nor helpful in understanding the real changes of these coefficients over time. According to the degree of time variation in coefficients, we compare the performances of 3 types of models in volatility forecasting: constant-coefficient (CC), MRS, and TVC HAR-type models.
Second, by investigating the coefficient series of the TVC-HAR-type models, we find that the coefficients (predictive abilities) of heterogeneous volatilities are negatively correlated and that the leverage effect is insignificant or inverted during specific periods. Choi et al. [22] point out that, due to the presence of structural breaks, the persistence of volatility may be overstated. After a huge shock, the persistence of volatility and the impact of historical volatility on future volatility will be significantly weakened. Therefore, the predictive ability of historical volatility for future volatility may change over time. Partial correlation coefficients of the models' coefficient series show that the predictive abilities of lagged short-, medium-, and long-term volatility for future volatility are negatively correlated. Negative returns always have a greater impact on volatility than positive returns in the stock market, which is sometimes ascribed to a leverage effect.
The "leverage parameter" (i.e., the coefficient of negative returns) is always assumed to be constant. Corsi et al. [9] find that leverage effects are time varying and that stronger effects are empirically related to higher volatility regimes. Negative returns increase future volatility of the SSEC on the whole, but by investigating the time series of the "leverage parameter," we find that the leverage effect is insignificant or inverted in specific periods. Few studies examine the time-varying predictive abilities of the variables in volatility forecasting.
Our last contribution to the literature is that we not only statistically evaluate the performances of the corresponding models but also examine their economic significance. In evaluation exercises, most previous studies simply use the measures suggested by statisticians, such as mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percent error (MAPE), whereas we use the loss functions suggested by Patton [23], which are robust to market microstructure noise. In practice, investors care about the usefulness of a volatility forecast. Following Ferreira [24], we examine whether the volatility forecasts can be used to improve a mean-variance investor's portfolio. A mean-variance investor decides at the end of period $t$ how to allocate his or her assets between the stock index and a risk-free asset in period $t+1$. According to the investor's utility function, the optimal weight of the stock index in this portfolio depends on the forecast of volatility. The results show that considering the time variation of coefficients gives the mean-variance investor a greater excess return and utility.
Our data are 5-min intraday prices of the Shanghai Stock Exchange Composite Index (SSEC). HAR-RV and its 6 extensions are used to forecast 1-day-ahead volatility. Four robust loss functions, including mean square error (MSE) and quasi-likelihood (QL), which are robust to the noisy proxy RV, are used to evaluate the performances of these volatility models. The average losses of the models that consider the time variation of coefficients are always the smallest, indicating that they generate the most accurate out-of-sample volatility forecasts among the CC, MRS, and TVC models. The performances of the MRS-type models lie between those of the CC- and TVC-type models.
There are 3 heterogeneous volatility components in the HAR model. Time-varying persistence of volatility results in the time-varying predictive ability of each component. By studying the series of time-varying coefficients, we find that the coefficients of the heterogeneous volatility components are negatively correlated, indicating that their predictive abilities are negatively correlated. We use our method to capture the time series of the "leverage parameter," which measures the correlation between price shocks and volatility, and find that downside risk increases future volatility of the SSEC on the whole, but the leverage effect is insignificant or inverted during certain periods. This finding is suggestive of a time-varying correlation between price and variance.
We use excess return and certainty equivalent return (CER) to measure the performances of these models in portfolio exercises. We find that investors who allocate their assets according to the forecasts from TVC-type models always earn a higher excess return and CER. In the Chinese stock market, which is a typical example of emerging markets, investors with a lower risk aversion coefficient earn a higher excess return. Overall, the model that includes all the predictors and considers the time variation of coefficients outperforms the other models not only statistically but also economically. The remainder of the paper is structured as follows. Section 2 provides the background of RV and HAR-type models and introduces our time-varying coefficient model. Section 3 presents the empirical analysis, in which we compare the performances of the time-varying coefficient HAR-type models with those of the constant coefficient and Markov regime switching HAR-type models through statistical and economic measures and report our empirical findings. Section 4 concludes the study.

Realized Volatility and the HAR Model.
In an arbitrage-free market, consider an asset whose log price process $p_t$ follows a continuous-time jump diffusion process given by the following stochastic differential equation [25]:

$$dp_t = \mu_t\,dt + \sigma_t\,dW_t + J_t\,dq_t, \tag{1}$$

where $\mu_t$ is a continuous and locally bounded variation process, $W_t$ denotes a standard Brownian motion, $\sigma_t$ denotes the instantaneous volatility, which is a stochastic process independent of $W_t$, $q_t$ is a counting process whose intensity $\lambda_t$ is time varying, $J_t$ denotes the size of a jump in the price process, $dq_t = 1$ refers to a jump at time $t$, and $dq_t = 0$ refers to no jump. The increment of the quadratic variation (QV) from $t-1$ to $t$ is defined as

$$QV_t = \int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1<s\le t,\ dq_s=1} J_s^2, \tag{2}$$

where the first term $\int_{t-1}^{t} \sigma_s^2\,ds$ is the integrated variance (IV) and the second term $\sum_{t-1<s\le t,\ dq_s=1} J_s^2$ is called the jump variation. To calculate RV, which is defined by Andersen et al. [6], without loss of generality, we normalize the daily time interval to 1 and then divide it into $M$ periods. The length of each period is $\Delta = 1/M$. The $j$-th log return on the $t$-th day is

$$r_{t,j} = p_{t-1+j\Delta} - p_{t-1+(j-1)\Delta}, \quad j = 1, \ldots, M.$$

RV is the sum of the corresponding $1/\Delta$ high-frequency intraday squared returns:

$$RV_t = \sum_{j=1}^{M} r_{t,j}^2. \tag{3}$$

As emphasized by quadratic variation theory [5,6], the realized variation converges in probability to the quadratic variation as $M \to \infty$:

$$RV_t \xrightarrow{\ p\ } \int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1<s\le t,\ dq_s=1} J_s^2. \tag{4}$$

For the HARCH model, Dacorogna et al. [26] argued that long- and short-horizon volatility propagates asymmetrically. Motivated by the HARCH model and RV, Corsi [7] built the HAR-RV model, which is almost the standard model for volatility modeling and forecasting.
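As a minimal sketch of the RV calculation described above, the following computes the sum of squared intraday log returns for a single day. The function name, the input layout (a list of the day's log prices, oldest first), and the toy prices are illustrative assumptions rather than part of the study.

```python
import math


def realized_variance(log_prices):
    """Sum of squared intraday log returns for one trading day.

    log_prices: the M + 1 log prices observed during the day, oldest first
    (a hypothetical input layout chosen for this sketch).
    """
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)


# Toy day with M = 4 intervals; the prices are made up for illustration.
prices = [100.0, 101.0, 100.5, 100.5, 102.0]
rv = realized_variance([math.log(p) for p in prices])
```

With 5-minute data, each day would contribute one such value, and the daily series of these values is what the HAR-type regressions take as input.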
There are 3 regressors in the HAR-RV model: lagged one-day volatility $RV_t$, lagged one-week volatility $RV_{t:t-9}$, and lagged one-month volatility $RV_{t:t-21}$, where $RV_{t:t-h} = (1/h)\sum_{j=0}^{h} RV_{t-j}$. The HAR-RV model proposed by Corsi [7], which is a special case of the autoregressive (AR) model obtained by imposing economically meaningful restrictions, is expressed as

$$RV_{t+1} = \beta_0 + \beta_1 RV_t + \beta_2 RV_{t:t-9} + \beta_3 RV_{t:t-21} + \varepsilon_{t+1}. \tag{5}$$
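The construction of the three HAR regressors can be sketched as follows. The helper names are hypothetical, the window indices follow the subscripts used in the text, and averaging over the h + 1 values in each window is an assumption of this sketch, not a statement of the authors' exact normalization.

```python
def window_mean(rv, t, h):
    """Average of rv[t-h], ..., rv[t] (h + 1 values; an assumed convention)."""
    window = rv[t - h : t + 1]
    return sum(window) / len(window)


def har_design(rv, week_lag=9, month_lag=21):
    """Build HAR rows [1, RV_t, weekly mean, monthly mean] and targets RV_{t+1}.

    rv: list of daily realized variances, oldest first.
    week_lag, month_lag: window indices taken from the text's subscripts.
    """
    X, y = [], []
    # Start once the longest window is available; stop one day early so that
    # the next-day target rv[t + 1] exists.
    for t in range(month_lag, len(rv) - 1):
        X.append([1.0, rv[t], window_mean(rv, t, week_lag), window_mean(rv, t, month_lag)])
        y.append(rv[t + 1])
    return X, y
```

The rows of `X` and the targets `y` can then be passed to any least-squares routine to obtain constant-coefficient estimates.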

Extensions of the HAR Model.
The simplicity of the model makes it easily extendable by adding predictors. Our choice of additional predictive variables is guided by previous academic studies.
There are many possible combinations of these predictors, so we build 2 types of models. The 1st type includes only a single class of additional variables, and the 2nd type is a model that includes all the additional variables.
Asymmetry is a well-documented stylized fact about stock volatility: positive and negative shocks have different impacts on volatility. Black [27] named it the leverage effect. As in the EGARCH and GJR-GARCH models, we add the negative return $I(r_t < 0)|r_t|$ as an additional predictor to the basic HAR model (5) and build the LHAR-RV model (6), where "L" stands for leverage.
Andersen et al. [8] add jump variation to the basic HAR model and build the HAR-RV-J model. Their empirical analyses of equity index returns suggest that the jump variation is highly important but less persistent than the integrated variance, and that extracting the jump variation results in significant out-of-sample forecast improvements. The HAR-RV-J model (7) adds $J_t$, an estimator of the jump variation, to the regressors of the basic HAR model (5). Andersen et al. [8] proved that, as the sampling frequency of the underlying returns increases, this estimator converges to the jump variation.
The information flow about stocks is continuous, but major stock trading is limited to just a few hours a day. Many financial and economic events take place during close-to-open periods, and overnight information has an important influence on the future volatility of stock markets. Emerging markets like China and Brazil are inevitably affected by mature markets such as the United States and Britain. The trading hours of the US stock market fall when the Chinese stock market is closed. Overnight returns have been used as a proxy of overnight information flow to enhance the forecast accuracy of volatility models in recent literature [12,28,29]. Tseng et al. [28] argued that the impact of overnight returns on future volatility is also asymmetric. Therefore, negative overnight returns are added to the basic HAR model, and the resulting model (8) is termed LHAR-RV-OR.
RV, which is an even function of high-frequency intraday returns (a sum of squared returns), neglects the fact that positive and negative returns have different impacts on future volatility. Barndorff-Nielsen et al. [30] developed a new measure, realized semivariance, which is computed separately from positive and negative intraday returns, and named the two parts "good volatility" and "bad volatility," respectively. Their empirical study shows that these measures produce significantly better performance in out-of-sample volatility forecasting. Supposing the price process $p_t$ follows (1), they decompose $RV_t$ into the signed realized semivariances $RS_t^+$ and $RS_t^-$:

$$RS_t^+ = \sum_{j=1}^{M} r_{t,j}^2\, I(r_{t,j} > 0), \tag{9}$$

$$RS_t^- = \sum_{j=1}^{M} r_{t,j}^2\, I(r_{t,j} < 0), \tag{10}$$

where $I(\cdot)$ is the indicator function. According to equations (3), (9), and (10), $RV_t = RS_t^+ + RS_t^-$. They proved that, as the sampling frequency $M \to \infty$,

$$RS_t^+ \xrightarrow{\ p\ } \frac{1}{2}\int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1<s\le t} J_s^2\, I(J_s > 0), \tag{11}$$

$$RS_t^- \xrightarrow{\ p\ } \frac{1}{2}\int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1<s\le t} J_s^2\, I(J_s < 0). \tag{12}$$

We replace the predictor $RV_t$ in (5) with $RS_t^+$ and $RS_t^-$ to test whether the positive and negative semivariances have asymmetric influences on future volatility and build the AHAR-RV model, where "A" stands for asymmetric:

$$RV_{t+1} = \beta_0 + \beta_1^+ RS_t^+ + \beta_1^- RS_t^- + \beta_2 RV_{t:t-9} + \beta_3 RV_{t:t-21} + \varepsilon_{t+1}. \tag{13}$$

Model (13) encompasses the basic HAR model [7] by setting $\beta_1^+ = \beta_1^-$. More importantly, they defined the signed jump variation, which measures the difference between positive and negative jump variations. It is positive (negative) when the price is dominated by upward (downward) jumps:

$$\Delta J_t = RS_t^+ - RS_t^-. \tag{14}$$

The advantage of the signed jump variation is that it does not require knowing or estimating the corresponding jump variation, which may be noisy. To investigate the predictive ability of the signed jump variation, Barndorff-Nielsen et al. [30] extend the HAR model by adding the signed jump variation and name the result HAR-RV-SJ, equation (15), where "SJ" stands for signed jump. Because $RV_t$ and $\Delta J_t$ can be linearly represented by $RS_t^+$ and $RS_t^-$, equation (15) and equation (13) are equivalent, and the empirical study also verifies that. We do not report the results for the HAR-RV-SJ model, equation (15).
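The decomposition into signed semivariances can be illustrated with a short sketch. The function names and the toy returns are assumptions made for illustration; the only logic used is the split of squared intraday returns by sign described above.

```python
def realized_semivariances(returns):
    """Split the sum of squared intraday returns by the sign of each return.

    Returns (rs_pos, rs_neg); zero returns contribute to neither part.
    """
    rs_pos = sum(r * r for r in returns if r > 0)
    rs_neg = sum(r * r for r in returns if r < 0)
    return rs_pos, rs_neg


def signed_jump(returns):
    """Signed jump variation: positive minus negative semivariance."""
    rs_pos, rs_neg = realized_semivariances(returns)
    return rs_pos - rs_neg


# Toy intraday returns (made up): the two parts add back up to RV.
toy_returns = [0.01, -0.02, 0.0, 0.03]
rs_pos, rs_neg = realized_semivariances(toy_returns)
rv_total = sum(r * r for r in toy_returns)
```

Because the two semivariances sum to RV and their difference is the signed jump variation, including all four series in one regression would make the regressors linearly dependent.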
The sequential information arrival hypothesis [31] and the noise trading hypothesis [32] both imply that a causal relation between return volatility and trading volume exists. Lamoureux and Lastrapes [33] find volume, a proxy of information flow, to be strongly significant when it is inserted into the ARCH variance process. Le and Zurbruegg [34] propose introducing trading volume into ARCH-type models to improve volatility forecasting, and their empirical results are robust to different measures of volatility and trading volume. Wang and Huang [35] find that daily integrated variance (IV) is positively related to trading volume, whereas jump variation is negatively related to trading volume and contains some "public information." These studies all indicate that trading volume can be exploited for forecasting volatility. Following the approach of Lamoureux and Lastrapes [33], we use the turnover ratio as the proxy of trading volume, which represents the arrival of information, and build the HAR-RV-V model (16), where "V" stands for volume.
According to $RV_t = RS_t^+ + RS_t^-$ and equation (14), including the predictors $RV_t$, $RS_t^+$, $RS_t^-$, and $\Delta J_t$ in one model results in perfect collinearity. This is verified in our empirical analysis: these coefficients can be estimated neither by OLS nor by our time-varying coefficient method. Because $RS_t^+$ and $RS_t^-$ contain the information of $RV_t$ and $\Delta J_t$, to avoid perfect collinearity, we build the model HAR-RV-ALL (17), which contains all the predictors mentioned above except $RV_t$ and $\Delta J_t$.

Time-Varying Coefficient Regression.
An important purpose of our study is to evaluate whether the time-varying coefficient paradigm can be used to improve volatility forecasting. Suppose $k$ is the number of predictive variables that a constant-coefficient model contains (including the intercept term); to reduce the risk of model selection, we build and estimate $2^k - 1$ time-varying coefficient submodels, one for each nonempty combination of the predictive variables (the model that contains no predictor is not considered). Then, we use the Bayesian criterion to calculate the posterior probability weight of each submodel (similar to the study by Avramov [36]), and the final predictive mean of volatility is the weighted average of the predictive means of all submodels. We first describe the characteristics of a time-varying coefficient submodel that contains a given selection of predictive variables, and then we introduce the Bayesian model selection criterion.
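The count of $2^k - 1$ submodels can be checked with a small enumeration sketch. The predictor labels below are hypothetical placeholders for an intercept, the three HAR components, and one additional variable; only the counting logic is taken from the text.

```python
from itertools import combinations


def enumerate_submodels(predictors):
    """Return every nonempty subset of the predictors as a tuple of names."""
    subsets = []
    for size in range(1, len(predictors) + 1):
        subsets.extend(combinations(predictors, size))
    return subsets


# Hypothetical labels for a model with k = 5 predictors (intercept included).
names = ["const", "RV_d", "RV_w", "RV_m", "neg_ret"]
submodels = enumerate_submodels(names)
n_submodels = len(submodels)  # 2**5 - 1 nonempty subsets
```

Each tuple identifies one submodel whose coefficients would be filtered separately before the forecasts are combined.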
It is natural to relate the time-varying coefficients to the state-space model, so we build the time-varying coefficient submodel as

$$RV_{t+1} = F_t^{\prime}\theta_t + \varepsilon_{t+1}, \quad \varepsilon_{t+1} \sim N(0, V), \tag{18}$$

$$\theta_t = \theta_{t-1} + \eta_t, \quad \eta_t \sim N(0, W_t), \tag{19}$$

where $F_t$ is the $k \times 1$ vector of all the predictors and $\eta_t$ is the state innovation. Suppose $\theta_t$ is the $k \times 1$ vector of coefficients which determine the impact of $F_t$ on $RV_{t+1}$ at time $t$; then $\theta_t$ evolves according to equation (19). For example, for equation (6), there are 5 predictors in the model, and we build $2^5 - 1 = 31$ time-varying coefficient submodels. $F_t$ can be any combination of the predictive variables. In observational equation (18), the innovations are normally distributed with mean 0 and variance $V$, where $V$ denotes the unknown conditional variance; similarly, $W_t$ denotes the time-varying conditional variance matrix in state equation (19), and the specification is similar to the Bayesian framework for stochastic volatility suggested by Johannes, Korteweg, and Polson [37]. Equation (19) suggests that $\theta_t$ follows a random walk process that can capture various variations of coefficients. Specifying a normal distribution as the prior for $\theta_0$ and an inverse-gamma distribution for the observational variance $V_0$ results in a conjugate Bayesian analysis. It means that the prior and posterior distributions for $\theta_0$ and $V_0$ are from the same family of distributions [38]. Suppose that $\mathcal{F}_t$ is the information set available at time $t$; then $\mathcal{F}_t$ contains $RV_t$ and all the corresponding predictive variables $F_t$ up to time $t$ and the priors $\theta_0$ and $V_0$. At time $t = 0$, according to the study by Cremers [39] and West and Harrison [40], we set a conjugate normal-inverse-gamma prior, where $IG(\nu/2, \kappa/2)$ denotes the inverse gamma distribution with shape parameter $\nu$ and scale parameter $\kappa$, $I$ is the identity matrix, and $k$ is the number of predictive variables in the model.
At time t, the forecast of RV t+1 can be calculated by integrating the predictive density of RV t+1 over the range of coefficients vector θ and observational variance V (see Appendix for more details about the mathematics of the time-varying coefficient submodel).
As mentioned above, there are a large number of predictive variables that can be used in volatility forecasting, and the existence of so many predictive variables results in a huge number of time-varying coefficient submodels, and it makes the Bayesian averaging computation infeasible. According to the suggestion of Raftery et al. [21] and Jazwinsky [41], we introduce a forgetting factor λ to model the system variance matrix W t .
Suppose the posterior of the coefficients $\theta_t \mid \mathcal{F}_t$ is available. $W_t$ is assumed to be proportional to the estimation variance matrix of the coefficients $\theta_t \mid \mathcal{F}_t$; specifically, it is assumed that

$$W_t = \frac{1-\lambda}{\lambda}\,\mathrm{Var}(\theta_t \mid \mathcal{F}_t),$$

where $\lambda$ is the forgetting factor, $0 < \lambda \le 1$, so that the magnitude of the shocks to the coefficients is controlled by $\lambda$. If $\lambda = 1$, the system variance matrix in equation (19) is $W_t = 0$ for $t = 1, 2, \ldots, T$; the shocks then have no influence on the coefficients, $\theta_t = \theta_{t-1}$, and the coefficients are constant over time. Thus, equations (18) and (19) nest the constant-coefficient linear model. If $0 < \lambda < 1$, the coefficients vary over time.
The smaller the forgetting factor $\lambda$, the larger the shocks that hit the coefficients and the faster the coefficients evolve.
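The role of the forgetting factor can be illustrated with a minimal sketch of one filtering step for a regression with random-walk coefficients. It assumes, for simplicity, a known observation variance `V` (the text instead places an inverse-gamma prior on it), and the function name and argument layout are hypothetical. Dividing the posterior covariance by λ is equivalent to adding a state variance proportional to that covariance, which is the mechanism described above.

```python
def tvc_filter_step(m, C, F, y, lam, V):
    """One predict/update step for a regression with random-walk coefficients.

    m, C : posterior mean (length k) and covariance (k x k) of the coefficients
    F    : predictor vector paired with the observed target y
    lam  : forgetting factor in (0, 1]
    V    : observation variance, treated as known (a simplifying assumption)
    Returns (forecast, new_mean, new_covariance).
    """
    k = len(m)
    # Prediction: C / lam equals C + W with W = (1 - lam) / lam * C.
    R = [[C[i][j] / lam for j in range(k)] for i in range(k)]
    forecast = sum(F[i] * m[i] for i in range(k))
    RF = [sum(R[i][j] * F[j] for j in range(k)) for i in range(k)]
    Q = sum(F[i] * RF[i] for i in range(k)) + V  # one-step forecast variance
    gain = [RF[i] / Q for i in range(k)]
    err = y - forecast
    # Update: standard Kalman correction for a scalar observation.
    new_m = [m[i] + gain[i] * err for i in range(k)]
    new_C = [[R[i][j] - gain[i] * RF[j] for j in range(k)] for i in range(k)]
    return forecast, new_m, new_C


# With lam = 1 the step is recursive least squares; a smaller lam reacts more.
_, m_const, _ = tvc_filter_step([0.0], [[1.0]], [1.0], 2.0, lam=1.0, V=1.0)
_, m_fast, _ = tvc_filter_step([0.0], [[1.0]], [1.0], 2.0, lam=0.5, V=1.0)
```

In this toy step the coefficient moves further toward the new observation when λ is smaller, which matches the statement that a smaller forgetting factor speeds up coefficient evolution.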
In empirical studies, due to our ex ante selection of the set of predictive variables, there is uncertainty about whether the variables have significant predictive abilities in volatility forecasting during specific periods. The predetermined variables represent a kind of model uncertainty. We address this issue with the Bayesian model averaging approach, which has been applied to volatility prediction and performs well [16,42]. For example, Lyócsa et al. find that combination forecasts, especially with univariate specifications or Bayesian model averaging, conclusively outperform the benchmark in forecasting the volatility of nonferrous metal futures [16]. We can build $2^k - 1$ submodels with $k$ predetermined predictors, and their posterior probabilities are updated day by day according to the Bayesian approach. The final predictive density of $RV_{t+1}$ is the weighted average of the predictive densities of all time-varying coefficient submodels, and the probability of each submodel conditional on the current information $\Phi_t$ depends on the corresponding predictive likelihood. The details of the Bayesian model averaging approach are given in the Appendix.
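The day-by-day reweighting of submodels can be sketched with Bayes' rule. The function names are hypothetical, and the predictive likelihood values are taken as given inputs here; how each submodel produces its likelihood is outside this sketch.

```python
def update_model_probs(prior_probs, likelihoods):
    """Posterior model probabilities: prior times predictive likelihood, renormalized."""
    weighted = [p * l for p, l in zip(prior_probs, likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


def bma_forecast(probs, forecasts):
    """Probability-weighted average of the submodels' predictive means."""
    return sum(p * f for p, f in zip(probs, forecasts))


# Two hypothetical submodels with equal prior weight; the second one assigned
# a higher predictive likelihood to the latest observation.
posterior = update_model_probs([0.5, 0.5], [0.2, 0.6])
combined = bma_forecast(posterior, [1.0, 2.0])
```

Submodels that predicted the most recent observation well gain weight, so the combined forecast leans toward them on the next day.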
By applying the time-varying coefficient regression method to the previous HAR-type models, we obtain the time-varying coefficient HAR-type models. For example, we extend the basic HAR model in equation (5) to the time-varying coefficient HAR (TVC-HAR) model and the HAR-RV-J model in equation (7) to the TVC-HAR-RV-J model; we extend all the other models similarly.

Data Description.
Many studies focus on developed stock markets, such as those of the US, UK, and Japan. In contrast, we study volatility forecasting for the Chinese stock market, which is a typical example of the emerging markets. Despite its higher-than-average return, empirical studies show that the Chinese stock market is countercyclical and more volatile than the mature markets. On January 1, 2018, MSCI included China A-shares in the Emerging Markets (EM) Index and the ACWI (All Country World Index). The weight of Chinese A-shares in the EM Index was 33.00% as of March 29, 2019. Exploring and forecasting the volatility of the Chinese stock market is very meaningful for risk management, asset pricing, and global asset allocation.
Our data are 5-minute intraday high-frequency data of the SSEC published by Wind Information Co., Ltd., and there are 48 observations on each trading day. Our data cover the period from November 8, 1999, to April 23, 2018, 4465 trading days in total, approximately 18 years. The SSEC index consists of all listed stocks (including A shares and B shares) on the Shanghai Stock Exchange (SSE), and it reflects the overall performance of the Chinese stock market. Figure 1 presents the time series plot of these variables for the SSEC. According to panel A of Figure 1, the RV of the SSEC exhibits clustering, and it soared during the 2008 subprime crisis and the 2015-16 Chinese stock market turbulence, accompanied by fast-rising risk.
The absolute values of these variables become greater during financial crises, and all the variables are correlated but clearly contain different predictive information. It is hard to directly distinguish their predictive abilities. Table 1 presents some statistics for our data. The means of $J_t$, $RS_t^+$, and $RS_t^-$ are all smaller than that of RV. This makes sense, as $J_t$, $RS_t^+$, and $RS_t^-$ are components extracted from RV. The augmented Dickey-Fuller (ADF) tests (without constant and trend) show that all the variables reject the null hypothesis of a unit root at the 1% significance level, implying that all these variables are stationary and will not result in spurious regressions. The Ljung-Box Q statistic for serial correlation up to 22 lags shows that the RV of the SSEC displays a significant level of autocorrelation, implying that the RV of the SSEC exhibits persistence.
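For reference, the Ljung-Box Q statistic mentioned above can be computed directly from sample autocorrelations. This is a plain-Python sketch of the textbook formula $Q = n(n+2)\sum_{k=1}^{h} \hat\rho_k^2/(n-k)$; the function name is hypothetical, and the sketch does not compute the p-value.

```python
def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic for serial correlation up to max_lag.

    x: list of observations; max_lag must be smaller than len(x).
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho * rho / (n - k)
    return n * (n + 2) * q
```

Under the null of no serial correlation, Q is compared with a chi-squared distribution with `max_lag` degrees of freedom; for the daily RV series the text uses 22 lags.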

In-Sample Estimation Results.
As mentioned above, we use the constant and time-varying coefficient models to forecast the RV of the SSEC. We first estimate the coefficients of the models listed above, expressed in equations (5)-(8), (13), (16), and (17), and then evaluate whether the variables are significant in forecasting future volatility. The variables in the models are autocorrelated and heteroscedastic, so the Newey-West covariance correction is employed. Table 2 exhibits the in-sample estimation results for the models.
From Table 2, we find that the adjusted $R^2$ of the basic HAR model is the lowest. For the models with only one additional predictor, the additional predictors are all significant for future volatility except positive semivariances, whereas for the last model, which includes all the predictors, the predictors are significant except jump variation. This illustrates that the information content of jump variation is encompassed by other variables, which is consistent with the previous findings by Andersen et al. [8]. By comparing the adjusted $R^2$ of model (6) and model (13), we find that the explanatory ability of negative semivariances is not superior to that of negative returns for the Chinese stock market. This differs from the American market [30]. There are 2 reasons. First, negative semivariances do not contain overnight information, and the Chinese stock market is significantly influenced by overnight information, especially the information from the American market. Second, the Chinese stock market is dominated by individual investors who rarely obtain high-frequency intraday trading information, and low-frequency daily returns are always used to make investment decisions. The model that contains all of the predictors, expressed by equation (17), has the greatest coefficient of determination and greatly improves the predictive ability of the basic HAR model. The coefficient of determination rises from 0.4915 to 0.5755, which is significantly greater than that of the models with only one additional predictor.

Out-of-Sample Forecast.
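To compare the out-of-sample performances of constant coefficient, MRS, and time-varying coefficient HAR-type models, we consider the loss functions which are robust to market microstructure noise [23]. They are defined as

$$L(\sigma^2, h; b) = \begin{cases} \dfrac{1}{(b+1)(b+2)}\left[(\sigma^2)^{b+2} - h^{b+2}\right] - \dfrac{1}{b+1}\,h^{b+1}\left(\sigma^2 - h\right), & b \notin \{-1, -2\}, \\[2ex] h - \sigma^2 + \sigma^2 \log\dfrac{\sigma^2}{h}, & b = -1, \\[2ex] \dfrac{\sigma^2}{h} - \log\dfrac{\sigma^2}{h} - 1, & b = -2, \end{cases}$$

where $\sigma^2$ denotes an unbiased ex post proxy of real volatility, such as squared returns, the intradaily range, RV, or the realized kernel (RK); we set it as the 5-minute RV; $h$ denotes the forecasted volatility; when $b = -2$, it is the quasi-likelihood (QL) loss function, which is closely related to the Gaussian likelihood; when $b = 0$, it becomes the mean squared error (MSE) loss up to a constant factor.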
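A minimal sketch of this loss family is given below, assuming the usual parameterization of the robust losses of Patton [23] (written out in the comments). The function name is hypothetical, and both arguments must be strictly positive.

```python
import math


def robust_loss(proxy, forecast, b):
    """Robust loss between a volatility proxy and a forecast for shape parameter b.

    Assumed form (the usual parameterization of Patton's family):
      b == -2 : proxy/forecast - log(proxy/forecast) - 1            (QL)
      b == -1 : forecast - proxy + proxy * log(proxy/forecast)
      else    : (proxy**(b+2) - forecast**(b+2)) / ((b+1)*(b+2))
                - forecast**(b+1) * (proxy - forecast) / (b+1)
    """
    if b == -2:
        ratio = proxy / forecast
        return ratio - math.log(ratio) - 1.0
    if b == -1:
        return forecast - proxy + proxy * math.log(proxy / forecast)
    return ((proxy ** (b + 2) - forecast ** (b + 2)) / ((b + 1) * (b + 2))
            - forecast ** (b + 1) * (proxy - forecast) / (b + 1))


def average_loss(proxies, forecasts, b):
    """Mean loss over an out-of-sample period."""
    losses = [robust_loss(s, h, b) for s, h in zip(proxies, forecasts)]
    return sum(losses) / len(losses)
```

With `b = 0` the loss equals half the squared forecast error, so ranking models by it is the same as ranking them by MSE.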
To employ the MRS-HAR-type models, we consider two regimes that are related to the high and low levels of volatility. For brevity, we do not introduce the Markov regime switching algorithm; please refer to Hamilton and Susmel [43], Gray [44], and Ma et al. [45]. A rolling window of 1000 observations is employed for the out-of-sample forecasts from the CC-HAR-type models. We set $\lambda = 0.994$ for the TVC-type models. According to our experiments (not reported), the performances of the TVC-type models change with the value of the forgetting factor $\lambda$, and these models perform relatively better with $\lambda = 0.994$ than with other choices. When $\lambda \in [0.99, 0.996]$, the losses of the TVC-type models are all smaller than those of the corresponding CC-type and MRS-type models, implying that our results are robust to the choice of $\lambda$. Table 3 reports the average losses generated by the constant coefficient, MRS, and time-varying coefficient HAR-type models mentioned before.
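The rolling-window scheme for the constant-coefficient forecasts can be sketched as an index generator. The function name and the half-open slice convention are assumptions of this sketch; only the window length of 1000 comes from the text.

```python
def rolling_windows(n_obs, window=1000):
    """Yield (train_start, train_end, target_index) for 1-step-ahead forecasts.

    Observations train_start, ..., train_end - 1 are used for estimation and
    the observation at target_index is forecast; the window then moves by one.
    """
    for end in range(window, n_obs):
        yield end - window, end, end


# Tiny illustration with 5 observations and a window of 3.
demo = list(rolling_windows(5, window=3))
```

Each triple identifies one re-estimation of the constant-coefficient model, in contrast to the time-varying coefficient filter, which updates its coefficients recursively without a fixed window.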

Economic Interpretation.
It is well documented that the volatility of equity returns exhibits asymmetry and persistence. The time-varying coefficient regression framework makes it convenient to study the time variation of the leverage effect and the strength of persistence. In this section, we first study the time-varying leverage effect and its relationship to the stock market cycle by capturing the time variation of the "leverage parameter." Further, we study the correlation of the predictive abilities of the heterogeneous volatility components, which is related to the persistence of volatility. Finally, we examine the economic value of the volatility forecast by using a volatility timing strategy.

Time-Varying Leverage Effect.
For equity returns, negative shocks have a greater impact on future volatility than positive shocks. This stylized fact was named the "leverage effect" by Black [27]. Many volatility models specify that volatility is affected asymmetrically by positive and negative shocks, e.g., Glosten et al. [46], Engle and Ng [47], Harvey and Shephard [48], Bakshi et al. [49], and Wu and Hou [50]. Bandi and Renò [51] point out that the correlation between shocks to prices and shocks to volatility should not be assumed constant and provide a nonparametric estimator to evaluate the variation of the leverage effect. They find that stronger leverage effects are empirically related to higher volatility regimes. Patton and Sheppard [1] use signed semivariances, new estimators proposed by Barndorff-Nielsen et al. [42] that are calculated from signed high-frequency returns, and find that negative realized semivariance has a more significant impact on future volatility than positive realized semivariance and that extracting the positive and negative realized semivariances from RV significantly improves the performance of the volatility forecast. Negative returns and negative semivariances measure the downside risk of assets; similarly, positive returns and positive semivariances measure the upside risk of assets, but the latter variables are not significant for forecasts of volatility. According to the view of Bandi and Renò [51], leverage is defined as a function of the spot volatility level, and the impact of downside risk on future volatility is time varying. In our method, the coefficients of these predictors are allowed to vary over time, and the posterior point estimator of the coefficient $\theta_t$ is

$$E(\theta_t \mid \mathcal{F}_t) = \sum_{i} p(M_i \mid \mathcal{F}_t)\, E(\theta_{t,i} \mid \mathcal{F}_t),$$

where $E(\theta_{t,i} \mid \mathcal{F}_t)$ is the conditional expectation of the coefficient $\theta_t$ from the submodel $M_i$ and $p(M_i \mid \mathcal{F}_t)$ is the posterior probability of that submodel. Thus, it is very convenient to capture the variation of the "leverage parameter" (i.e., the time-varying coefficient of negative returns, $\beta_{I(r_t<0)|r_t|,t}$ in equation (6)).
Figure 2 plots the time series of the "leverage parameter." Panel A plots $\beta_{I(r_t<0)|r_t|,t}$, and Panel B plots $\beta^{-}_{1,t}$, the coefficient of the negative realized semivariance $RV^{-}_t$ in equation (13). To aid interpretation, the historical prices of the SSEC are also shown.
From Figure 2, we observe that, in most cases, the coefficients of negative returns and negative realized semivariances, $\beta_{I(r_t<0)|r_t|,t}$ and $\beta^{-}_{1,t}$, are greater than zero, indicating that downside risk, as measured by negative returns and negative realized semivariances, increases the future volatility of the SSEC. However, there are some exceptions. In Panel A, $\beta_{I(r_t<0)|r_t|,t}$ is less than 0 in two periods, from 2006-11-04 to 2007-01-25 and from 2014-12-02 to 2015-01-22; in Panel B, $\beta^{-}_{1,t}$ is less than 0 from 2014-12-05 to 2015-01-27. In these periods the leverage effect is reversed, and downside risk reduces the future volatility of the SSEC. The information contained in negative returns differs from that in negative semivariances. Although negative returns do not contain the information in high-frequency intraday returns, they do contain the information in overnight returns. The mean of the semivariances is 59.74% of the mean of the squared negative returns. Overall, negative returns contain more information than negative semivariances, which is consistent with the earlier result (Table 2) that the $R^2$ of equation (6) is greater than that of equation (13). It is therefore unsurprising that the periods in which the coefficients of negative semivariances and negative returns are less than 0 are related but not identical.
We also compare the results from the time-varying coefficient models with the OLS results. For the whole sample, Table 2 shows that the OLS estimate of the coefficient $\beta_{I(r_t<0)|r_t|,t}$ in equation (6) is 6.336e-03 (Newey-West adjusted std. dev.: 1.272e-03, p value: 1.076e-06), indicating that the SSEC exhibits a significant ordinary leverage effect overall. During the 1st and 2nd special periods, the OLS estimates of $\beta_{I(r_t<0)|r_t|,t}$ in equation (6) are -4.888e-03 (Newey-West adjusted std. dev.: 1.677e-03, p value: 0.00526) and 0.3400e-03 (Newey-West adjusted std. dev.: 0.7608e-03, p value: 0.6581), respectively. They are both less than zero; for the 1st period, $\beta_{I(r_t<0)|r_t|,t}$ is significantly different from zero, but for the 2nd period, it is not. These results differ markedly from those for the whole sample and challenge the classic stylized fact of volatility. Similarly, for the whole sample, Table 2 shows that the OLS estimate of the coefficient $\beta^{-}_{1}$ in equation (13) is 7.936e-01 (Newey-West adjusted std. dev.: 1.293e-01, p value: 9.335e-10). During 2014-12-02 to 2015-01-22, the OLS estimate of $\beta^{-}_{1}$ is -1.388e-01 (Newey-West adjusted std. dev.: 7.875e-02, p value: 9.184e-02), indicating that negative realized semivariances reduce the future volatility of the SSEC; the estimate is different from zero at the 10% significance level during that period. If we manually and appropriately divide the SSEC time series into several periods, we can also identify different types of leverage effect, but any such division is inevitably subjective. Our model provides a parsimonious method, with just one forgetting factor, to capture different types of leverage effect. The common feature of the three periods in which the coefficients of negative semivariances or negative returns are less than zero is that the SSEC is rising rapidly. Our view is similar to that of Zhang et al. [52], who attribute the inverse leverage effect in the crude oil market to its scarcity, its nonrenewable nature, and the very different behavior of suppliers and demanders. In a bull market, although the number of stocks does not change, investors' demand for stocks rises rapidly, which resembles the scarcity and nonrenewable nature of crude oil.
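To illustrate how a single forgetting factor lets a coefficient change sign without any manual splitting of the sample, the following is a minimal sketch. It is not the Bayesian model averaging procedure used in this paper: it assumes one regressor, a unit observation variance, and a discounted recursive least squares update. The names `discounted_rls` and `negative_spells` and the default values are hypothetical.

```python
from typing import List, Sequence, Tuple


def discounted_rls(x: Sequence[float], y: Sequence[float],
                   forgetting: float = 0.95,
                   prior_var: float = 1e3) -> List[float]:
    """Track a scalar slope beta_t in y_t = beta_t * x_t + noise.

    Assumption: unit observation variance and a single regressor.
    Dividing the coefficient variance by `forgetting` (0 < forgetting <= 1)
    before each update down-weights older observations.
    """
    beta, p = 0.0, prior_var
    path = []
    for xt, yt in zip(x, y):
        p = p / forgetting                      # inflate prior variance
        gain = p * xt / (1.0 + xt * p * xt)     # adaptive gain
        beta = beta + gain * (yt - beta * xt)   # correct by the forecast error
        p = p - gain * xt * p                   # posterior variance
        path.append(beta)
    return path


def negative_spells(path: Sequence[float]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of runs where the coefficient is < 0."""
    spells, start = [], None
    for t, b in enumerate(path):
        if b < 0 and start is None:
            start = t
        elif b >= 0 and start is not None:
            spells.append((start, t - 1))
            start = None
    if start is not None:
        spells.append((start, len(path) - 1))
    return spells
```

Applied to a coefficient path such as the one in Figure 2, `negative_spells` would return the index ranges in which the leverage parameter is below zero; a smaller forgetting factor makes the path react faster but also makes it noisier.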
Little evidence of an inverse leverage effect is found in mature stock markets. The main characteristics of emerging markets, namely countercyclicality, high volatility, less mature capital markets, immature individual investors, underdeveloped institutional and individual investors, greater susceptibility to volatile currency swings, and higher-than-average returns, are all related to the appearance of an inverse or absent leverage effect. We find that the inverse or insignificant leverage effect is accompanied by rapid year-over-year growth in the number of new stock accounts and by a decrease in the average free turnover ratio on the days after negative returns.

Correlation of the Heterogeneous Coefficients.
In the basic HAR model, the heterogeneous volatility components, $RV_t$, $RV_{t:t-4}$, and $RV_{t:t-21}$, correspond to market expectations of the next period's volatility formed from observations of past realized volatilities, and their marginal contributions to future volatility are measured by their coefficients. Frijns et al. [53] point out that the structure of volatility may change because of the time-varying behavior of heterogeneous investors. Since the coefficients of the heterogeneous predictors are time varying, studying the correlations among these coefficients reveals how the predictive abilities of the variables influence one another. In multivariate correlation analysis, because of the interaction between the variables, the simple correlation coefficient cannot reflect the true correlation between two predictors. Therefore, we use the partial correlation coefficient to measure the correlation between the coefficients of the predictors. Let $r_{1,5}$ denote the partial correlation coefficient of the coefficients of $RV_t$ and $RV_{t:t-4}$ adjusted for the other predictors, and let $r_{1,22}$ and $r_{5,22}$ denote the corresponding partial correlation coefficients. For the basic time-varying coefficient HAR model (equation (5)), $r_{1,5} = -0.567$, $r_{1,22} = -0.341$, and $r_{5,22} = -0.472$, so these coefficients are significantly and negatively correlated. We also check the other time-varying coefficient models, and the results are similar, but there appear to be no common features in the correlations of the other variables. The coefficients of the daily, weekly, and monthly volatilities, whose means are equal, all show significant negative correlations, so their predictive abilities for future volatility are also negatively correlated. According to Brownlees et al. [54], during calm periods, volatility is low and exhibits strong persistence, investors prefer medium- or long-term trades, medium- and long-term traders dominate market volatility, and medium- and long-term volatility has a stronger predictive ability for future volatility; during periods of turmoil, which are accompanied by high levels of volatility, the persistence of volatility is significantly weakened, investors are inclined to trade more frequently, short-term traders play a leading role in the market, and short-term volatility has a stronger predictive ability for future volatility. Therefore, in different periods, volatility at different horizons has time-varying predictive abilities for future volatility. The negative correlation means that when the predictive ability of one component falls, that of the others rises. This also confirms that the fixed-coefficient assumption is unreasonable.
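The partial correlations reported above can be computed from the three coefficient series with the first-order partial correlation formula. The sketch below assumes exactly one control series, as in the basic HAR model with three predictors; the function names are hypothetical, and the inputs would be the estimated time series of the daily, weekly, and monthly coefficients.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Simple (Pearson) correlation of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a: Sequence[float], b: Sequence[float],
                 control: Sequence[float]) -> float:
    """Correlation of a and b after removing the linear effect of `control`.

    First-order formula:
    r_ab.c = (r_ab - r_ac * r_bc) / sqrt((1 - r_ac^2) * (1 - r_bc^2)).
    """
    r_ab = pearson(a, b)
    r_ac = pearson(a, control)
    r_bc = pearson(b, control)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
```

For example, `partial_corr(beta_daily, beta_weekly, beta_monthly)` would correspond to $r_{1,5}$, where the three arguments are hypothetical names for the coefficient paths. With more than one control variable, the same quantity is usually obtained by correlating regression residuals instead.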

Portfolio Exercises.
In evaluation exercises, it is usual to use the statistical loss functions mentioned above; in practice, however, to check the economic value of a volatility forecast for investors, we should use loss functions that are related to investors' utility functions. Following Ferreira [24] and Santa-Clara and Neely et al. [55], among others, we consider a mean-variance investor who decides at the end of period $t$ how to allocate his or her assets between the stock index and the risk-free asset in period $t+1$. The portfolio return is given by
$$R_{p,t+1} = w_{t+1} r_{t+1} + R_{f,t+1}, \tag{22}$$
where $w_{t+1}$ is the weight of the stock index in the portfolio, so $1 - w_{t+1}$ is the weight of risk-free bills; $r_{t+1}$ is the return of the stock index in excess of the risk-free rate; and $R_{f,t+1}$ is the risk-free rate. We use the certainty equivalent return (CER), which incorporates individual investor risk preferences into financial decisions, as the investor's utility function to evaluate the performance of this portfolio. CER can be interpreted as the lowest risk-free rate that an investor would accept rather than holding a given portfolio. The mean-variance investor's expected utility function, CER, is expressed as
$$U(R_{p+1}) = E(R_{p+1}) - \frac{1}{2}\, c\, \mathrm{Var}(R_{p+1}), \tag{23}$$
where $E(R_{p+1})$ and $\mathrm{Var}(R_{p+1})$, respectively, denote the expected return and the variance of the portfolio and $c$ is the relative risk aversion coefficient. At the end of day $t$, by maximizing the utility function $U(R_{p+1})$, the investor optimally chooses the weight of the stock index for day $t+1$:
$$w_{t+1} = \frac{1}{c}\, \frac{\hat{r}_{t+1}}{\hat{\sigma}^2_{t+1}}, \tag{24}$$
where $\hat{r}_{t+1}$ and $\hat{\sigma}^2_{t+1}$, respectively, denote the mean and volatility forecasts of the index excess returns. Following Santa-Clara and Neely et al. [55], the historical average of excess returns $\bar{r}_{t+1}$ is a benchmark that is hard to beat, and we use it as the mean forecast, $\hat{r}_{t+1} = \bar{r}_{t+1}$. The weight $w_t$ is constrained to lie between 0 and 1.5 (inclusive). This financial constraint precludes short sales and leverage of more than 50%.
From equation (24), the optimal weight of the stock index $w_t$ is inversely proportional to the risk aversion coefficient $c$: the greater the risk aversion coefficient, the lower the weight that the investor allocates to the stock index in his or her portfolio.
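The allocation rule and the CER evaluation described above can be sketched in a few lines. The sketch assumes the standard mean-variance forms, weight = mean forecast / (c × variance forecast) and CER = mean − (c/2) × variance, with the weight clipped to [0, 1.5] as stated in the text. Function and parameter names are hypothetical, the sample variance is used for Var, and no annualization is applied.

```python
from statistics import mean, variance
from typing import List, Sequence


def optimal_weight(mean_forecast: float, var_forecast: float,
                   risk_aversion: float,
                   lower: float = 0.0, upper: float = 1.5) -> float:
    """Mean-variance weight of the stock index, clipped to [lower, upper]."""
    raw = mean_forecast / (risk_aversion * var_forecast)
    return min(max(raw, lower), upper)


def portfolio_returns(weights: Sequence[float],
                      excess_returns: Sequence[float],
                      risk_free: Sequence[float]) -> List[float]:
    """R_p = w * r + R_f for each period, as in equation (22)."""
    return [w * r + rf for w, r, rf in zip(weights, excess_returns, risk_free)]


def certainty_equivalent(returns: Sequence[float],
                         risk_aversion: float) -> float:
    """CER = mean(R_p) - 0.5 * c * Var(R_p), using the sample variance."""
    return mean(returns) - 0.5 * risk_aversion * variance(returns)
```

Because the mean forecast is the same historical average for every model, two volatility models produce different weights, and hence different CER values, only through their variance forecasts.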
From equations (23) and (24), we know that CER is determined by three parameters: the risk-free rate $R_{f,t+1}$, the risk aversion coefficient $c$, and the volatility forecast $\hat{\sigma}^2_{t+1}$. We use the 3-month Shanghai Interbank Offered Rate (SHIBOR) as the risk-free rate $R_{f,t+1}$, and our data run from October 9, 2006, when the Chinese central bank introduced SHIBOR, to April 23, 2018. For robustness, we set $c$ to 3 and 6. $R_{f,t+1}$ and $c$ are both known and volatility is the only input variable for this portfolio, so the performance of the portfolio is determined by the effectiveness of the volatility forecast. Table 4 shows the annualized portfolio performance measures from 2006-10-09 to 2018-04-23, including the mean of excess returns (R) and CER. For a fixed risk aversion coefficient, the average annualized excess returns and CER of the portfolios formed from the volatility forecasts of constant coefficient models are all smaller than those of the corresponding time-varying coefficient models, indicating that the time-varying coefficient strategy produces better-performing portfolios. For equation (17), the mean of excess returns (R) under the time-varying and constant coefficient strategies is 26.079% and 25.420%, respectively, during the same period, whereas the average return of the SSEC index is just 2.015%, implying that the investor obtains significant excess returns by taking the realized volatility of the SSEC index into account. We find that these strategies successfully reduce the weight of the stock index during the stock market crashes in 2009 and 2015-16 and increase the weight during calm periods. Among the models that include only one additional variable, the best is equation (6), indicating that the lagged negative return is the most important additional variable for asset allocation: it significantly affects not only future volatility but also the excess returns of the portfolio.
Equation (17), which contains the most additional variables, performs best, indicating that the corresponding additional variables do improve portfolio performance. For the same strategy, the lower the risk aversion coefficient, the better the portfolio performs; because the returns of the Chinese stock market are very high and highly related to the volatility level, investors with lower risk aversion obtain more excess return. The MRS-type models do not always outperform the CC-type models; e.g., the MRS-HAR-RV-V model generates less excess return and CER than the corresponding CC model. The excess returns and CER generated by the LHAR-RV, AHAR-RV, and HAR-RV-ALL models and their extensions are much greater than those of the other models. Our interpretation is that including variables that measure the downside risk of the index significantly improves volatility forecasting, which helps the portfolio avoid declines of the index and earn higher excess returns.

Conclusions
Structural breaks, noisy proxies, and model specification errors make the instability of coefficients an important challenge in volatility forecasting. We extend the HAR-type model by explicitly considering the time variation of coefficients and apply these models to forecasting the realized volatility of the SSEC. The empirical results demonstrate that, statistically, the time-varying coefficient models generate more accurate forecasts than the MRS and constant coefficient models. In portfolio exercises, we also find that our models help generate more excess returns and improve the utility of an investor with a given level of risk aversion.
In our time-varying coefficient regression framework, we find the predictive abilities of the three heterogeneous volatility components to be negatively correlated and quite different during calm and turbulent periods. We evaluate the variation of the leverage effect by capturing the time-varying "leverage parameter"; downside risk increases future volatility of the SSEC index on the whole, but the leverage effect is insignificant or inverse during bull markets. Our findings indicate the importance of considering time variation of coefficients, suggesting that practitioners and market regulators should treat volatility of the stock market with a dynamic perspective. For further enhancements, future research may explore the potential of time-varying coefficient models for forecasting the realized volatility of energy, precious metal, and foreign exchange markets.
where $R_t$ is the variance of the prior distribution of $\theta_t \mid F_t$.
Once we observe $RV_{t+1}$, we can calculate the prediction error as $e_{t+1} = RV_{t+1} - \widehat{RV}_{t+1}$. The posteriors of $V$ and $\theta_t$ follow $IG(n_{t+1}/2,\ n_{t+1}S_{t+1}/2)$ and $T_{n_t}(m_{t+1}, C_{t+1})$, respectively, where $n_{t+1} = n_t + 1$ and $A_{t+1}$ is the adaptive vector, $A_{t+1} = R_t F_t / Q_{t+1}$ (see West and Harrison [40]). After we obtain the volatility forecast from each submodel, we combine these forecasts through the Bayesian model averaging approach. Suppose $M_i$ denotes a submodel that contains a certain choice of predictors from a set of $n = 2^k - 1$ candidates, and let $\widehat{RV}_{t+1,i}$ denote the point estimator of $RV_{t+1}$ from $M_i$. The prior weight of each submodel is set as $p(M_i \mid F_0) = 1/(2^k - 1)$, $i = 1, 2, \ldots, 2^k - 1$. As a new observation of RV arrives, the probabilities of the submodels are updated using the Bayesian recursion
$$p(M_i \mid F_t) = \frac{p(M_i \mid F_{t-1})\, T_{n_{t-1}}\big(RV_t;\ \widehat{RV}_{t,i},\ Q_{t,i}\big)}{\sum_{j} p(M_j \mid F_{t-1})\, T_{n_{t-1}}\big(RV_t;\ \widehat{RV}_{t,j},\ Q_{t,j}\big)},$$
where $\widehat{RV}_{t,i}$ and $Q_{t,i}$ are the point estimators of the mean and variance of the predictive density of $RV_t$ from submodel $M_i$, and $T_{n_{t-1}}$ is the density of Student's t distribution with $n_{t-1}$ degrees of freedom.
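The model-probability recursion described above is Bayes' rule with a Student's t predictive likelihood, and it can be sketched as follows. This is an illustration under stated assumptions rather than the exact implementation used here: `scale2` is treated as the squared scale of a location-scale t density, all submodels share the same degrees of freedom, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def student_t_pdf(y: float, loc: float, scale2: float, dof: float) -> float:
    """Density at y of a location-scale Student's t distribution.

    Assumption: `scale2` is the squared scale parameter.
    """
    z2 = (y - loc) ** 2 / scale2
    log_pdf = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
               - 0.5 * math.log(dof * math.pi * scale2)
               - 0.5 * (dof + 1) * math.log1p(z2 / dof))
    return math.exp(log_pdf)


def update_model_probs(prior_probs: Sequence[float], rv_obs: float,
                       means: Sequence[float], scales2: Sequence[float],
                       dof: float) -> List[float]:
    """One step of the recursion: posterior is prior times likelihood, renormalized."""
    likelihoods = [student_t_pdf(rv_obs, m, q, dof)
                   for m, q in zip(means, scales2)]
    joint = [p * l for p, l in zip(prior_probs, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]
```

Starting from equal prior weights, calling `update_model_probs` once per new observation shifts probability toward the submodels whose predictive densities placed more mass on the realized value.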
$\widehat{RV}_{t+1}$, the predictive mean of $RV_{t+1}$ conditional on $\Phi_t$, is a weighted average of the predictive means of the submodels:
$$\widehat{RV}_{t+1} = \sum_{i=1}^{2^k-1} p(M_i \mid F_t)\, \widehat{RV}_{t+1,i}. \tag{A.12}$$

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.