Measuring and Forecasting Volatility in Chinese Stock Market Using HAR-CJM Model



Introduction
Persistent volatility in financial markets is one of the most ubiquitous forms in which economic phenomena may be observed. It is therefore no surprise that a principal aim of scholars working on financial practice, from financial risk measurement to asset pricing and derivatives pricing, is the search for mechanisms to measure and forecast volatility.
To measure and forecast volatility, Engle [1], Bollerslev [2], and Taylor [3] proposed the ARCH, GARCH, and SV models, respectively. These models have since been extended continuously into the GARCH-type and SV-type families. Although the GARCH-type and SV-type models have made progress in measuring and forecasting the volatility of financial markets, they are built on low-frequency time series and therefore cannot adequately capture intraday volatility information. With the development of computer technology in recent years, the cost of recording and storing high-frequency financial data has fallen sharply, and such data have increasingly become an important means of studying the volatility of financial markets. Andersen and Bollerslev [4] were the first to use high-frequency data to propose a new volatility measure, the realized volatility (RV). Compared with the GARCH and SV models, RV is model-free, convenient to calculate, and more accurate in measuring the volatility of financial markets. Its appearance has therefore greatly promoted the development of volatility models, and it can be widely applied in financial theory and investment practice.
Since Andersen and Bollerslev [4] proposed RV, volatility models built on high-frequency data have developed rapidly and have been very successful in measuring and forecasting volatility in financial markets. Andersen et al. [5] gave a theoretical justification for RV and, studying American foreign exchange and stock markets, found that RV exhibits a pronounced long memory. Koopman et al. [6] added RV to the SV and ARFIMA models to set up the SV-RV and ARFIMA-RV models, respectively, and found that the new models forecast volatility clearly better than the original ones. Wei and Yu [7] and Wei [8] assessed the forecasting accuracy of many volatility models on the Shanghai Composite Index and the Hushen 300 Index in China and found that the ARFIMA-lnRV and SV-RV models forecast clearly better than models such as GARCH, a conclusion similar to that of Koopman et al. [6].
Furthermore, Corsi [9] proposed the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV), based on the Heterogeneous Market Hypothesis of Müller et al. [10] and the long memory of RV. His results showed that the HAR-RV model forecasts future volatility clearly better than models such as GARCH and ARFIMA-RV. In China, Zhang et al. [11] also found that the HAR-RV model shows much better out-of-sample forecasting performance than the ARFIMA model. Building on the HAR-RV model, Andersen et al. [12] and Wang et al. [13] decomposed RV into the continuous sample path variation and the discontinuous jump variation and set up the Heterogeneous Autoregressive model with Continuous volatility and Jumps (HAR-CJ), which greatly improved the accuracy of volatility forecasts. Andersen et al. [14] found that the overnight return variance plays an important role in daily asset volatility, so they added it to the HAR-CJ model and set up the HAR-CJN model. Comparing forecasting performance, they found that the HAR-CJN model outperforms the GARCH and HAR-RV models in forecasting volatility 1 day, 1 week, and 1 month ahead.
From the above studies, we can see that RV-type models (especially the HAR-RV and HAR-CJ models) consistently forecast future volatility better than the GARCH and SV models, and that the HAR-CJ model performs best among them. Although the HAR-CJ model forecasts well, higher accuracy is more useful for practical financial problems such as financial risk measurement, asset pricing, and derivatives pricing, so it is worthwhile to improve the model further. To improve forecasting accuracy, scholars have typically added variables to existing models on the basis of financial theory and market mechanisms. Examples include the SV-RV model built on the SV model by Koopman et al. [6] and Wei [8], the HAR-RV-J model built on the HAR-RV model by Zhang et al. [11], and the HAR-L-M model built on the HAR-RV model by Zhang and Tian [15], all of which forecast more accurately than their base models. On this basis, we attempt to add investors' irrational factors to the HAR-CJ model to improve its forecasts of volatility in the Chinese stock market. Many studies show that investors' irrational behavior strongly influences the volatility of financial markets. Jegadeesh and Titman [16] put forward the momentum effect, pointing out that stock returns tend to continue in their previous direction. Grinblatt and Han [17] and Frazzini [18] also showed that the momentum effect produces a positive correlation between previous and current gains and between previous and current losses of a financial asset. It follows that the momentum effect reinforces the rise and fall of the market and thereby increases market volatility. Thus, from the perspective of Behavioral Finance Theory, we add the momentum effect factor (the capital gain overhang) to the HAR-CJ model, take the overnight return variance into account at the same time by converting RV into adjusted realized volatility (ARV), and set up the HAR-CJ-M model. We then use the HAR-CJ-M, HAR-ARV, and HAR-CJ models to study volatility in the Chinese stock market. On the one hand, we test the influence of the momentum effect on Chinese stock market volatility; on the other hand, comparing the new model with the HAR-ARV and HAR-CJ models in terms of forecasting performance helps us find better models for measuring and forecasting volatility in the Chinese stock market.
The remainder of this paper is organized as follows. Section 2 introduces the theory behind the HAR-CJ-M model. Section 3 establishes the HAR-ARV, HAR-CJ, and HAR-CJ-M models. Section 4 compares the models' performance in measuring and forecasting volatility in the Chinese stock market. Section 5 concludes.

Preliminaries and Theories
2.1. Adjusted Realized Volatility. Following the calculation method of RV by Andersen and Bollerslev [4], we consider a trading day $t$, divide the trading day into $N$ intervals, and let $P_{t,i}$ be the $i$th ($i = 1, \ldots, N$) closing price of trading day $t$. Let $r_{t,i}$ be the $i$th return on trading day $t$, namely, $r_{t,i} = 100(\ln P_{t,i} - \ln P_{t,i-1})$. The RV on trading day $t$ ($\mathrm{RV}_t$) can then be written as

$$\mathrm{RV}_t = \sum_{i=1}^{N} r_{t,i}^{2}. \tag{1}$$

Hansen and Lunde [19] pointed out that Andersen and Bollerslev [4] studied RV on the foreign exchange market. Unlike the foreign exchange market, however, the stock market does not trade continuously for 24 hours, so RV calculated with expression (1) reflects market volatility only during trading periods and not during periods without trading (namely, the volatility caused by overnight information, that is, the overnight return variance from the previous day's close to the current day's open). In addition, Hansen and Lunde found that only when the overnight return variance and RV are combined does the estimator come closer to a consistent estimate of integrated volatility. Andersen et al. [14] also showed that the overnight return variance $r_{t,n}^{2}$ in the SP and US markets made up 16.0% and 16.5% of the total return volatility, respectively; namely, $r_{t,n}^{2}/(\mathrm{RV}_t + r_{t,n}^{2})$ equaled 0.160 and 0.165, respectively. Consequently, the overnight return variance plays an important part in the total daily return volatility, yet most of the literature on RV (such as Wang et al. [13] and Corsi [9]) has not taken it into consideration. Following Martens [20] and Koopman et al. [6], we take the overnight return variance into account and adjust RV as

$$\mathrm{ARV}_t = r_{t,n}^{2} + \mathrm{RV}_t,$$

where $r_{t,1}$ and $r_{t,n}$ stand for the overnight return,
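The calculation of RV and ARV described above can be illustrated with a minimal sketch. It assumes that ARV is the sum of the squared overnight return and the intraday RV; the function names, the convention that the first intraday price is the opening price, and the sample prices are hypothetical and serve illustration only.

```python
import math

def intraday_returns(prices):
    """Percentage log returns 100*(ln P_i - ln P_{i-1}) between consecutive prices."""
    return [100.0 * (math.log(b) - math.log(a)) for a, b in zip(prices, prices[1:])]

def realized_volatility(returns):
    """RV as the sum of squared intraday returns, as in expression (1)."""
    return sum(r * r for r in returns)

def adjusted_realized_volatility(prev_close, intraday_prices):
    """Return (ARV, RV, overnight return) for one trading day.

    Assumption: intraday_prices[0] is the opening price, so the overnight
    return runs from the previous close to the current open.
    """
    overnight = 100.0 * (math.log(intraday_prices[0]) - math.log(prev_close))
    rv = realized_volatility(intraday_returns(intraday_prices))
    return overnight ** 2 + rv, rv, overnight

# Hypothetical prices: previous close 100, then four intraday prices.
arv, rv, r_n = adjusted_realized_volatility(100.0, [101.0, 101.5, 100.8, 101.2])
share = r_n ** 2 / arv  # share of the overnight variance in total daily volatility
```

The quantity `share` corresponds to the ratio of the overnight return variance to the total return volatility discussed in the text.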

Decomposition of ARV.
In real financial markets, the price path of a financial asset is not continuous but contains jumps, caused by information shocks and by investors' irrational behavior. To separate out the discontinuous jump variation, Barndorff-Nielsen and Shephard [21,22] proposed the realized bipower variation (RBV), that is,

$$\mathrm{RBV}_t = \mu_1^{-2}\,\frac{N}{N-2}\sum_{j=3}^{N} |r_{t,j-2}|\,|r_{t,j}|,$$

where $\mu_1 = E(|Z_t|) = \sqrt{2/\pi}$, $Z_t$ is a standard normal random variable, and $N/(N-2)$ is a finite-sample correction. According to Barndorff-Nielsen and Shephard, the difference between $\mathrm{ARV}_t$ and $\mathrm{RBV}_t$ is a consistent estimate of the discontinuous jump variation as $N \to \infty$, that is,

$$\mathrm{ARV}_t - \mathrm{RBV}_t \xrightarrow{\,N \to \infty\,} \sum_{t-1 < s \le t} \kappa_s^{2},$$

where $\kappa_s$ denotes the size of a jump at time $s$. In finite samples, the jump variation calculated with the above expression is not guaranteed to be nonnegative. Hence, to guarantee nonnegativity, we define the discontinuous jump variation $J_t$ as

$$J_t = \max(\mathrm{ARV}_t - \mathrm{RBV}_t,\ 0).$$

When calculating the discontinuous jump variation, different intraday sampling frequencies may lead to different calculation errors. To improve accuracy, it is necessary to introduce a statistic that tests the significance of the discontinuous jump variation.
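The bipower variation and the truncated jump variation above can be sketched as follows. This is only an illustration under the staggered form with the $N/(N-2)$ correction described in the text; the function names are hypothetical, and the constant uses the standard value $E|Z| = \sqrt{2/\pi}$ for a standard normal variable.

```python
import math

MU1 = math.sqrt(2.0 / math.pi)  # E|Z| for a standard normal Z

def realized_bipower_variation(returns):
    """Staggered bipower variation: products |r_{j-2}| * |r_j|, scaled by mu_1^{-2} * N/(N-2)."""
    n = len(returns)
    if n < 3:
        raise ValueError("need at least three returns")
    total = sum(abs(returns[j - 2]) * abs(returns[j]) for j in range(2, n))
    return MU1 ** -2 * n / (n - 2) * total

def jump_variation(arv, rbv):
    """Jump variation truncated at zero so that it is never negative."""
    return max(arv - rbv, 0.0)

# Hypothetical return series with one large move.
returns = [0.10, -0.20, 0.15, 3.00, -0.10, 0.05]
arv = sum(r * r for r in returns)
rbv = realized_bipower_variation(returns)
jump = jump_variation(arv, rbv)
```

Because a single large return enters the sum of squares fully but is multiplied by small neighbours in the bipower sum, `arv - rbv` isolates most of its contribution.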
We adopt the statistic $Z_t$, derived by Barndorff-Nielsen and Shephard [21,22] on the basis of bipower variation theory, to identify the discontinuous jump variation. The statistic $Z_t$ is defined by

$$Z_t = \frac{(\mathrm{ARV}_t - \mathrm{RBV}_t)\,\mathrm{ARV}_t^{-1}}{\sqrt{\left(\mu_1^{-4} + 2\mu_1^{-2} - 5\right)\dfrac{1}{N}\max\!\left(1,\ \mathrm{RTQ}_t/\mathrm{RBV}_t^{2}\right)}} \to N(0, 1), \tag{6}$$

where $\mathrm{RTQ}_t$ denotes the realized tripower quarticity. The traditional RBV depends strongly on the sampling frequency: as the sampling frequency increases, the RBV estimate fails to converge to the integrated volatility because of factors such as market microstructure. Adopting RBV as the robust estimator in the jump test therefore introduces errors of its own. We thus adopt the estimator $\mathrm{MedRV}_t$ proposed by Andersen et al. By calculating the statistic $Z_t$ after replacing $\mathrm{RBV}_t$ with $\mathrm{MedRV}_t$ and $\mathrm{RTQ}_t$ with $\mathrm{MedRTQ}_t$ in expression (6), at significance level $1 - \alpha$ we obtain the estimate of the discontinuous jump variation as

$$J_t = I(Z_t > \Phi_\alpha)\,(\mathrm{ARV}_t - \mathrm{MedRV}_t).$$

The estimator of the continuous sample path variation is

$$C_t = I(Z_t \le \Phi_\alpha)\,\mathrm{ARV}_t + I(Z_t > \Phi_\alpha)\,\mathrm{MedRV}_t,$$

where $I(\cdot)$ is the indicator function and $\Phi_\alpha$ is the $\alpha$-quantile of the standard normal distribution. An appropriate confidence level $\alpha$ must be chosen in the calculation; in this paper, we choose $\alpha = 0.99$ in line with previous studies. With the above test based on the statistic $Z_t$ and bipower variation theory, we obtain estimators of both the continuous sample path variation $C_t$ and the discontinuous jump variation $J_t$ of the return volatility in financial markets. On this basis, we can establish models that use both $C_t$ and $J_t$ to forecast future volatility in financial markets.

Momentum Effect.
Jegadeesh and Titman [16] first proposed the momentum effect, and many scholars have since studied it from different perspectives, among which the research of Grinblatt and Han [17] is representative. In studying the momentum effect, Grinblatt and Han proposed the capital gain overhang, which can be used to study the influence of gains or losses in previous phases on the return and volatility in the current phase or in the future. Grinblatt and Han defined the capital gain overhang $M_t$ as $M_t = (P_{t-1} - \mathrm{RP}_t)/P_{t-1}$, where $P_{t-1}$ is the closing price in phase $t-1$ and $\mathrm{RP}_t$ is the investor's reference price in phase $t$. However, most of the later literature (such as Frazzini [18]) defined $M_t$ as $M_t = (P_t - \mathrm{RP}_t)/P_t$; this paper therefore also defines $M_t$ as $M_t = (P_t - \mathrm{RP}_t)/P_t$.
The choice of the reference price $\mathrm{RP}_t$ is crucial when using the capital gain overhang to study the momentum effect. When Grinblatt and Han [17] proposed the capital gain overhang, they used the weighted average of the stock price over the past 260 weeks as the reference price. In this paper, we consider the influence of three kinds of investors (short-term, medium-term, and long-term) on the volatility of the Chinese stock market, and each kind of investor chooses a different reference price. The weighted average of the stock price over the past 260 weeks therefore does not fit our study. In the stock market, different investors buy and sell stocks in every phase, and a great deal of information arrives in every phase and affects investors' behavior and decisions, so the reference price for each kind of investor should change in every phase; that is, it should be a dynamic price. Besides, the choice of reference price should reflect not only theoretical soundness but also what investors actually do in practice. We therefore propose a series of new reference prices based on the 5-day, 5-week (25 days), and 5-month (110 days) moving averages, that is,

$$\mathrm{RP}_t = \frac{1}{n}\sum_{i=0}^{n-1} P_{t-i}.$$

When $n = 5$, the expression is a 5-day moving average, which gives the reference price for short-term investors. When $n = 25$, it is a 5-week (25 days) moving average, representing the reference price for medium-term investors; when $n = 110$, it is a 5-month (110 days) moving average, which gives the reference price for long-term investors. The moving average is an important trend indicator in technical analysis of securities.
In stock investing, investors analyze these trend curves to decide whether to buy or sell. In trend analysis, they usually focus on the reference prices given by moving averages, among which the 5-day, 5-week (25 days), and 5-month (110 days) moving averages receive the most attention. These three reference prices are closely related to investment decisions and are updated in every phase, so using them as reference prices for the short-term, medium-term, and long-term investors in the whole stock market is reasonable.
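The moving-average reference price and the resulting capital gain overhang can be sketched as below. The sketch assumes that the moving average over $n$ days includes the current closing price and uses the definition $(P_t - \mathrm{RP}_t)/P_t$ adopted in this paper; the function names and the sample prices are hypothetical.

```python
def moving_average_reference_price(closes, t, n):
    """Reference price: mean of the n closing prices ending at index t (inclusive)."""
    if t + 1 < n:
        raise ValueError("not enough history for an n-day moving average")
    window = closes[t - n + 1 : t + 1]
    return sum(window) / n

def capital_gain_overhang(closes, t, n):
    """Capital gain overhang (P_t - RP_t) / P_t with an n-day moving-average reference price."""
    rp = moving_average_reference_price(closes, t, n)
    return (closes[t] - rp) / closes[t]

def split_sign(m):
    """Split an overhang into its nonnegative part and its negative part."""
    return max(m, 0.0), min(m, 0.0)

# Hypothetical closing prices; n = 5 is the short-term (5-day) case.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
m_daily = capital_gain_overhang(closes, t=4, n=5)
m_pos, m_neg = split_sign(m_daily)
```

With real data, the same function would be called with `n = 25` and `n = 110` for the medium-term and long-term reference prices.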

The HAR-ARV Model.
According to the Heterogeneous Market Hypothesis proposed by Müller et al. [10], Corsi [9] pointed out that different participants are likely to settle for different prices and to execute their transactions in different market situations, and hence they create volatility. He divided market volatility into short-term, medium-term, and long-term components: short-term volatility is produced by short-term investors who trade daily or more frequently; medium-term volatility is produced by medium-term investors who trade weekly; long-term volatility is produced by long-term investors who trade monthly or every several months. On this basis, and in view of the long memory of market volatility, Corsi [9] set up a volatility forecasting model, the HAR-RV model. It is defined as

$$\mathrm{RV}_{t+1}^{d} = c + \beta_d\,\mathrm{RV}_t^{d} + \beta_w\,\mathrm{RV}_t^{w} + \beta_m\,\mathrm{RV}_t^{m} + \omega_{t+1}.$$

We substitute ARV for RV and get the HAR-ARV model:

$$\mathrm{ARV}_{t+h} = c + \beta_d\,\mathrm{ARV}_t^{d} + \beta_w\,\mathrm{ARV}_t^{w} + \beta_m\,\mathrm{ARV}_t^{m} + \omega_{t+h}, \tag{13}$$

where $h$ is the forecast horizon and $\mathrm{ARV}_t^{d}$, $\mathrm{ARV}_t^{w}$, and $\mathrm{ARV}_t^{m}$ denote the daily, weekly, and monthly ARV in phase $t$. The model reflects that market volatility is a mixture of different volatility components, the combined result of the trading behavior of short-term, medium-term, and long-term investors.
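The daily, weekly, and monthly regressors of an HAR-type model can be built as in the following sketch. It assumes the window lengths commonly used with Corsi's model, 5 trading days for a week and 22 for a month, which match the horizons used later in this paper but are not spelled out in the text above; the function name and data are hypothetical.

```python
def har_regressors(arv, t, week=5, month=22):
    """Return (daily, weekly, monthly) ARV at index t.

    Weekly and monthly components are simple averages of the last
    `week` and `month` daily values up to and including day t.
    """
    if t + 1 < month:
        raise ValueError("not enough history for the monthly average")
    daily = arv[t]
    weekly = sum(arv[t - week + 1 : t + 1]) / week
    monthly = sum(arv[t - month + 1 : t + 1]) / month
    return daily, weekly, monthly

# Hypothetical daily ARV series 1.0, 2.0, ..., 30.0.
arv_series = [float(i) for i in range(1, 31)]
daily, weekly, monthly = har_regressors(arv_series, t=21)
```

Stacking these triples over all admissible `t`, together with a constant, gives the design matrix of the regression.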
Corsi [9] found that the logarithm of the ARV sequence is closer to a normal distribution than the original ARV sequence. For the sake of robustness and forecasting accuracy, we therefore change model (13)

Construction of the HAR-CJ-M Model.
The basis for constructing the HAR-ARV model is the Heterogeneous Market Hypothesis, which is also a key hypothesis in Behavioral Finance Theory. According to Behavioral Finance Theory, financial markets are not always efficient, and investors' irrational behavior has a certain influence on the volatility of financial markets. When studying the volatility of financial markets, it is therefore necessary to consider this influence. Grinblatt and Han [17] and Frazzini [18] found that the disposition effect prevents stock prices from fully reflecting information, so that the momentum effect emerges. Accordingly, previous gains become positively correlated with current gains, and previous losses with current losses. The momentum effect therefore reinforces the rise and fall of the market and increases the volatility of stock markets. Here, we again substitute ARV for RV, decompose ARV into $C$ and $J$, and introduce the three capital gain overhangs $M$ into the three models above, which gives three new models, (18a), (18b), and (18c). $M_t^{d}$ is the daily capital gain overhang in phase $t$, which can affect the trading decisions of short-term investors and produce a momentum effect, thus affecting short-term market volatility. The three kinds of capital gain overhang $M$ can therefore all produce the momentum effect and affect the volatility of the whole market. $C_t^{d}$, $C_t^{w}$, $C_t^{m}$, $J_t^{d}$, $J_t^{w}$, and $J_t^{m}$ are the daily, weekly, and monthly continuous sample path variations and discontinuous jump variations. The volatility innovations $\tilde{\varepsilon}_{t+1}^{d}$, $\tilde{\varepsilon}_{t+1}^{w}$, and $\tilde{\varepsilon}_{t+1}^{m}$ are all contemporaneously and serially independent zero-mean nuisance variables.
According to Corsi's research [9], models (18a), (18b), and (18c) can be combined into a single equation that defines $\tilde{\sigma}_{t+1}$, with $\omega_{t+1} = \tilde{\varepsilon}_{t+1} - \omega_{t+1}^{d}$. Following Andersen et al. [12], we take logarithms in a similar way for the independent variables involving $M$ in model (21); that is, the nonnegative parts enter in the form $\ln(M + 1)$ and the negative parts in the form $\ln(-M + 1)$, where $M$ stands for the respective part. Consequently, from model (21) we obtain the logarithmic HAR-CJ-M model (22).

It can be seen from Table 1 that the $\mathrm{ARV}_t$ sequence shows a pronounced sharp peak and fat tail and is not normally distributed, which indicates that volatility in the Chinese stock market is large. Besides, the ADF test clearly rejects the unit-root hypothesis for every sequence at the 90% confidence level, so every sequence can be regarded as stationary and further modeling analysis can proceed.
In Figure 1 ... conclusions of previous studies. In addition, from the correlation functions between $\mathrm{ARV}_{t+h}$ and the other 8 variables, we find that all function values over the next 25 phases are greater than 0, so the past values of all these variables contain some information for forecasting the future $\mathrm{ARV}_t$ in the Chinese stock market. However, the correlation function values of two of these variables with $\mathrm{ARV}_{t+h}$ are very small, which shows that these two variables have relatively weak forecasting power for the future $\mathrm{ARV}_t$. Based on the above analyses, the capital gain overhang $M_t$ in the Chinese stock market carries additional information for forecasting the future $\mathrm{ARV}_t$. We can therefore roughly judge that introducing the momentum effect (capital gain overhang) into the HAR-CJ model can improve the model's forecasts of the future $\mathrm{ARV}_t$ in the Chinese stock market.

Parameter Estimation.
To show the advantage of the new model (the HAR-CJ-M model) in measuring volatility in the Chinese stock market, we first estimate its parameters and, for comparison, those of the HAR-ARV and HAR-CJ models (the HAR-CJ-M, HAR-ARV, and HAR-CJ models mentioned here and below are all in logarithmic form, that is, model (22), model (14), and model (16)). As HAR-type models capture the heterogeneity of the market through participants who trade at daily, weekly, and monthly frequencies, this paper chooses three values for $h$ (1, 5, and 22); namely, $\mathrm{ARV}_{t+1}$, $\mathrm{ARV}_{t+5}$, and $\mathrm{ARV}_{t+22}$ represent the ARV of the future 1 day, 1 week, and 1 month in the Chinese stock market, respectively. Standard OLS estimators are consistent and asymptotically normal, but when multistep-ahead forecasts are considered, the overlapping regressors make the usual inference inappropriate. We therefore estimate the above models by OLS with the Newey-West covariance correction.
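OLS with a Newey-West covariance correction can be illustrated with the following dependency-free sketch. It uses the Bartlett kernel without a small-sample adjustment; the lag length, the function names, and the toy data are assumptions made for illustration, since the text does not state the settings actually used.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("singular matrix")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def ols_newey_west(X, y, lags):
    """Return (coefficients, Newey-West standard errors) for y = X b + e."""
    k = len(X[0])
    Xt = transpose(X)
    xtx_inv = inverse(matmul(Xt, X))
    beta = [row[0] for row in matmul(xtx_inv, matmul(Xt, [[v] for v in y]))]
    resid = [yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)]
    g = [[x * e for x in xi] for xi, e in zip(X, resid)]  # scores x_t * e_t
    S = matmul(transpose(g), g)                           # lag-0 term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)                      # Bartlett weight
        gamma = matmul(transpose(g[lag:]), g[:-lag])      # sum of g_t g_{t-lag}'
        for i in range(k):
            for j in range(k):
                S[i][j] += w * (gamma[i][j] + gamma[j][i])
    cov = matmul(matmul(xtx_inv, S), xtx_inv)             # sandwich estimator
    se = [math.sqrt(max(cov[i][i], 0.0)) for i in range(k)]
    return beta, se

# Toy data: y = 1 + 2x plus a small alternating disturbance.
x = list(range(10))
X = [[1.0, float(v)] for v in x]
y = [1.0 + 2.0 * v + (0.1 if v % 2 == 0 else -0.1) for v in x]
beta, se = ols_newey_west(X, y, lags=2)
```

The correction changes only the standard errors; the coefficient estimates are the ordinary OLS ones. For an $h$-step-ahead regression, a lag length of at least $h - 1$ is a common choice because the overlapping targets induce serial correlation of that order.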
The estimation results of the HAR-CJ-M model are shown in Table 2. When forecasting the future 1-day, 1-week, and 1-month ARV in the Chinese stock market, the coefficients of the daily, weekly, and monthly continuous sample path variations in phase $t$, $\ln(C_t^{d})$, $\ln(C_t^{w})$, and $\ln(C_t^{m})$, are all positive and significant at the 1% level. This shows that the past continuous sample path variation in the Chinese stock market contains information for forecasting the future ARV. However, the coefficient of the daily discontinuous jump variation $\ln(J_t^{d})$ in phase $t$ is significant only when forecasting the future 1-day ARV, while neither the coefficient of the weekly discontinuous jump variation $\ln(J_t^{w})$ nor that of the monthly discontinuous jump variation $\ln(J_t^{m})$ is significant. The discontinuous jump variation in the Chinese stock market therefore has weak forecasting power for the future ARV. For the momentum effect factor newly added to the HAR-CJ model (the capital gain overhang $M_t$), all coefficients are positive and significant at the 10% level, except that the coefficient of the nonnegative part of the daily capital gain overhang is not significant when forecasting the future 1-week and 1-month ARV. This shows that the information contained in the capital gain overhang $M_t$ in the Chinese stock market has good forecasting power for the future ARV. In this paper, we treat the CSI 300 as a stock portfolio, so that the momentum effect can explain part of the estimation results of the HAR-CJ-M model. We know from Grinblatt and Han's research that the momentum effect leads to a positive correlation between the previous gains and losses of the CSI 300 (expressed by the capital gain overhang $M_t$) and its current gains and losses, respectively; hence the momentum effect reinforces the rise and fall of the CSI 300 and adds to its volatility. Therefore, the nonnegative part of the past capital gain overhang in the Chinese stock market is positively correlated with the future ARV and the negative part is negatively correlated with it, and both help forecast the future ARV to some extent.

We now analyze the capital gain overhang at the different horizons (daily, weekly, and monthly). The daily capital gain overhang $M_t^{d}$ represents the behavior of short-term investors in phase $t$ in the Chinese stock market, and the reference price of short-term investors is the 5-day moving average $\mathrm{RP}_t^{d}$. When the price in phase $t$ is higher than $\mathrm{RP}_t^{d}$ (namely, $M_t^{d} > 0$), the disposition effect suppresses a further rise of the stock price; when the price in phase $t$ is lower than $\mathrm{RP}_t^{d}$ (namely, $M_t^{d} < 0$), the disposition effect suppresses a further fall of the stock price. The stock price then does not fully reflect the information of phase $t$, and the momentum effect emerges. After phase $t$, the market gradually begins to reflect the previous information, so the momentum effect reinforces the rise and fall of the market and increases market volatility. Hence, the nonnegative part of the daily capital gain overhang is positively correlated with the future ARV, and the negative part is negatively correlated with the future ARV. We can see from Table 2 that the coefficient of the negative part is clearly greater than that of the nonnegative part, and that the latter is not significant when forecasting the future 1-week and 1-month volatility. This means that short-term investors in the Chinese stock market hold different attitudes towards gains and losses of the same size in previous phases. The influence of previous losses on short-term investors is clearly greater than that of gains, which may be caused by the loss aversion of short-term investors. Similarly, the momentum effect can explain the forecasting power of the weekly capital gain overhang $M_t^{w}$ and the monthly capital gain overhang $M_t^{m}$ for the future ARV in the Chinese stock market. Unlike the daily capital gain overhang $M_t^{d}$, the coefficients of the nonnegative part and the negative part of both the weekly capital gain overhang $M_t^{w}$ and the monthly capital gain overhang $M_t^{m}$ are approximately equal, showing that medium-term and long-term investors in the Chinese stock market have basically the same attitude towards gains and losses of the same size in previous phases, and their loss aversion is not obvious. This also suggests that medium-term and long-term investors are more rational than short-term ones.
The estimation results of the HAR-ARV and HAR-CJ models are shown in Tables 3 and 4, respectively. From the estimation results in Table 3, we find that the coefficients of the daily ARV ($\ln(\mathrm{ARV}_t^{d})$), the weekly ARV ($\ln(\mathrm{ARV}_t^{w})$), and the monthly ARV ($\ln(\mathrm{ARV}_t^{m})$) in phase $t$ are all positive and significant at the 1% level when the model forecasts the future 1-day, 1-week, or 1-month ARV in the Chinese stock market. This shows that ARV in the Chinese stock market has strong long memory and that past volatility contains information for forecasting future volatility. It also shows that volatility in the Chinese stock market is affected by different past volatility components, which are produced by investors with different holding periods (short-term, medium-term, and long-term). This result supports the existence of heterogeneous investors in the Chinese stock market, in line with the Heterogeneous Market Hypothesis. From the estimation results in Table 4, when forecasting the future 1-day, 1-week, and 1-month ARV in the Chinese stock market, the significance levels of the coefficients of $\ln(C_t^{d})$, $\ln(C_t^{w})$, $\ln(C_t^{m})$, $\ln(J_t^{d})$, $\ln(J_t^{w})$, and $\ln(J_t^{m})$ show that the continuous sample path variation has good forecasting power for the future ARV, while the discontinuous jump variation has weak forecasting power. This is in line with the conclusion drawn from the HAR-CJ-M model.
Comparing the adjusted coefficient of determination (Adj-$R^2$) of the HAR-CJ-M, HAR-ARV, and HAR-CJ models, we find that the Adj-$R^2$ of the HAR-CJ-M model is clearly greater than that of the HAR-CJ and HAR-ARV models. When the three models measure ARV at the future 1-day, 1-week, and 1-month horizons, the Adj-$R^2$ of the HAR-CJ-M model is 0.0356, 0.0510, and 0.0775 higher than that of the HAR-CJ model, respectively, and 0.0582, 0.0719, and 0.0825 higher than that of the HAR-ARV model, respectively. This shows that the past capital gain overhang in the Chinese stock market contains considerable information for forecasting the future ARV.
4.3. Robustness to Models. This paper adopts the method of Grinblatt and Han [17] to explain the momentum effect, so the choice of reference price in the capital gain overhang can strongly influence the results and is crucial in this paper. In the empirical analysis above, we take the 5-day, 5-week (25 days), and 5-month (110 days) moving averages as the reference prices for short-term, medium-term, and long-term investors in the Chinese stock market, respectively. Here we adopt the 10-day, 10-week (50 days), and 10-month (220 days) moving averages of the CSI 300 as the reference prices to test the robustness of the results in Section 4.2. The estimation results of the HAR-CJ-M model are shown in Table 5. Most of the coefficients of the capital gain overhang $M_t$ are significant, showing that the past capital gain overhang in the Chinese stock market helps forecast the future ARV to some extent. Moreover, the Adj-$R^2$ of the HAR-CJ-M model with the 10-day, 10-week (50 days), and 10-month (220 days) moving averages of the CSI 300 as reference prices is clearly greater than that of the HAR-CJ and HAR-ARV models, which accords with the results in Section 4.2.
However, its Adj-$R^2$ is smaller than that of the HAR-CJ-M model with the 5-day, 5-week (25 days), and 5-month (110 days) moving averages as reference prices. This shows that the 5-day, 5-week (25 days), and 5-month (110 days) moving averages have more influence on the decisions of short-term, medium-term, and long-term investors in the Chinese stock market. Therefore, the 5-day, 5-week (25 days), and 5-month (110 days) moving averages are the more suitable reference prices for forecasting the future ARV in the Chinese stock market.
In Table 7, it can be found that, except that the MAPE of the HAR-CJ-M model is greater than that of the HAR-CJ model, and that of the HAR-CJ model greater than that of the HAR-ARV model, when forecasting the 1-week ARV, the remaining values of MAE, MAPE, RMSE, HRMSE, and the Theil coefficient of the HAR-CJ-M model are all smaller than those of the HAR-CJ model, and the MAE, MAPE, RMSE, HRMSE and

Conclusion
Considering the crucial role of the overnight return variance in the volatility of the Chinese stock market as a whole, we convert RV into ARV and set up the HAR-CJ-M model on the basis of the HAR-CJ model and the momentum effect. We then take 5-minute high-frequency data of the CSI 300 as the sample for the empirical analysis and estimate the parameters of the HAR-CJ-M, HAR-ARV, and HAR-CJ models. Finally, we compare the three models' forecasts of the future ARV in the Chinese stock market by means of loss functions.
In the HAR-CJ-M model, most coefficients of the momentum effect (capital gain overhang) at the different horizons (daily, weekly, and monthly) are significant, showing that the irrational behavior of different kinds of investors in the Chinese stock market helps forecast future volatility to some extent. In addition, the estimation results of this model and of the HAR-CJ model show that the past continuous sample path variation in the Chinese stock market helps forecast future volatility, while the past discontinuous jump variation has very poor forecasting power, which is in line with the conclusion of Wang et al. [13]. The estimation results of the HAR-ARV model show that the volatility of the Chinese stock market is influenced by different past volatility components, which are produced by investors with different holding periods (short-term, medium-term, and long-term). This result supports the heterogeneity of Chinese stock investors, which accords with the Heterogeneous Market Hypothesis. Besides, the comparison of the three models' forecasting performance shows that the HAR-CJ-M model, which includes the momentum effect, forecasts the future volatility of the Chinese stock market much better than the other two models. This shows that investors' irrational factors do affect the volatility of the Chinese stock market. On this basis, a volatility model that takes investors' irrational factors into consideration forecasts the volatility of the Chinese stock market better, and the HAR-CJ-M model is more useful for practical problems such as financial risk measurement, asset pricing, and derivatives pricing. Although the HAR-CJ-M model forecasts future volatility in the Chinese stock market well, its Adj-$R^2$ is below 0.7 at all three horizons (1 day, 1 week, and 1 month). It is therefore necessary to improve the model's forecasting accuracy further. In future work, we will take more of investors' irrational factors into account on the basis of this paper in order to further improve the model's forecasts of volatility in the Chinese stock market.

4.4. Forecasts

4.4.1. In-Sample Forecasts. Figures 2(a), 2(b), and 2(c) show the three in-sample forecast volatility sequences obtained from the HAR-CJ-M, HAR-ARV, and HAR-CJ models together with the real volatility sequence. We use loss functions to evaluate how well the HAR-CJ-M, HAR-ARV, and HAR-CJ models forecast volatility in the Chinese stock market. We mainly choose four loss functions for the evaluation, namely, the mean absolute error (MAE), the mean absolute percentage error (MAPE), the root mean squared error (RMSE), and the heteroskedasticity-adjusted root mean squared error (HRMSE), together with the Theil coefficient. The smaller the values of these five measures are, the better the model forecasts future volatility in the Chinese stock market. The MAE, MAPE, RMSE, HRMSE, and Theil coefficient for the in-sample forecasts from each of the three models, based on the data over the full sample period, are reported in
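The five evaluation measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own code: the formulas for MAPE, HRMSE, and the Theil coefficient are the commonly used textbook definitions and are an assumption here, since the paper's exact expressions are not reproduced in this section. The function name `forecast_losses` is hypothetical.

```python
import numpy as np

def forecast_losses(actual, forecast):
    """Return MAE, MAPE, RMSE, HRMSE, and the Theil coefficient.

    Assumed (textbook) definitions, with a = actual and f = forecast:
      MAE   = mean(|a - f|)
      MAPE  = mean(|(a - f) / a|)
      RMSE  = sqrt(mean((a - f)^2))
      HRMSE = sqrt(mean(((a - f) / a)^2))
      Theil = RMSE / (sqrt(mean(a^2)) + sqrt(mean(f^2)))
    `actual` must be strictly positive, which holds for a volatility series.
    """
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    err = a - f
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / a))),
        "RMSE": float(rmse),
        "HRMSE": float(np.sqrt(np.mean((err / a) ** 2))),
        "Theil": float(rmse / (np.sqrt(np.mean(a ** 2)) + np.sqrt(np.mean(f ** 2)))),
    }
```

Calling `forecast_losses(arv_true, arv_hat)` once per model and per horizon would give one row of a comparison table; a perfect forecast yields zero for all five measures.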

Figure 2: Comparison of the in-sample forecasting performance of the HAR-ARV, HAR-CJ, and HAR-CJ-M models: (a) 1 day; (b) 1 week; (c) 1 month. In each panel, ARV represents the true volatility; HAR-ARV, HAR-CJ, and HAR-CJ-M represent the forecast volatility of the HAR-ARV, HAR-CJ, and HAR-CJ-M models, respectively.

Figure 3: Comparison of the out-of-sample forecasting performance of the HAR-ARV, HAR-CJ, and HAR-CJ-M models: (a) 1 day; (b) 1 week; (c) 1 month. In each panel, ARV represents the true volatility; HAR-ARV, HAR-CJ, and HAR-CJ-M represent the forecast volatility of the HAR-ARV, HAR-CJ, and HAR-CJ-M models, respectively.
into $C_t$ and $J_t$ with the method mentioned in Section 2.2, and we get the HAR-CJ model, that is,

$$\mathrm{ARV}_{t+H} = \alpha_0 + \alpha_{cd} C_t + \alpha_{cw} C^{w}_t + \alpha_{cm} C^{m}_t + \alpha_{jd} J_t + \alpha_{jw} J^{w}_t + \alpha_{jm} J^{m}_t + \varepsilon_{t+H},$$

$$\ln(\mathrm{ARV}_{t+H}) = \alpha_0 + \alpha_{cd} \ln(C_t) + \alpha_{cw} \ln(C^{w}_t) + \alpha_{cm} \ln(C^{m}_t) + \alpha_{jd} \ln(J_t + 1) + \alpha_{jw} \ln(J^{w}_t + 1) + \alpha_{jm} \ln(J^{m}_t + 1) + \varepsilon_{t+H}. \quad (16)$$

Following Grinblatt and Han's research, we adopt the capital gain overhang $g_t$ to measure the gains and losses in the previous market. Meanwhile, considering that previous gains and losses differ for short-term, medium-term, and long-term investors, we divide $g_t$ into three kinds (daily, weekly, and monthly), in accordance with the construction of the HAR-ARV model. Moreover, because the ARV sequence is positive while the $g_t$ sequence takes both positive and negative values, and in order to allow previous gains and losses to influence current or future volatility differently, we divide the $g_t$ sequence into a nonnegative sequence and a negative sequence.

Following the way Corsi [9] deduces the HAR-RV model, we suppose that short-term investors are influenced by the long-term volatility, while long-term investors are not influenced by the short-term volatility. We define a partial volatility $\sigma^{(\cdot)}_t$, where $\sigma^{(d)}_t$ means the short-term (1-day)

$$\mathrm{ARV}_{t+1} = \alpha_0 + \alpha_{cd} C_t + \alpha_{cw} C^{w}_t + \alpha_{cm} C^{m}_t + \alpha_{jd} J_t + \alpha_{jw} J^{w}_t + \alpha_{jm} J^{m}_t + \alpha_{gdp}\, g^{d+}_t + \alpha_{gdn}\, g^{d-}_t + \alpha_{gwp}\, g^{w+}_t + \alpha_{gwn}\, g^{w-}_t + \alpha_{gmp}\, g^{m+}_t + \alpha_{gmn}\, g^{m-}_t + \varepsilon_{t+1}.$$

With this equation changed into logarithm form and the forecast period extended to $H$ periods, we get the logarithm form of the HAR-CJ-M model, that is,

$$\ln(\mathrm{ARV}_{t+H}) = \alpha_0 + \alpha_{cd} \ln(C_t) + \alpha_{cw} \ln(C^{w}_t) + \alpha_{cm} \ln(C^{m}_t) + \cdots$$

Data and Summary Statistics. The CSI 300 is a component stock index made up of 300 stocks selected from the Shanghai and Shenzhen stock markets. It covers about 60% of the market value of the two markets, and its daily correlation coefficients with the Shanghai and Shenzhen stock indexes reach 98.4% and 97.6%, respectively, so it represents the state of the Chinese stock market well. In addition, the sampling frequency of the intraday data also greatly affects the results of the study. On the one hand, too low a sampling frequency cannot reflect the volatility information of the day well. On the other hand, too high a frequency may introduce microstructure noise and affect the results. Taking both influences into consideration and referring to previous studies, we use 5-minute high-frequency data of the CSI 300 as the sample to study volatility in the Chinese stock market. The data come from the WIND financial database. The sample period begins on April 20, 2007, and ends on April 20, 2012, covering 1199 trading days and 58751 valid observations. The variables needed in this paper, such as $\mathrm{ARV}_t$, are all processed with Matlab 7.0 or Excel 2003.

From these 58751 observations, we find that the overnight return variance $r^{2}_{n,t}$ in the Chinese stock market makes up 26.4% of the whole-market volatility; namely, $r^{2}_{n,t}/(\mathrm{RV}_t + r^{2}_{n,t})$ equals 0.264. The overnight return variance should therefore be considered when calculating RV for the Chinese stock market, and the adjustment of RV in this paper is necessary. Table 1 gives the descriptive statistics of variables including the daily adjusted realized volatility $\mathrm{ARV}_t$, the daily continuous sample path variation $C_t$, the daily discontinuous jump variation $J_t$, and the nonnegative part of the daily capital gain overhang in the Chinese stock market. We can see from Table
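The adjustment of RV and the heterogeneous (daily, weekly, monthly) regressors can be illustrated with a short Python sketch. It rests on several assumptions that are not confirmed by this section: ARV is taken as the sum of squared intraday returns plus the squared overnight return, the overnight share is computed as a ratio of sample totals, the weekly and monthly components are 5-day and 22-day trailing averages as in Corsi-type HAR models, and the regression is the logarithmic HAR-ARV form with a 1-day horizon. All function names are hypothetical.

```python
import numpy as np

def adjusted_rv(intraday_returns, overnight_returns):
    """ARV_t = RV_t + (overnight return)^2, with RV_t the sum of squared
    intraday returns (assumed definition).
    intraday_returns: array of shape (days, intervals_per_day)
    overnight_returns: array of shape (days,)
    """
    intraday = np.asarray(intraday_returns, dtype=float)
    overnight = np.asarray(overnight_returns, dtype=float)
    rv = np.sum(intraday ** 2, axis=1)
    return rv + overnight ** 2

def overnight_share(intraday_returns, overnight_returns):
    """Share of total ARV due to squared overnight returns
    (ratio of sample totals; an assumption about how the share is averaged)."""
    overnight = np.asarray(overnight_returns, dtype=float)
    arv = adjusted_rv(intraday_returns, overnight)
    return float(np.sum(overnight ** 2) / np.sum(arv))

def har_design(arv, week=5, month=22):
    """Build X = [1, ln ARV_t, ln ARV_t^(w), ln ARV_t^(m)] and y = ln ARV_{t+1}.
    Weekly and monthly terms are trailing means over `week` and `month` days."""
    arv = np.asarray(arv, dtype=float)
    rows = []
    for t in range(month - 1, len(arv) - 1):
        weekly = arv[t - week + 1 : t + 1].mean()
        monthly = arv[t - month + 1 : t + 1].mean()
        rows.append([1.0, np.log(arv[t]), np.log(weekly), np.log(monthly)])
    X = np.array(rows)
    y = np.log(arv[month:])  # next-day target aligned with each row of X
    return X, y

def fit_ols(X, y):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

Extending the design matrix with columns for the continuous, jump, and capital gain overhang components would give the corresponding HAR-CJ and HAR-CJ-M regressions under the same assumptions.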

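Table 1: Descriptive statistics for CSI 300.
ARV, C, J, gdp, gdn, gwp, gwn, gmp, and gmn represent $\mathrm{ARV}_t$, $C_t$, $J_t$, and the nonnegative and negative parts of the daily, weekly, and monthly capital gain overhang, respectively. ***, **, and * in the table mean significance at the 1%, 5%, and 10% levels, respectively; the same applies to the following tables.

Correlation of $\mathrm{ARV}_{t+h}$ with $x_t$ as a function of $h$, with $x_t$ being $\mathrm{ARV}_t$ itself, $C_t$, $J_t$, and the six capital gain overhang components. From the correlation function between $\mathrm{ARV}_t$ and $\mathrm{ARV}_{t+h}$ (namely, the autocorrelation function of $\mathrm{ARV}_t$), we find that $\mathrm{ARV}_t$ in the Chinese stock market has an obvious long-memory character. Thus, the past $\mathrm{ARV}_t$ has a certain forecasting effect on the future $\mathrm{ARV}_t$, which is in line with the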
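The autocorrelation function of the ARV series, used above as evidence of long memory, can be computed as in the following minimal Python sketch. It uses the standard sample autocorrelation estimator, which is an assumption about how the paper computed it; the function name `sample_autocorr` is hypothetical.

```python
import numpy as np

def sample_autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag.

    rho(h) = sum_t (x_t - mean)(x_{t+h} - mean) / sum_t (x_t - mean)^2
    A slowly decaying rho(h) is the usual informal sign of long memory.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d ** 2)
    n = len(x)
    return np.array([np.sum(d[h:] * d[: n - h]) / denom for h in range(max_lag + 1)])
```

Plotting `sample_autocorr(arv, 60)` against the lag would show whether the correlations die out quickly or persist over many days.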

Table 2: Results of parameter estimation for HAR-CJ-M model.

Table 3: Estimation results of the HAR-ARV model.

Table 4: Estimation results of the HAR-CJ model.

Table 5: Estimation results of the HAR-CJ-M model.

Table 6: In-sample forecast statistics.

Table 7: Out-of-sample forecast statistics.