The Dynamic Cross-Correlations between Mass Media News, New Media News, and Stock Returns

We investigate the dynamic cross-correlations between mass media news, new media news, and stock returns for the SSE 50 Index in the Chinese stock market by employing the MF-DCCA method. The empirical results show that (1) there exist power-law cross-correlations between the two types of news as well as between news and the corresponding SSE 50 Index return; (2) the cross-correlations between mass media news and SSE 50 Index returns show larger multifractality and more complicated structures; (3) mass media news and new media news have both complementary and competitive relationships; (4) with the rolling window analysis, we further find a general increasing trend in the cross-correlations between the two types of news as well as in the cross-correlations between news and returns, and these cross-correlations become more persistent over time.


Introduction
According to the efficient market hypothesis, the security market reflects information instantly. However, many anomalies show that exogenous information plays an important role in the stock market. News, for example, as one type of exogenous information, has been intensively investigated for its influence on the stock market, including the relation between news and stock prices [1,2], stock returns [3], trading volumes [4], investors' behavior [5,6], and the correlation between the sentiment behind the news and the stock market [7]. With the development of the Internet, the web has become one of the most important sources of news. Therefore, news is classified into traditional mass media news and new media news according to its source (for the definitions, see https://en.wikipedia.org/wiki/Newmedia and https://en.wikipedia.org/wiki/Massmedia). The new source brings new features to the news. Through the Internet, new media news diffuses more quickly and can be obtained more conveniently. Compared to new media news, mass media news is more rigorous and offers more insightful ideas. So the emergence of the new source of news leads to new directions for financial research. One is the correlation between the different news sources, especially when they are treated as factors influencing the stock market, as well as the different roles they play in the stock market. Since the amount of daily news can be seen as the intensity of the information [8], it is meaningful to explore the correlation of the amount of news.
The most commonly used measure of correlation is the Pearson correlation. To use this measure, the time series must be stationary and follow the Gaussian distribution. But not all news amount series in the stock market are stationary and Gaussian. To overcome these limits, the existing literature has explored measures of long-range cross-correlation between two nonstationary, non-Gaussian time series. Based on the method of detrended fluctuation analysis (DFA), introduced by Peng et al. [9] to study the long-range autocorrelation of DNA series, Podobnik and Stanley [10] proposed the method of detrended cross-correlation analysis (DCCA) to explore the long-range cross-correlation between two nonstationary time series. Then Zhou [11] introduced the multifractal detrended cross-correlation analysis (MF-DCCA, also called MF-DXA) to investigate the cross-correlation by considering

News and Stock Market Performances.
Existing literature on the relationships between financial news and stock market behavior can be divided into two categories. The first investigates the impacts of mass media news, that is, news extracted from newspapers, television, and advertising, on returns, trading volume, liquidity, and volatility [1,4,7,12-17]. The majority of these studies focus on earnings announcements, the number of headlines, and the expenditures on advertising, and only a few employ the information content. The second investigates the relationships between new media news and return predictability as well as market dynamics [18-31]. This line of work has shifted research interest away from building more complicated models toward the data and its impacts on market dynamics.

MF-DCCA.
The existing literature on detrended cross-correlation analysis can be roughly divided into two parts. One part extends and modifies the methodology, and a series of extended methods have been introduced, such as MF-X-DMA [32], MF-DHA [33], and MF-X-PF [34]. These methods all improve the efficiency of the MF-DCCA method to some extent. The other part concentrates on the cross-correlation between time series within a financial market or across markets. Wang et al. [35] used MF-DCCA and found significant multifractal cross-correlations between the Chinese A-share and B-share markets. Cao et al. [36] investigated the multifractal detrended cross-correlation between the Chinese exchange market and stock market and drew similar conclusions. Gu et al. [37] used MF-DCCA to show that the cross-correlation between multifractality and market efficiency behaves differently before and after the equity division reform. Lu et al. [38] investigated the dynamic relationship between Japanese Yen exchange rates and market anxiety, finding that the cross-correlation exhibits different volatility. Li et al. [39] studied the cross-correlation between the crude oil and exchange rate markets, finding that their cross-correlations are sensitive to sudden events.

Data Description
We obtain daily news data from the Security Information and News column of the RESSET financial database in China. The stocks corresponding to the news sample are the constituent stocks of the SSE 50 Index (SSE 50). The daily returns of SSE 50 are also obtained from the RESSET financial database. The stocks of SSE 50 are the most active stocks with the highest market value and the highest liquidity in the Chinese stock markets. So they represent the Chinese stock market well and have enough news to investigate the cross-correlation between the two types of news and the cross-correlation between news and return. Since the constituent stocks are adjusted every half year, we collect all the stocks once they are selected as constituent stocks of SSE 50. We then remove the stocks which are delisted. The whole sample period extends from January 1, 2011 to December 31, 2016 with daily observations. The total number of the stocks used in this paper is 532 across this period.
We classify the news into mass media news and new media news, according to the source names of the news, based on the selection process used by Zhang et al. [40]. After the selection process, the numbers of mass media sources and new media sources are 265 and 334, respectively. Since the constituent stocks of SSE 50 vary over time and our aim is to investigate the cross-correlations between the amount of news and the return of SSE 50, the daily news used in this paper is just the news of the stocks belonging to SSE 50 on that day. Figure 1 illustrates the daily amount of the two types of news from January 2011 to December 2015. The red line is the amount of daily mass media news (MM) and the blue line is the amount of daily new media news (NM). The amount of mass media news follows a stable trend across the sample period, while the amount of new media news shows an increasing trend on the whole. The quantitative relationship between the two types of news reverses around 2013. After 2013, the amount of new media news exceeds the amount of mass media news.

Empirical Methodology
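To explore the cross-correlation between two nonstationary time series, various approaches have been introduced in the existing literature. In this paper, we follow the methodology of multifractal detrended cross-correlation analysis introduced by Zhou [11]. Consider any two time series $\{x_t\}$ and $\{y_t\}$, $t = 1, 2, \ldots, N$, that have the equal length $N$. Then the procedure of the MF-DCCA can be described as follows.
Step 1. Construct the profile of each series by cumulating its deviations from the mean: $$X(i) = \sum_{t=1}^{i} (x_t - \bar{x}), \quad Y(i) = \sum_{t=1}^{i} (y_t - \bar{y}), \quad i = 1, 2, \ldots, N,$$ where $\bar{x}$ and $\bar{y}$ are the sample means of the two series.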
Step 2. Divide the two profiles into $N_s = [N/s]$ nonoverlapping windows of equal length $s$, respectively. If the length $N$ of the time series is not a multiple of the considered time scale $s$, a short segment remains at the end of each profile. So that this part is not discarded, we repeat the same division procedure starting from the end of the profile and thus obtain $2N_s$ segments for each profile. In this paper, $s$ is set as $10 < s < N/4$ with a step of 1.
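Step 3. Estimate the local linear trend of each segment through the OLS method, and compute the detrended covariance of each segment $v$: $$F^2(s, v) = \frac{1}{s} \sum_{i=1}^{s} \left| X((v-1)s + i) - \tilde{X}_v(i) \right| \left| Y((v-1)s + i) - \tilde{Y}_v(i) \right|$$ for $v = 1, 2, \ldots, N_s$, and $$F^2(s, v) = \frac{1}{s} \sum_{i=1}^{s} \left| X(N - (v - N_s)s + i) - \tilde{X}_v(i) \right| \left| Y(N - (v - N_s)s + i) - \tilde{Y}_v(i) \right|$$ for $v = N_s + 1, \ldots, 2N_s$, where $X$ and $Y$ denote the two profiles and $\tilde{X}_v(i)$ and $\tilde{Y}_v(i)$ are the fitted local trends in segment $v$.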
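Step 4. Get the $q$th-order fluctuation function by averaging over all segments: $$F_q(s) = \left\{ \frac{1}{2N_s} \sum_{v=1}^{2N_s} \left[ F^2(s, v) \right]^{q/2} \right\}^{1/q},$$ where $F^2(s, v)$ is the detrended covariance of the two profiles in segment $v$ after the local trends from Step 3 are removed. Particularly, for $q = 0$, the equation is defined by $$F_0(s) = \exp\left\{ \frac{1}{4N_s} \sum_{v=1}^{2N_s} \ln F^2(s, v) \right\}.$$
Step 5. Analyze the scaling behavior of the fluctuation function by observing the log-log plots of $F_q(s)$ versus $s$. If the two series are long-range cross-correlated, we get a power-law relationship as follows: $$F_q(s) \sim s^{h_{xy}(q)}.$$ The exponent $h_{xy}(q)$ can be obtained as the slope of the log-log plot of $F_q(s)$ versus $s$, and we calculate the slope using the method of OLS. The value of $h_{xy}(q)$ indicates the cross-correlation of the two time series. If $h_{xy}(q) > 0.5$, the cross-correlation between them is persistent (positive). When $h_{xy}(q) < 0.5$, there is an antipersistent (negative) cross-correlation between the two time series. For $h_{xy}(q) = 0.5$, the two time series are not cross-correlated with each other. In particular, when $q = 2$, MF-DCCA reduces to the standard DCCA and the scaling exponent $h_{xy}(2)$ plays the role of the Hurst exponent; for general $q$, $h_{xy}(q)$ is the generalized exponent.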
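The procedure above can be sketched in code as follows. This is a minimal Python sketch that uses only the standard library, not the authors' implementation. The function names (`profile`, `linear_residuals`, `fluctuation`, `scaling_exponent`) are hypothetical, the detrended covariance is taken with absolute values of the residuals (one common convention, assumed here so that negative $q$ is well defined), and the synthetic Gaussian series only stand in for the news and return data.

```python
import math
import random
from itertools import accumulate


def profile(series):
    """Cumulative sum of deviations from the mean."""
    mean = sum(series) / len(series)
    return list(accumulate(v - mean for v in series))


def linear_residuals(segment):
    """Residuals of an OLS straight-line fit against t = 0..s-1."""
    s = len(segment)
    t_mean = (s - 1) / 2
    y_mean = sum(segment) / s
    sxx = sum((t - t_mean) ** 2 for t in range(s))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(segment)]


def fluctuation(x, y, s, q):
    """q-th order fluctuation function F_q(s) of two equal-length series."""
    px, py = profile(x), profile(y)
    n_seg = len(px) // s
    # Segments counted from the start and again from the end: 2 * N_s in total.
    starts = [v * s for v in range(n_seg)]
    starts += [len(px) - (v + 1) * s for v in range(n_seg)]
    f2 = []
    for a in starts:
        rx = linear_residuals(px[a:a + s])
        ry = linear_residuals(py[a:a + s])
        # Assumption: absolute values keep F^2 positive for any real q.
        f2.append(sum(abs(u) * abs(w) for u, w in zip(rx, ry)) / s)
    if q == 0:
        # len(f2) equals 2 * N_s, so this is the 1 / (4 * N_s) average of ln F^2.
        return math.exp(sum(math.log(f) for f in f2) / (2 * len(f2)))
    return (sum(f ** (q / 2) for f in f2) / len(f2)) ** (1 / q)


def scaling_exponent(x, y, q, scales):
    """OLS slope of log F_q(s) against log s."""
    lx = [math.log(s) for s in scales]
    ly = [math.log(fluctuation(x, y, s, q)) for s in scales]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0, 1) for _ in range(1000)]          # synthetic stand-in
    y = [0.5 * a + rng.gauss(0, 1) for a in x]          # synthetic stand-in
    for q in (-2, 0, 2):
        print(q, round(scaling_exponent(x, y, q, [16, 32, 64, 128]), 3))
```

For uncorrelated noise such as the synthetic input, the estimated exponent should lie near 0.5; values clearly above 0.5 would indicate persistence in the sense described in Step 5.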

Cross-Correlation Test.
Before investigating the pairwise cross-correlation among the two types of news amount series and the return series of the index, a cross-correlation test using the method proposed by Podobnik and Stanley [10] is conducted first. The test statistic is defined as $$Q_{cc}(m) = N^2 \sum_{i=1}^{m} \frac{X_i^2}{N - i}.$$ Here, $X_i$ is the cross-correlation function and is defined as follows: $$X_i = \frac{\sum_{k=i+1}^{N} x_k y_{k-i}}{\sqrt{\sum_{k=1}^{N} x_k^2 \sum_{k=1}^{N} y_k^2}},$$ where $\{x_k\}$ and $\{y_k\}$ are two time series with the equal length $N$.
The test statistic $Q_{cc}(m)$ is approximately $\chi^2(m)$ distributed with $m$ degrees of freedom. The null hypothesis of this cross-correlation test is that none of the first $m$ cross-correlation coefficients is different from zero. Thereby, if the value of the test statistic exceeds the critical value of $\chi^2(m)$, the cross-correlation between the two time series is significant.
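As an illustration of the test, the statistic can be computed directly from its definition. The sketch below is a hedged example rather than the code used in the paper: the function names are hypothetical, the input series are assumed to be zero-mean (or demeaned beforehand), and the 5% critical value is obtained with the Wilson-Hilferty approximation to the chi-square quantile instead of an exact table.

```python
import math
import random


def cross_correlation(x, y, lag):
    """X_i: normalized cross-correlation of x_k with y_{k-i} at lag i."""
    num = sum(x[k] * y[k - lag] for k in range(lag, len(x)))
    den = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    return num / den


def q_cc(x, y, m):
    """Q_cc(m) = N^2 * sum_{i=1}^{m} X_i^2 / (N - i)."""
    n = len(x)
    return n * n * sum(cross_correlation(x, y, i) ** 2 / (n - i)
                       for i in range(1, m + 1))


def chi2_critical_95(m):
    """Approximate 95% quantile of chi-square(m) (Wilson-Hilferty)."""
    z = 1.6449  # 95% quantile of the standard normal distribution
    return m * (1 - 2 / (9 * m) + z * math.sqrt(2 / (9 * m))) ** 3


if __name__ == "__main__":
    rng = random.Random(0)
    y = [rng.gauss(0, 1) for _ in range(500)]   # synthetic stand-in series
    x = [0.0] + y[:-1]                          # x follows y with a one-step lag
    for m in (1, 5, 10):
        print(m, q_cc(x, y, m) > chi2_critical_95(m))
```

In the synthetic example, `x` is a lagged copy of `y`, so the statistic is expected to exceed the approximate critical value for each `m` shown.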
Figure 2 shows the result of the test statistic $Q_{cc}(m)$. The red line is the critical value of $\chi^2(m)$ at the 5% significance level. The other lines represent, from the top to the bottom of the figure, the values of $Q_{cc}(m)$ between the amount of the mass media news and the amount of the new media news, the amount of the new media news and the return of the SSE 50, the amount of total news and the return of SSE 50, and the amount of the mass media news and the return of SSE 50. It is clear that all the values of $Q_{cc}(m)$ exceed the critical value, so the null hypothesis is rejected and there is a long-range cross-correlation between any pair of these series.
Because the test above establishes the cross-correlation only qualitatively, the MF-DCCA is conducted to investigate the cross-correlation quantitatively. The order $q$ is set from −10 to 10 with a step of 1. Figures 3-6 are the log-log plots of $F_q(s)$ versus $s$ for the amount of MM news and the amount of NM news, the amount of MM news and the return of SSE 50, and the amount of NM news and the return of SSE 50, as well as the amount of total news and the return of SSE 50. As we can see in Figures 3-6, although there are fluctuations in some lines, they all fit the log-log line of $F_q(s)$ versus $s$ well at the 1% significance level, as can be seen from Table 1. These linear curves provide evidence for the existence of the power-law cross-correlation between pairs of these series.

Multifractal Detrended Cross-Correlation Analysis.
Figure 7 shows the relationship between the scaling exponents and $q$. The variation of the exponent with $q$ indicates that multifractality exists in the cross-correlation between each pair of series among the news and the return. Particularly, for the cross-correlations between the MM news amount and the NM news amount, the MM news amount and the return of SSE 50, and the total amount of news and the return of SSE 50, the scaling exponents for $q < 0$ are larger than those for $q > 0$. So we can conclude that the cross-correlations for small fluctuations are more persistent than the ones for large fluctuations. For the cross-correlation between the NM news amount and the return of SSE 50, the "normal" fluctuations display the most persistent cross-correlation. Table 1 reports the scaling exponents for these time series when $q$ is even (the conclusion is similar when $q$ is odd). The scaling exponents in Table 1 are all larger than 0.5, so the cross-correlations between the amount of MM news, the amount of NM news, the amount of total news, and the return of SSE 50 are persistent. In order to explore the degree of multifractality, the measure $\Delta h$ is introduced as follows [41]: $$\Delta h = h_{\max}(q) - h_{\min}(q),$$ where the larger $\Delta h$ is, the higher the degree of multifractality. The last line of Table 1 shows the value of $\Delta h$ for the four cross-correlations. $\Delta h$ for MM-NM indicates a strong multifractality for the cross-correlation between the amounts of the two types of news. The larger $\Delta h$ for MM-Return than for NM-Return shows stronger multifractal characteristics and a more complicated structure for the cross-correlation between the amount of mass media news and the return of SSE 50.
Table 1: Results of the MF-DCCA scaling exponent. This table reports the MF-DCCA scaling exponent between the amount of mass media news, the amount of new media news, the amount of total news, and the return of SSE 50. The symbol "MM-NM" denotes the scaling exponent between mass media news amount and new media news amount. The symbol "MM-Return" denotes the scaling exponent between mass media news amount and the return of SSE 50. The symbol "NM-Return" denotes the scaling exponent between new media news amount and the return of SSE 50. The symbol "TM-Return" denotes the scaling exponent between the total news amount and the return of SSE 50. The symbol "Δ" denotes the difference between TM-Return and the sum of MM-Return and NM-Return.
The last column of Table 1 shows the difference between TM-Return and the sum of MM-Return and NM-Return. The positive value in the last column reveals that the influence of the combined mass media and new media news on the return is smaller than the sum of their respective influences. This leads us to conclude that mass media news and new media news share a common component, which reflects a competitive relationship. These findings indicate that the two information sources convey overlapping information, and thus investors need to distinguish new information for their decision-making. On the other hand, the value of TM-Return is larger than the value of either MM-Return or NM-Return, which reveals a complementary relationship between them. This result is similar to the conclusions of Zhang et al. [40]. Particularly, the line of TM-Return is close to the line of MM-Return when $q < 0$ and close to the line of NM-Return when $q > 0$. This reflects the fact that, for small fluctuations, the cross-correlation between the amount of mass media news and the return of SSE 50 drives the cross-correlation between the total news amount and the return of SSE 50, whereas for big fluctuations the cross-correlation between the amount of new media news and the return of SSE 50 is the dominant factor. Generally speaking, all these findings suggest that both information sources, that is, mass media and new media, provide useful information to the financial market and influence the variations in asset prices.

In Figures 3-6, we can find a turning point in the linear trend of the curves. As suggested by Podobnik et al. [42], such a turning point, denoted $s^*$, marks a crossover between short-term ($s < s^*$) and long-term ($s > s^*$) scaling behavior.
Table 2: Short term and long term scaling exponents between news and return series. This table reports the short term and long term scaling exponents for the series between the mass media news amount and return of SSE 50, new media news amount and return of SSE 50, and total news amount and return of SSE 50. The symbol "MM-Return" denotes the scaling exponent of mass media news amount and return of SSE 50, "NM-Return" is the scaling exponent of new media news amount and return of SSE 50, and "TM-Return" denotes the scaling exponent of total news amount and return of SSE 50. $\Delta h$ is the multifractality degree. "Short" denotes the term $s < s^*$; "long" is the term $s > s^*$.
For the short term, the scaling exponents are all larger than 0.5 for the three pairs of series, which reflects a strong persistent cross-correlation, while the long-term scaling exponents show a different picture. The long-term scaling exponents are larger than 0.5 for the small fluctuations, denoting a strong persistent cross-correlation, whereas for the big fluctuations they are smaller than 0.5, reflecting an antipersistent cross-correlation (when $q > 4$ for MM-Return, $q > 6$ for NM-Return, and $q > 6$ for TM-Return). The last row reports the multifractality of the short- and long-term cross-correlations for MM-Return, NM-Return, and TM-Return. For each of the three pairs, the short-term $\Delta h$ is significantly smaller than the long-term one. Also, in Figures 8-10, which compare the short- and long-term cross-correlations for each of the three pairs, the short-term scaling exponents show a steady trend and the short-term lines are nearly parallel to the $q$-axis. So the multifractality of the scaling exponents is larger in the long term than in the short term, and the short-term cross-correlation between the news and the return of SSE 50 is more stable. Particularly, by comparing the short-term scaling exponents across the three pairs, we find that the short-term scaling exponents of MM-Return, NM-Return, and TM-Return are nearly equal to each other at the same order $q$, whereas for the long term, the scaling exponents of TM-Return are larger than either of the other two but smaller than their sum at any $q$. We attribute this to a competitive relationship in the short term and a complementary relationship in the long term between the two types of news. In the short term, the influence of mass media news on the return is the same as the influence of new media news on the return, and combining them does not increase the cross-correlation between news and return, while in the long term the complementary relationship between the two types of news increases the influence of the total news on the return.

Dynamic Analysis of Cross-Correlation.
In this part, we apply the rolling window method to explore the dynamic features of the cross-correlation. The length of the rolling windows selected in this paper is 250 trading days (approximately 1 year) [35]. Figure 11 shows the time-varying scaling exponents for $q = 2$ for the amount of mass media news and the amount of new media news. Although the process fluctuates, the general trend of $h_{xy}(2)$ is increasing, especially for the period after 2013, when the amount of new media news exceeds that of mass media news. The value of $h_{xy}(2)$ is larger than 0.5 throughout the period. So there is a persistent cross-correlation between the amount of mass media news and the amount of new media news, and it becomes more and more persistent over time. Similarly, Figure 12 shows the time-varying scaling exponents for $q = 2$ for news and return. There is also a general increasing trend for the news amount and return, and the value of $h_{xy}(2)$ is larger than 0.5 over time.
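The rolling window scheme itself is simple to express in code. The following Python sketch is illustrative only: `rolling_windows`, `rolling_statistic`, and `placeholder_stat` are hypothetical names, the step of one observation between consecutive windows is an assumption (the text states only the window length of 250 trading days), and the placeholder statistic would be replaced by whatever estimator of the $q = 2$ scaling exponent is in use.

```python
def rolling_windows(length, window=250, step=1):
    """Yield (start, end) index pairs of consecutive rolling windows."""
    for start in range(0, length - window + 1, step):
        yield start, start + window


def rolling_statistic(x, y, stat, window=250, step=1):
    """Apply stat(x_window, y_window) to every window and collect the results."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    return [stat(x[a:b], y[a:b])
            for a, b in rolling_windows(len(x), window, step)]


def placeholder_stat(a, b):
    """Stand-in statistic (sample covariance); NOT the MF-DCCA exponent."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


if __name__ == "__main__":
    x = [float(i % 7) for i in range(1000)]    # synthetic stand-in series
    y = [float(i % 5) for i in range(1000)]    # synthetic stand-in series
    values = rolling_statistic(x, y, placeholder_stat, window=250)
    print(len(values))   # one value per window
```

Passing the estimator in as a function keeps the windowing logic independent of the statistic, so the same helper can serve both the news-news and the news-return pairs.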
The cross-correlations between news amount and return of SSE 50 become more and more persistent. We attribute this to market inefficiency. According to the efficient market hypothesis, security prices reflect all the information [43]. So if the market is not efficient, the return of SSE 50 will depend on the amount of news. To further support this, we introduce the market inefficiency index [37]: $$\mathrm{EI} = |H - 0.5|,$$ where $H$ is the Hurst exponent calculated by the detrended fluctuation analysis (DFA).
In the existing literature, the Hurst exponent is regarded as a measure of market efficiency [44-47]. When the Hurst exponent is equal to 0.5 (the corresponding inefficiency index (EI) in this paper is equal to 0), the market is considered to be efficient. So the larger the value of EI, the more inefficient the market is. Figure 13 shows the time-varying EI with the rolling window of 250 days. In Figure 13, the value of EI is larger than 0 over time, indicating the inefficiency of the market. Moreover, the period after 2013 has a larger EI than the period before, which is also the period that has the larger scaling exponent for news amount and return of SSE 50. So we conduct the MF-DCCA to investigate the relationship between the cross-correlation for news and return and the inefficiency indices. Figure 14 shows the scaling exponents between the cross-correlation exponent for $q = 2$ for news and return and the market inefficiency indices. In Figure 14, the scaling exponents for MM-Return, NM-Return, and TM-Return are all larger than 0.5 at all scales, which indicates a persistent cross-correlation between each of these cross-correlations and the inefficiency index. Therefore, we can conclude that the news, no matter whether it is mass media news or new media news, plays an important role in the inefficient market of China now.
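To make the inefficiency index concrete, the sketch below estimates a Hurst exponent by DFA with linear detrending and then takes its absolute distance from 0.5. It is a minimal illustration under stated assumptions: the index is taken as $|H - 0.5|$, a form consistent with the statement that EI equals 0 when the Hurst exponent is 0.5; the function names and the choice of scales are hypothetical; and the synthetic series are not the SSE 50 data.

```python
import math
import random
from itertools import accumulate


def dfa_fluctuation(series, s):
    """Root-mean-square fluctuation F(s) after linear detrending of the profile."""
    mean = sum(series) / len(series)
    prof = list(accumulate(v - mean for v in series))
    n_seg = len(prof) // s
    t_mean = (s - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(s))
    total = 0.0
    for v in range(n_seg):
        seg = prof[v * s:(v + 1) * s]
        y_mean = sum(seg) / s
        slope = sum((t - t_mean) * (y - y_mean)
                    for t, y in enumerate(seg)) / sxx
        total += sum((y - y_mean - slope * (t - t_mean)) ** 2
                     for t, y in enumerate(seg)) / s
    return math.sqrt(total / n_seg)


def hurst_dfa(series, scales):
    """OLS slope of log F(s) against log s."""
    lx = [math.log(s) for s in scales]
    ly = [math.log(dfa_fluctuation(series, s)) for s in scales]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)


def inefficiency_index(series, scales):
    """Assumed form of the index: EI = |H - 0.5|."""
    return abs(hurst_dfa(series, scales) - 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    returns = [rng.gauss(0, 1) for _ in range(1000)]   # synthetic stand-in
    print(round(inefficiency_index(returns, [16, 32, 64, 128]), 3))
```

For uncorrelated synthetic returns the index should be close to 0, while a strongly trending input such as a random walk yields a much larger value.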

Conclusion
In this paper, we investigate the cross-correlations between two types of news (mass media news and new media news) about the stocks of SSE 50 as well as the corresponding return of SSE 50. By applying the MF-DCCA method, we draw the following conclusions. First, there are power-law cross-correlations between the amounts of the two types of news as well as between news and the corresponding returns. For the cross-correlation between the two types of news and the cross-correlation between mass media news and the return of SSE 50, the cross-correlations for small fluctuations are more persistent than the ones for large fluctuations, while for the cross-correlation between the amount of NM news and the return of SSE 50, the "normal" fluctuations display the most persistent cross-correlation. The cross-correlations all exhibit multifractality, but the cross-correlation between mass media news and the return of SSE 50 shows larger multifractality and a more complicated structure.
Second, the value of TM-Return is smaller than the sum of MM-Return and NM-Return, which reflects the fact that mass media news and new media news share a common component; hence there is a competitive relationship between them. On the other hand, the value of TM-Return is larger than the value of either MM-Return or NM-Return, which reveals a complementary relationship between them. This result is similar to the conclusion of Zhang et al. [40]. In addition, the short-term multifractality of the cross-correlation between news and return is smaller than the long-term one. Moreover, the short-term values of the scaling exponents of MM-Return, NM-Return, and TM-Return are close to each other, while the long-term value of TM-Return is larger than the value of either MM-Return or NM-Return. This leads us to conclude that there is a competitive relationship in the short term and a complementary relationship in the long term between the two types of news. In the short term, the influence of mass media news on returns is the same as the influence of new media news on returns, and their shared part does not increase the cross-correlation between news and returns, while in the long term the complementary relationship between the two types of news increases the influence of the total news on the return.
Third, by applying the rolling window method, we find a general increasing trend in the cross-correlation between the two types of news as well as in the cross-correlation between news and return, and the cross-correlations become more and more persistent over time. We attribute this to market inefficiency. The persistent cross-correlations between the inefficiency indices and the time-varying scaling exponents of the news and return indicate that the news, no matter whether it is mass media news or new media news, plays an important role in the inefficient market of China now.
Admittedly, the above results cannot completely reveal the relationship between the two types of news or the relationship between news and stock returns. To explore more precisely the role that mass media news and new media news play in the stock market, more work, for example, on herding behavior and structural breaks between the two types of news [48,49], needs to be done in the future. Besides, as suggested by Cajueiro and Tabak [50], examining the role of these two types of news in information efficiency is also a promising research direction. We leave these for future research.

Figure 1: The daily amounts of news from mass media and new media.

Figure 4: Log-log plots of $F_q(s)$ versus $s$ for amount of MM news and the return of SSE 50.

Figure 6: Log-log plots of $F_q(s)$ versus $s$ for amount of total news and the return of SSE 50.

Figure 8: Short and long scaling exponents for MM-Return and NM-Return.

Figure 12: Time varying scaling exponents for $q = 2$ for news and return.
The point $s^*$ refers to the "crossover." Through the crossover, the scaling exponents reflect different features in the short term ($s < s^*$) and the long term ($s > s^*$). In this paper, the crossovers of the amount of mass media news and the amount of new media news, the amount of mass media news and return of SSE 50, and new media news amount and return of SSE 50 as well as total news amount and return of SSE 50 are $\log(s^*) = 2.38$ (about 190 days), $\log(s^*) = 2.33$ (about 213 days), $\log(s^*) = 2.43$ (about 268 days), and $\log(s^*) = 2.29$ (about 195 days), respectively. Table 2 reports the short- and long-term scaling exponents for even $q$ for the series of news and returns.
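As a worked illustration of how a reported crossover maps to a time scale, and assuming the logarithm is taken to base 10 (an assumption, since the base is not stated), the scale in days follows by exponentiation:

```latex
s^* = 10^{\log_{10} s^*}, \qquad
10^{2.33} \approx 213.8, \qquad
10^{2.29} \approx 195.0 .
```

Under this assumption the two values shown agree with the corresponding day counts quoted for the MM-Return and TM-Return pairs.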