Simulation and Statistical Analysis of Market Return Fluctuation by Zipf Method

We investigate the fluctuation behavior of financial stock markets by Zipf analysis. We carry out empirical research to describe ensembles and specifics of stock price returns for global stock indices and give the corresponding Zipf distributions. First, we study the fluctuation behavior of global stock markets by the (m, k)-Zipf method. Then we consider a dynamic stock price model and analyze the absolute frequencies and the relative frequencies for this financial model. Finally, the Zipf distributions of returns for the SSE Composite Index are studied for different time scales.


Introduction
In recent years, much empirical research has been done on financial market fluctuations. Some statistical properties of market fluctuations have been uncovered from high-frequency financial time series, such as the fat-tailed distribution of price changes and the power law of logarithmic returns and volume; see, for example, Calvet and [6]. Their work shows that price changes are believed to follow a Gaussian distribution over long time intervals but to deviate from it over short time steps; the deviation appears in the tails of the distribution and is usually called the fat-tails phenomenon. Statistical analysis also indicates that the tails of price fluctuations follow power-law distributions. Power-law scaling in financial markets is an active research topic for understanding the distributions of financial price fluctuations. Meanwhile, the behavior of stock market returns has been studied by various methods, for example, Gaylord and Wellin [7], Harris and Küçüközmen [8], Holden et al. [9], Ilinski [10], Mills [11], and Moreno and Olmeda [12].

In this paper, we focus on the statistical properties of ensembles and specifics of fluctuations of Chinese stock prices by Zipf analysis, since many types of data studied in the physical and social sciences can be approximated by a Zipf distribution, one of a family of related discrete power-law probability distributions (e.g., [3, 4, 13-17]). First, we select the daily closing prices of the SSE Composite Index (SSE), Shenzhen Component Index (SZSE), Dow Jones, CAC 40, Nasdaq, FTSE 100, Hang Seng, and DAX during the years 1991-2010. The (m, k)-Zipf distributions of these price returns are studied and plotted, and the Zipf plots for these indices are compared. Note that the SSE Composite Index is one of the most important stock indices in Chinese stock markets and plays an important role in Chinese financial markets (e.g., [6, 18]). Second, we consider a financial price model for different time scales. In this model, the investors' psychological expectation of the market returns is taken into account. By Zipf analysis, we investigate the statistical behavior of fluctuations of the market returns, and the corresponding empirical research is carried out for the SSE Composite Index. The observed SSE Composite Index data cover the 10-year period from 2000 to 2009.

(m, k)-Zipf Analysis for Ensembles and Specifics of Global Stock Markets
In recent years, Zipf's method has been widely applied to literature, stock markets, computers, networks, management, oil, and many other fields. In the present paper, the Zipf plot is applied to study the statistical properties of Chinese stock markets. The technique, known as a Zipf plot, is a plot of the log of the rank against the log of the variable being analyzed; it was first introduced by George Kingsley Zipf in order to study the statistical occurrences of words in different languages. Let $x_1, x_2, \dots, x_n$ denote a set of $n$ observations on a random variable $x$ whose cumulative distribution function is $F(x)$, and assume that the observations are ordered from the largest to the smallest so that the index $i$ is the rank of $x_i$. The Zipf plot of the sample is the graph of $\ln x_i$ against $\ln i$. From the ranking,

$$\frac{i}{n} = 1 - F(x_i), \quad i = 1, \dots, n, \qquad \text{and} \qquad \ln i = \ln\left[1 - F(x_i)\right] + \ln n. \tag{2.1}$$

Thus, the log of the rank is simply a transformation of the cumulative distribution function. In studying English word occurrence frequency, Zipf's law reveals that while only a few words are used very often, many or most are used rarely. If the words are arranged in descending order of frequency, the frequency of occurrence of each word is inversely related to its rank, that is, $P_i = c\, i^{-\beta}$. Taking logarithms converts this equation into $\ln P_i = \ln c - \beta \ln i$, where $P_i$ is the frequency of the word whose rank is $i$, and $c$ is some positive constant. Plotting $\ln P_i$ against $\ln i$ gives a graph that is close to a line with slope $-\beta$.
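The straight-line relation $\ln P_i = \ln c - \beta \ln i$ suggests estimating $\beta$ by a least-squares line through the log-log rank plot. The following is a minimal sketch under that assumption; the function name `zipf_fit` and the synthetic sample are illustrative and not taken from the paper, and ordinary least squares is only one possible fitting choice.

```python
import math

def zipf_fit(values):
    """Least-squares fit of ln x_i = alpha - beta * ln i, with rank 1 the largest value.

    Returns (alpha, beta). Non-positive values are dropped because their
    logarithm is undefined.
    """
    ranked = sorted((v for v in values if v > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive values")
    xs = [math.log(i) for i in range(1, len(ranked) + 1)]  # ln(rank)
    ys = [math.log(v) for v in ranked]                     # ln(value)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, -slope

# Synthetic check: an exact power law x_i = 5 * i**(-0.8) should give beta close to 0.8.
sample = [5.0 * i ** -0.8 for i in range(1, 201)]
alpha, beta = zipf_fit(sample)
```

On real data only the lowest ranks tend to fall on a line, so in practice one would probably restrict the fit to a chosen rank range.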
Recently, Zipf analysis has appeared as a way of quantifying time series correlations. The technique is based on translating a given time series into a sequence of symbols and counting the frequency of each word, that is, each pattern of consecutive symbols. Ranking these words by frequency from most common to least common and plotting the logarithm of frequency against the logarithm of rank gives a Zipf plot. For the lowest ranks, the plotted points appear to fall along a line. The gradient of the best line fit corresponds to the Zipf exponent $\beta$, which characterizes correlations in the time series. A conjecture was proposed by Czirok et al. relating the exponent $\alpha$ in DFA to the exponent $\beta$ in Zipf analysis: $\beta = |2\alpha - 1|$.

Next we carry out empirical research on global stock markets by Zipf plot. We select the daily closing price data of the SSE composite index, SZSE composite index, Dow Jones, CAC 40, NASDAQ, FTSE 100, Hang Seng, and DAX, where the numbers in brackets are the corresponding registered stock tickers. The observed data run from April 1, 1991 to April 30, 2010. A time series can be interpreted as a sequence of characters. The (m, k)-Zipf analysis is a Zipf analysis based on an alphabet sequence, where $m$ is the length of each subsequence and $k$ is the number of different characters in the alphabet. For simplicity, we consider two characters in the alphabet, namely, $u$ and $d$. A positive change in the time series is replaced by $u$ in the sequence, and $d$ is used in the other cases. Thereafter, coarse graining may be applied to the sequence in several steps to reduce undesirable effects of short-range correlation. The simple rule $\{uu \to u,\ ud \to d,\ du \to u,\ dd \to d\}$ shows one step of the coarse graining. In the final step, the frequency of words with exactly $m$ characters is counted. Relative frequencies are then computed by rule (2.2), where $f_{udu\cdots u}$ is the frequency of the sequence $udu\cdots u$, and $f_u$ and $f_d$ are the frequencies of the characters "u" and "d", respectively. According to the theory of mathematical finance [19, 20], the logarithmic stock price change is

$$R(t) = \ln S(t+1) - \ln S(t), \tag{2.3}$$

where $S(t)$ ($t = 1, 2, \dots, T$) denotes the closing stock price of the $t$th trading day.

For the selected data, the ensembles of market returns for (m, 2)-Zipf and (m, 3)-Zipf are plotted in Figures 1 and 2, respectively, and the corresponding statistics of returns are given in Tables 1 and 2, respectively. Figure 1 and Table 1 give the values of the Zipf exponent $\beta$ for global stock markets when $k = 2$; Figure 2 and Table 2 give the values of $\beta$ when $k = 3$. Figures 1 and 2 display the fluctuations of the stock prices against their ranks in double logarithmic scale and show that the Zipf plots of stock prices follow power-law distributions. The trends of all the curves in Figures 1 and 2

Modeling a Dynamic Stock Price and the Corresponding Zipf Analysis
In order to study the fluctuation of crude oil price changes, a crude oil price model with an expected return has been introduced (see [21, 22]). In this section, we consider an extended financial price model with a random environment, and the corresponding Zipf analysis is made to investigate the probability distributions, the absolute frequencies, and the relative frequencies of the financial model for different time scales.

The Financial Price Model with Different Expected Returns and Different Time Scales
Let $S(t)$ ($t = 1, 2, \dots, n$) denote the time series of daily closing stock prices; the corresponding formula of logarithmic price changes was given in (2.3) of Section 2. Next we consider larger time scales for the market returns. Let $\tau$ be a given integer time scale; the $\tau$-step logarithmic price change in a stock market is defined by

$$R_\tau(t) = \ln S(t+\tau) - \ln S(t), \tag{3.1}$$

where $t = 1, 2, \dots, n - \tau$. Next we consider a new time series with a random environment, derived from the original $\tau$-step logarithmic price changes of the model. Let $\theta$ be a nonnegative random variable on a probability space $\Omega$ with probability distribution $F_\theta(x)$, called the random threshold of the model. For example, $\theta$ can be uniformly distributed on the interval $(0, 2)$, or $\theta$ can be the random variable $|\xi|$, where $\xi$ follows a normal distribution, and so forth. The new time series $y_\tau(t, \theta)$ derived from the original stock prices is then given by (3.2), where $u$, $s$, and $d$ denote "price-up", "price-stable", and "price-down", respectively. In this model, the random threshold $\theta$ represents the expected returns of the market investors.
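The construction of the symbol series from the $\tau$-step logarithmic price changes can be sketched as follows. The classification rule used here (`u` if the return exceeds $\theta$, `d` if it falls below $-\theta$, `s` otherwise) is an assumption about the form of the threshold rule, and the function names and sample prices are hypothetical.

```python
import math

def tau_returns(prices, tau):
    """tau-step log returns: ln S(t + tau) - ln S(t), one value per valid t."""
    return [math.log(prices[t + tau]) - math.log(prices[t])
            for t in range(len(prices) - tau)]

def classify(returns, theta):
    """Map returns to symbols under an ASSUMED threshold rule:
    'u' if R > theta, 'd' if R < -theta, otherwise 's'."""
    symbols = []
    for r in returns:
        if r > theta:
            symbols.append("u")
        elif r < -theta:
            symbols.append("d")
        else:
            symbols.append("s")
    return symbols

# Illustrative prices (not real index data) with a fixed threshold of 1%.
prices = [100.0, 102.0, 101.0, 101.5, 99.0, 103.0]
one_step = tau_returns(prices, 1)
symbols = classify(one_step, theta=0.01)
```

For a random threshold, $\theta$ could be drawn before classification, for instance with `random.uniform(0, 2)` or `abs(random.gauss(0, 1))`; whether one draw is used per series or per time step is a modelling choice this sketch leaves open.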
In a real stock market, various kinds of information affect the investing positions of the market participants, including buying positions, selling positions, and nonacting positions. In the current Chinese stock markets, trading rules and management systems are changing rapidly, for example, the daily price change limit (now 10%), the shareholding reform, direct investment in Hong Kong stock markets, the establishment of financial derivatives such as futures and options, and so forth. In addition, the estimate of a stock's price, positive or negative news, trends, historical data, the present financial situation, future information, political events, economic policy, and so forth may affect the investing positions of the market participants. This implies that the investing environment changes as time goes on. Considering all of these, we introduce a random environment into the financial model as above. Then, for different parameters $\tau$ and $\theta$, the fluctuation behaviors (absolute frequencies and relative frequencies) of the time series $y_\tau(t, \theta)$ ($t = 1, 2, \dots, n - \tau$) will be studied. Let $n_u(\tau, \theta)$, $n_s(\tau, \theta)$, and $n_d(\tau, \theta)$ denote the numbers of occurrences of price-up, price-stable, and price-down, respectively. The corresponding absolute frequencies of the time series $y_\tau(t, \theta)$ for these cases are given by (3.3), where $n_u(\tau, \theta) + n_s(\tau, \theta) + n_d(\tau, \theta) = n - \tau$ and $F_\theta(x) = P(\theta \le x)$. In financial markets, large fluctuations of daily price changes usually occur with small probability. In view of this property, the frequencies of occurrences defined above depend on the probability distribution $F_\theta(x)$. Next, the corresponding relative frequencies of the time series $y_\tau(t, \theta)$ are given by (3.4).

In the above definitions of the relative frequencies, we omit the occurrences of price-stable; thus $g_u(\tau, \theta)$ and $g_d(\tau, \theta)$ measure the total occurrences of price rising and price falling, respectively. In the following, we consider the statistical properties of the absolute frequencies and the relative frequencies for various values of the two parameters $\tau$ and $\theta$.
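To make the two kinds of frequency concrete, the sketch below counts the symbols of a series and normalizes them in two ways. The specific forms are assumptions chosen to match the description in the text: the absolute frequencies divide each count by $n - \tau$ (the total number of symbols), and the relative frequencies divide the up and down counts by $n_u + n_d$, leaving out the price-stable occurrences. The function names are hypothetical.

```python
from collections import Counter

def absolute_frequencies(symbols):
    """ASSUMED form: f_x = n_x / (n_u + n_s + n_d) for x in {u, s, d}."""
    counts = Counter(symbols)
    total = counts["u"] + counts["s"] + counts["d"]
    return {x: counts[x] / total for x in ("u", "s", "d")}

def relative_frequencies(symbols):
    """ASSUMED form: g_x = n_x / (n_u + n_d) for x in {u, d}; 's' is omitted."""
    counts = Counter(symbols)
    moved = counts["u"] + counts["d"]
    if moved == 0:
        # No up or down moves at all: the ratio is undefined.
        return {"u": float("nan"), "d": float("nan")}
    return {x: counts[x] / moved for x in ("u", "d")}

# Illustrative symbol series with two 'u', two 's', and one 'd'.
symbols = list("ussdu")
f = absolute_frequencies(symbols)
g = relative_frequencies(symbols)
```

With these forms, $g_u + g_d = 1$, so a value of $g_u$ near 0.5 means that up and down moves beyond the threshold are about equally frequent.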

The Empirical Research of the Model for SSE Composite Index
In this subsection, using the daily closing prices of the SSE composite index from January 4, 2000 to November 20, 2009, we carry out empirical research on the absolute frequency and the relative frequency of the model. We compute numerically the absolute frequencies and relative frequencies for different values of $\tau$ and $\theta$, and the corresponding plots are shown in Figure 3. In Figure 3, the horizontal axis indicates the random expected return $\theta$, and the vertical axis indicates the absolute frequencies of the time series $y_\tau(t, \theta)$. Figures 3(a) and 3(b) show that the price-up function $f_u(\tau, \theta)$ and the price-down function $f_d(\tau, \theta)$ are decreasing in $\theta$. For two time scales $\tau_1 > \tau_2$, the curve of $f_u(\tau_1, \theta)$ lies above the curve of $f_u(\tau_2, \theta)$, and a similar result holds for the price-down function $f_d(\tau, \theta)$. For the absolute frequency of price-stable $f_s(\tau, \theta)$, Figure 3(c) displays the opposite trend, that is, $f_s(\tau, \theta)$ increases as $\theta$ increases. Figure 3 shows that, for a given step $\tau$, the absolute frequencies reach their inflection point as $\theta$ increases. In addition, as $\tau$ increases, the value of $\theta$ at which the inflection of the absolute frequencies occurs becomes larger. Moreover, in Figure 3, for $\tau = 120, 250$, the horizontal-axis values of the inflection points are significantly larger than those for $\tau = 1, 5, 20, 60$. Table 3 gives the values of the inflection points for the absolute frequencies and the relative frequencies. For $\tau = 250$, the inflection points of the absolute frequencies of price-up and price-stable do not appear for $\theta \in (0, 2)$. Table 3 also shows that, for any time interval $\tau$, the values of the inflection points for the five kinds of functions are divided into two groups.

Figure 4 shows the distributions of the relative frequencies for different time scales and different expected returns. When $\theta \in (0, 0.1)$, the relative frequencies are approximately equal to 0.5; when $\theta$ ($\theta > 0.1$) becomes larger, the relative frequencies depart from the value 0.5 rapidly. Note that the daily price fluctuation is limited in Chinese stock markets, that is, the daily returns (i.e., $\tau = 1$) of stock prices and stock indices are limited to between −10% and 10%. This means that $\theta \in (0, 0.1)$ in the present paper. Since the relative frequencies are near 0.5 for $\theta \in (0, 0.1)$, we conclude that $\theta \in (0, 0.1)$ is a low-risk expected return. If a market participant hopes to obtain a return larger than 0.1, he will face a high investing risk.

The Zipf Distributions of Market Returns for Nonrandom Expected Return θ
In this subsection, we study the absolute frequencies of market returns for nonrandom expected return $\theta$. Suppose that $\theta \in (0, 3]$ with $\theta = 0.01k$, $k = 1, \dots, 300$; we then consider the Zipf distributions of the absolute frequencies of price-up $f_u(\tau, \theta)$ and price-down $f_d(\tau, \theta)$. Further, for different time intervals $\tau$, we compute and plot the corresponding Zipf distributions. Figure 5 exhibits the power-law distributions of the absolute frequencies, and we use a fitting function similar to (2.4) in Section 2. The corresponding Zipf exponent $\beta$ is estimated in Table 4. According to Table 4, the slopes for the six time scales are less than −1 except when $\tau = 250$. Since one year contains about 250 trading days, this implies that, if the time interval $\tau$ is one year or longer, the volatilities of the stock prices expected to rise will be much less than those for the short-term cases $\tau = 1, 5, 20, 60, 120$. Accordingly, we define $\tau = 1, 5, 20, 60, 120$ as short-term time intervals and $\tau \ge 250$ as long-term time intervals. We also notice that, within the short-term time intervals, the investment risk grows as $\tau$ increases, because the standard deviation grows gradually. In Figure 5, the log-log plots of the Zipf distributions of the absolute frequencies for the SSE composite index are shown for $\tau = 1, 5, 20, 60, 120, 250$ and for the price-up and price-down cases.
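The grid procedure described above (evaluate the price-up frequency at $\theta = 0.01k$, rank the values, and plot them on log-log axes) can be sketched as follows. This is an illustration under stated assumptions: the price-up frequency is taken to be the share of $\tau$-step log returns exceeding $\theta$, zero frequencies are dropped before taking logarithms, and a synthetic geometric random walk stands in for the index series. None of the function names or parameter values come from the paper.

```python
import math
import random

def up_frequency(prices, tau, theta):
    """ASSUMED f_u(tau, theta): share of tau-step log returns exceeding theta."""
    rets = [math.log(prices[t + tau] / prices[t]) for t in range(len(prices) - tau)]
    return sum(r > theta for r in rets) / len(rets)

def zipf_points(prices, tau, thetas):
    """(ln rank, ln f_u) pairs for the positive frequencies, ranked largest first."""
    freqs = sorted(
        (f for f in (up_frequency(prices, tau, th) for th in thetas) if f > 0),
        reverse=True,
    )
    return [(math.log(i), math.log(f)) for i, f in enumerate(freqs, start=1)]

# Synthetic geometric random walk standing in for an index series (not real data).
rng = random.Random(0)
prices = [100.0]
for _ in range(2000):
    prices.append(prices[-1] * math.exp(rng.gauss(0.0, 0.02)))

# Threshold grid theta = 0.01 * k for k = 1, ..., 300, as in the text.
thetas = [0.01 * k for k in range(1, 301)]
points = zipf_points(prices, tau=20, thetas=thetas)
```

A straight-line fit through `points` would then give an estimate of the slope; the same steps apply to the price-down case with the inequality reversed.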

The Zipf Distributions of Market Returns for Different Time Scales
In definition (3.1) of Section 3.1, the $\tau$-step logarithmic price change in a stock market is defined by $R_\tau(t) = \ln S(t+\tau) - \ln S(t)$ for $t = 1, 2, \dots$. Using the daily closing prices of the SSE composite index from January 4, 2000 to November 20, 2009, we investigate the Zipf distributions of the returns of the SSE composite index for different time scales $\tau$. The Zipf distributions of the SSE index returns for different time scales are shown in Figure 6. Further, by the fitting technique introduced in (2.4) of Section 2, we obtain the Zipf exponents given in Table 5. This empirical research shows that, for the short-term time intervals $\tau = 1, 5, 20, 60, 120$, the slopes of the five curves are approximately equal to −1, whereas for the long-term time interval $\tau = 250$, $\beta$ ($\beta = 1.1883$) is much larger than 1 in comparison with the short-term cases. This implies that, for the SSE composite index, the Zipf distribution of returns with the long-term time interval $\tau = 250$ deviates to some extent from those of returns with the short-term time intervals.

In fact, the features of Figure 6 and Table 5 conform to the current situation of Chinese stock markets. Chinese stock markets are developing financial markets with a short history. After more than 20 years of reform and opening of China's capital market, the financial market now plays an important role in the national economy. Chinese securities markets attract investors from all over the world; in particular, there are many retail investors in the Chinese stock market, who often focus on whether they can get a high return in a short time rather than on long-term returns. In China, large fluctuations of stock markets are often caused by money flow on a global scale, and large fluctuations of stock prices are accompanied by large trading volumes. The "herd behavior" of investors is more pronounced in the Chinese financial market. These observations support our empirical result, that is, the Zipf distribution of returns with $\tau = 250$ differs somewhat from those of returns with $\tau = 1, 5, 20, 60, 120$.

Conclusion
In the present paper, we analyze the discrete Zipf distributions of market returns for Chinese stock markets, especially for the SSE composite index. We find that the Zipf exponents of the return distributions are around 1 in Chinese stock markets. Further, a dynamic stock price model with a random environment is analyzed, and the statistical behaviors of the model are investigated and displayed. We also conclude that the expected return $\theta \in (0, 1)$ is most reasonable for Chinese stock markets, and that the volatility of stock prices is large for short-term investment in comparison with long-term investment. Finally, for the SSE composite index, the empirical research shows that the Zipf distributions of returns on different time scales are very similar, and the corresponding exponents $\beta$ of the fitting functions are about 1.

Mathematical Problems in Engineering
Figure 1: The log-log (m, 2)-Zipf plots of global stock markets. The observed data is from April 1, 1991 to April 30, 2010.

Figure 2: The log-log (m, 3)-Zipf plots of global stock markets. The observed data is from April 1, 1991 to April 30, 2010.

Figure 3: The evolution trends of absolute frequencies of SSE composite index for the random variable θ. The observed data of the daily closing price for SSE composite index is from January 4, 2000 to November 20, 2009.

Figure 4: The evolution trends of relative frequencies of SSE composite index for the random variable θ. The observed data of the daily closing price for SSE composite index is from January 4, 2000 to November 20, 2009.

Figure 5: The log-log plots of Zipf distributions of the absolute frequencies for SSE composite index from January 4, 2000 to November 20, 2009.

Figure 6: The Zipf plot of returns of SSE composite index for different time scales τ.

Table 1: The Zipf exponents of global stock markets when k = 2.

Table 2: The Zipf exponents of global stock markets when k = 3.

also show similar fluctuation behavior; we find that, as $m$ increases, the exponents $\beta$ of all of these stock indices also increase, and the corresponding fitting function is

$$\ln R = \alpha - \beta \ln i, \tag{2.4}$$

where $R$ denotes the market return, $i$ is its rank, and $\beta$ is the Zipf exponent to be estimated. According to Table 1, when $k = 2$, the values of $\beta$ for SSE and SZSE are greater than those of the other markets; this implies that Chinese stock markets fluctuate more violently than the world's other major stock markets. The fluctuation behaviors of global stock markets for $k = 3$ can be discussed similarly.
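The (m, k)-Zipf procedure of Section 2 for the two-character alphabet ($k = 2$) can be sketched as below. Several details are assumptions rather than the paper's stated choices: the coarse-graining rule is applied to non-overlapping pairs, words are counted with overlapping windows, and the ranked quantity is each word's share of all words, so the normalization by $f_u$ and $f_d$ of rule (2.2) is left out. All function names and the sample prices are hypothetical.

```python
from collections import Counter

def symbolize(prices):
    """Map each price change to 'u' (positive change) or 'd' (all other cases)."""
    return "".join("u" if b > a else "d" for a, b in zip(prices, prices[1:]))

def coarse_grain(seq):
    """One coarse-graining step with the rule uu->u, ud->d, du->u, dd->d.

    ASSUMPTION: the rule is applied to non-overlapping pairs, and a trailing
    unpaired symbol is dropped.
    """
    rule = {"uu": "u", "ud": "d", "du": "u", "dd": "d"}
    return "".join(rule[seq[i:i + 2]] for i in range(0, len(seq) - 1, 2))

def word_counts(seq, m):
    """Count words of exactly m symbols. ASSUMPTION: overlapping (sliding) windows."""
    return Counter(seq[i:i + m] for i in range(len(seq) - m + 1))

def ranked_shares(counts):
    """(word, share of all words) pairs, ordered from most to least common."""
    total = sum(counts.values())
    return [(word, c / total) for word, c in counts.most_common()]

# Illustrative prices (not real index data).
prices = [10.0, 11.0, 10.5, 10.7, 10.9, 10.8, 11.2, 11.1, 11.3]
seq = symbolize(prices)
coarse = coarse_grain(seq)
counts = word_counts(seq, 2)
ranked = ranked_shares(counts)
```

Plotting the logarithm of each share in `ranked` against the logarithm of its rank gives the Zipf plot, and the slope of a fitted line gives an estimate of $-\beta$ in the sense of (2.4).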

Table 3: The values of the inflection points.

Table 4: The statistics of Zipf distributions for the absolute frequencies.

Table 5: Zipf exponent of SSE return for different time scales.