Stock Market Trading Rules Discovery Based on Biclustering Method

Predicting the stock market's trend has long been a challenging task because it is affected by a variety of deterministic and stochastic factors. In this paper, a biclustering algorithm is introduced to find local patterns in the quantized historical data. The local patterns obtained are regarded as trading rules. The trading rules are then applied to the short-term prediction of the stock price, combined with the minimum-error-rate classification of Bayes decision theory under the assumption of a multivariate normal probability model. In addition, this paper borrows the idea of stream mining to weaken the impact of historical data on the model and to update the trading rules dynamically. Experiments on real datasets support the effectiveness of the proposed algorithm.


Introduction
The trend forecasting of the stock market has been a hot research field for a long time. However, it is influenced by many factors such as political events, general economic conditions, and traders' expectations, which makes stock market trend prediction a challenging task.
Fundamental analysis is one of the main methods of stock market analysis. It is based on the macro economy and the basic information of companies, including profitability, industry prospects, and liabilities. Investors need to consider all of these factors when they buy or sell a stock.
Technical analysis is another method of stock market analysis. It summarizes the typical rules in the market and forecasts the future trend by analyzing the historical price and trading volume of stocks. According to the efficient market hypothesis of the 1960s and 1970s [1,2], investors can quickly and effectively utilize the potential information in buying and selling stocks, which means all the factors affecting the stock price are already reflected in the price. Under this hypothesis, buy-and-hold (i.e., random selection) is the optimal strategy, and technical analysis of stocks is invalid. Subsequent research, however, has reached a different conclusion, and a number of technical analysis methods have emerged, ranging from traditional time series approaches to artificial intelligence techniques.
Because the stock price is a special kind of time series data, the traditional technical methods for predicting it are mainly time series analyses based on statistical models, such as autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) [3,4]. However, due to the high noise and the nonlinearity of the stock market, those methods are not satisfactory. Therefore, a variety of advanced time series methods have been proposed to predict stock time series data.
Allen and Karjalainen [5] generate trading rules with a genetic algorithm (GA), which uses arithmetic and logical functions to combine the basic technical indicators.
Potvin et al. [6] generate short-term trading rules for the stocks of 14 Canadian companies by genetic programming, an extension of GA, on the basis of historical prices and transaction volumes. When the market falls or is stable, the trading rules are generally useful, whereas when the market is rising, the trading rules do not provide any improvement over the buy-and-hold (BAH) approach.
An improved bacterial chemotaxis optimization (IBCO), integrated into the back propagation (BP) artificial neural network, is established by Zhang and Wu [7] as an effective model for predicting different kinds of stock price indices.
Chang and Liu [8] developed a Takagi-Sugeno-Kang (TSK) type fuzzy rule based system for stock prediction. Their TSK fuzzy model takes the technical indices as input variables, and the output is a linear combination of the input variables.
A trading system based on multiobjective particle swarm optimization was proposed by Briza and Naval [9]. Using the trading signals from a set of technical indicators, the system develops a trading rule that is optimized for two objective functions: Sharpe ratio and percent profit.
An intelligent hybrid stock trading system that integrates neural networks, fuzzy logic, and genetic algorithms was designed by Chavarnakul and Enke [10]. The rules are generated from a single technical indicator, the volume adjusted moving average (VAMA). The system combines the advantages of the different methods, allowing investors to make better stock trading decisions.
Leigh et al. [11] implement a recognizer for two variations of the "bull flag" technical charting heuristic. They use this recognizer to find trading rules and demonstrate its validity.
Although a reliable basis for technical analysis has not been established, many investors and market analysts use technical indicators to analyze the stock price. Most technical analysis methods generate trading rules based on one or a few predefined technical indicators. However, a single technical indicator or a fixed combination of technical indicators is not always effective. As stock prices are affected by a variety of deterministic and stochastic factors, the stock price model should not stay static. No single technical indicator can be used to construct all the models. Therefore, when the model of the stock price changes, we should choose different technical indicators to build different models.
In order to solve the above problem, we use a data mining method, the biclustering technique, to find local patterns in the historical data, where different patterns contain a subset of technical indicators with different periodic parameters [12]. The patterns found are regarded as trading rules, which are applied to the short-term prediction of the stock price, combined with the minimum-error-rate classification of Bayes decision theory under the assumption of a multivariate normal probability model. In addition, this paper also makes use of the idea of stream mining to update the trading rules dynamically and weaken the impact of historical data on the model.
The main contributions of this paper are as follows: (1) propose a new biclustering algorithm to find the trading rules in the quantized data; (2) provide guidance for stock trading by combining the generated trading rules with the minimum-error-rate Bayesian decision method under the assumption of a multivariate normal probability model; (3) update the trading rules dynamically and weaken the impact of historical data on the model based on the idea of stream mining; (4) validate the performance of the algorithm using three kinds of historical time series data from stocks and compare it with several classical technical analysis methods, including buy-and-hold, genetic programming [6], and intelligent hybrid system [10].
The rest of the paper is structured as follows: Section 2 introduces the bicluster model; Section 3 explains our algorithm in detail; Section 4 presents the experimental method; Section 5 gives the experimental results; Section 6 concludes the paper and gives the main direction of future work.

Bicluster Model
Clustering is one of the most important methods in machine learning, data mining, and pattern recognition. It has been widely used in various fields, such as gene expression analysis, recommendation systems, and financial analysis. The traditional clustering methods, such as hierarchical clustering [13], self-organizing maps [14], and k-means clustering [15], cluster either the rows or the columns of a matrix. They cannot find local coherent patterns that involve subsets of both rows and columns. Generally, such local patterns are more beneficial for mining implicit information. For instance, a trader may care about a small number of technical indicators that provide the most useful predictions for making trading decisions. Thus, finding a subset of useful indicators that behave similarly at turning points is important to the analysis of the stock market. To solve this problem, we need biclustering methods, a special branch of clustering algorithms that cluster the rows and the columns of a matrix simultaneously.
Cheng and Church [16] first introduced the term biclustering in gene expression data analysis. After that, a series of biclustering algorithms were proposed, such as FLOC [17], OPSM [18], and the Plaid model [19].
Given an $N_R \times N_C$ data matrix, a bicluster is defined as a submatrix $(R, C)$, where $R \subseteq \{1, \ldots, N_R\}$ is a subset of rows and $C \subseteq \{1, \ldots, N_C\}$ is a subset of columns of the original matrix. The rows in $R$ share a certain kind of pattern across the columns in $C$.
Madeira and Oliveira [20] divide biclusters into four types: (1) biclusters with constant values; (2) constant values in rows or columns; (3) coherent values; and (4) coherent evolutions. Figure 1 illustrates ten typical examples of these four types of biclusters. Figures 1(a)-1(e) represent the first three types, which are defined on the numeric values in the data matrix. The biclusters with coherent evolution are represented by Figures 1(f)-1(j). Each kind of bicluster has its own characteristics and is suitable for different data and fields. In this paper, the biclusters with constant values in columns are chosen.
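To make the chosen bicluster type concrete, the following minimal Python sketch checks whether a row subset and a column subset form a bicluster with constant values in columns. The function name and the toy matrix are illustrative assumptions, not part of the original method.

```python
def is_constant_column_bicluster(matrix, rows, cols):
    """Return True if each selected column holds a single value
    across all selected rows (constant values in columns)."""
    for c in cols:
        values = {matrix[r][c] for r in rows}
        if len(values) > 1:
            return False
    return True


# Toy quantized matrix (hypothetical values): rows are trading days,
# columns are indicators.
data = [
    [1, 5, 2, 7],
    [0, 5, 9, 7],
    [1, 5, 3, 7],
    [1, 4, 2, 6],
]

print(is_constant_column_bicluster(data, rows=[0, 1, 2], cols=[1, 3]))  # True
print(is_constant_column_bicluster(data, rows=[0, 1, 2], cols=[0, 1]))  # False
```

Rows 0 to 2 agree on columns 1 and 3 but not on column 0, so only the first selection qualifies.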

Method
Technical analysis aims at predicting the price trend with rules derived from the study of historical market data. These methods are based on figures or models that can be described by mathematical formulas taking historical data as input and producing the trading indicator as output. The rules found can help investors make better decisions in the markets. Technical analysis is based on the assumption that there exist consistent, time-invariant behavior patterns associated with the stock price that will recur in the future; thus these patterns can be used for predictive purposes. As mentioned above, a bicluster is a submatrix that can be regarded as a local coherent pattern. Biclustering methods can find the coherent patterns in the stock data, which we use to generate the trading rules.
In this paper, we use the historical data of a stock or a comprehensive financial index to build the data matrix, where the rows are the trading days and the columns are the technical indicators and the future return. Different combinations of the technical indicators mean different trading rules. Some of these combinations appear on many trading days in the training set. As Twain's often-quoted saying goes, "history does not repeat itself, but it does often rhyme": the patterns corresponding to those combinations are expected to occur again in the future and can be regarded as trading signals that imply a rise or fall of the stock, so we can refer to the signals when deciding to buy or to sell. In other words, when some technical indicators fall into a certain range, this can be regarded as a signal of a change in the stock. It is meaningful that a technical indicator shows approximately the same value on different days. Therefore, our method mines biclusters with constant values in columns of the data matrix. Considering that local information and manifold information are both important for data clustering, the Biclustering-Based Intelligent System (BIS) is proposed to mine the biclusters with constant values in columns, inspired by the method of [21].
The procedure is as follows: (1) cluster each column of the data matrix with the k-means algorithm; (2) mine the biclusters with constant values in columns of the matrix. Only those biclusters that contain both technical indicators and the future return are taken into consideration. The detected biclusters are the stock trading rules. When future trading data matches a trading rule, a buy or sell signal is issued to support the corresponding trading decision. In addition, in order to follow the trend of the current stock price and to appropriately weaken the influence of historical data on the prediction, a method based on support is introduced to update the trading rules during the testing process. The biclustering method can find locally consistent patterns in time series data, which accords with the characteristics of the stock data matrix. Although the method involves no complex steps, it achieves good results; the steps are stated as follows.

Data Preprocessing.
The trading historical data is organized into a matrix, as shown in Figure 2. The first column holds the return of the $i$th trading day over one week, calculated as

$$\mathrm{return}(i) = \frac{\mathrm{cpw}(i) - \mathrm{cp}(i)}{\mathrm{cp}(i)}, \tag{1}$$

where $\mathrm{cp}(i)$ is the closing price of the $i$th trading day and $\mathrm{cpw}(i)$ is the closing price after one week. The 2nd-8th columns contain the moving average (MA) with 7 different time parameters. MA analyzes data points by creating a series of averages of different subsets of the full dataset and is calculated as

$$\mathrm{MA}_n(i) = \frac{1}{n}\sum_{k=0}^{n-1} \mathrm{cp}(i-k), \tag{2}$$

where $\mathrm{cp}(i-k)$ is the closing price of the $(i-k)$th trading day.
The 9th-13th columns contain the relative strength index (RSI) with different time parameters. The RSI is intended to chart the current and historical strength or weakness of a stock or market based on the closing prices of a recent trading period. It is classified as a momentum oscillator, measuring the velocity and magnitude of directional price movements. Its formula involves $\mathrm{op}(i)$, the opening price of the $i$th trading day. The 14th-18th columns contain the Williams percentage range (%R) with different time parameters. %R shows the current closing price in relation to the highest and lowest prices of the past $n$ days; in its formula, $H_n$ is the highest price over the $n$ previous periods and $L_n$ is the lowest price over the $n$ previous periods. The 19th-23rd columns contain the rate of change (ROC) with different time parameters. ROC shows the difference in the closing price between today and $n$ days ago. Finally, the data matrix is normalized column by column:

$$B(:, j) = \frac{B_0(:, j) - \min(B_0(:, j))}{\max(B_0(:, j)) - \min(B_0(:, j))}, \tag{6}$$

where $B_0$ represents the data matrix before normalization, $B$ the matrix after normalization, and $\min(B_0(:, j))$ and $\max(B_0(:, j))$ are the minimum and maximum values of the $j$th column, respectively.
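As a rough illustration of the preprocessing, the sketch below computes a weekly return, a simple moving average, and a min-max column normalization in plain Python. The function names are hypothetical, and the exact forms used here (a relative return and an unweighted average over the last n closing prices) are assumptions based on the usual definitions of these quantities.

```python
def weekly_return(cp, cpw):
    """Relative return between today's close `cp` and the close one week later `cpw`
    (assumed form: (cpw - cp) / cp)."""
    return (cpw - cp) / cp


def moving_average(closes, i, n):
    """Simple n-day moving average of closing prices ending at day i (0-based)."""
    if i < n - 1:
        raise ValueError("not enough history for an n-day average")
    window = closes[i - n + 1 : i + 1]
    return sum(window) / n


def min_max_normalize(column):
    """Scale one indicator column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


closes = [10.0, 11.0, 12.0, 13.0, 14.0]   # hypothetical closing prices
print(moving_average(closes, i=4, n=3))    # average of 12, 13, 14
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
print(weekly_return(100.0, 103.0))
```

Min-max scaling puts indicators with very different ranges (prices, oscillators, percentages) on a common [0, 1] scale before they are clustered.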

Data Quantization.
It is believed that each indicator's values can be divided into several classes, and that data falling in the same class have a similar impact on the stock price even though they are not strictly equal. Therefore, the k-means algorithm is used to cluster the normalized data of each column into $k$ classes, and the data falling in the same cluster are quantized to a constant that reflects the center of the class.
In order to better identify the stock market's seesaw movements, we set a return threshold. If the value in the first column (the future return) is greater than the threshold, it is replaced by 1; if it is less than the negative of the threshold, it is replaced by −1; otherwise it is set to 0. Here 1 represents a rise of the stock, −1 represents a fall, and 0 represents that the stock price does not change significantly. The matrix before and after quantization is shown in Table 1, where the threshold is 0.01.
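The quantization step can be sketched as follows. This is a minimal one-dimensional k-means written in plain Python with a deterministic, evenly spaced initialization (an assumption made here for reproducibility; the paper does not state its initialization), followed by the three-way thresholding of the return column. All names are illustrative.

```python
def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalars; returns the list of k cluster centers."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(g) / len(g) if g else centers[c] for c, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def quantize_column(values, k):
    """Replace every value by the center of its nearest cluster."""
    centers = kmeans_1d(values, k)
    return [min(centers, key=lambda c: abs(v - c)) for v in values]


def quantize_return(r, threshold=0.01):
    """Map a future return to 1 (rise), -1 (fall), or 0 (no significant change)."""
    if r > threshold:
        return 1
    if r < -threshold:
        return -1
    return 0


column = [0.10, 0.12, 0.11, 0.90, 0.88, 0.50]  # hypothetical normalized indicator
print(quantize_column(column, k=3))
print([quantize_return(r) for r in (0.02, -0.02, 0.005)])  # [1, -1, 0]
```

Because every value in a cluster is replaced by the same center, days with similar indicator values become exactly equal after quantization, which is what makes the later search for constant columns possible.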

Training.
The training set is a part of the historical data of one stock. The BIS algorithm is run on it to find effective stock price prediction models, namely, the stock trading rules, whose validity is then verified in the testing phase. The Hang Seng index (HSI) is used here as an example; its data spans nearly a year from May 3, 2005, to April 28, 2006. Because the return values of the 52nd week can only be acquired in the week after April 28, 2006, there are only 51 weeks of complete data.
(1) Proposed Biclustering Method. As an efficient data mining tool, the biclustering technique is an extension of the traditional clustering methods that clusters rows and columns simultaneously to find the submatrices whose rows show the same pattern in the corresponding column sets. A submatrix represents a meaningful local pattern hidden in the mass of data.
The historical data of stock transactions usually forms a matrix where the rows correspond to the dates of the transactions and the columns correspond to the technical indicators. When some technical indicators of a stock fall in a specific range, the trader should decide to buy or sell; these conditions are the trading rules. Therefore the trading rules can be represented as turning points whose specific technical indicators fall in the same range, and the corresponding trading dates and technical indicators compose a matrix that accords with the characteristics of biclusters with constant values in columns. Our algorithm therefore searches for all such biclusters in the training matrix to obtain the stock trading rules and guide the stock trading.
In order to forecast the change of the stock price from the historical trading data, an effective trading rule must contain a return and several technical indicators. That is to say, a bicluster should contain two parts: a return (the 1st column) and the technical indicators. Based on these constraints, we propose a new algorithm to find the biclusters with constant values in columns whose support is beyond the support threshold, namely, the BIS algorithm mentioned before. It starts with each single column. Gradually the biclusters are merged into biclusters with constant values in two or more columns. The following is an example of the detailed procedure.
After data preprocessing, the original data matrix $B_0$ is transformed into the matrix $B$, as shown in Table 2; the support threshold is 2.
The process of the algorithm is described as follows.
Step 1. After the k-means clustering in the data preprocessing, the elements in each column have been grouped into several clusters, and the one-column biclusters $\mathrm{BIC}_j(i)$ ($j = 1, 2, \ldots, N_C$; $i = 1, 2, \ldots, k$) are obtained, where $\mathrm{BIC}_j(i)$ denotes the $i$th bicluster of the $j$th column, $N_C$ is the number of columns, and $k$ is the number of clusters per column. The one-column biclusters whose row numbers are beyond the row threshold are added to the bicluster set BIC Set, and these biclusters are regarded as the set of bicluster seeds (BS) for the next step. The results are shown in Table 3.
Step 2. In BS, the biclusters $\mathrm{BIC}_1(i_1)$ ($i_1 = 1, 2, \ldots, k$) from Step 1, that is, all the biclusters of the first column of $B$, are combined with the other biclusters $\mathrm{BIC}_j(i_2)$ ($j = 2, 3, \ldots, N_C$; $i_2 = 1, 2, \ldots, k$) to get the biclusters of two columns. The row sets of two merged biclusters are intersected and the column sets of two merged biclusters are joined. The results that do not satisfy the row threshold are deleted. Finally all the satisfying biclusters of two columns are added to the bicluster set BIC Set and regarded as the new BS set.
As described in Step 2, each pattern of the first column in Table 3 is merged with the patterns of the second column or of the other columns. The row sets of two merged patterns are intersected and the column sets are joined to get all the length-2 patterns that include the first column. Since only two patterns in each column meet the support in Table 3, at most four new patterns are obtained after each merging operation. The result is shown in Table 4.
Step 3. The biclusters in BS are merged pairwise to get the biclusters of three columns. The satisfying biclusters beyond the support are added to BIC Set. Because all the biclusters of two columns found in Step 2 include the first column, the merged biclusters also contain the first column. Thus each bicluster in BIC Set has its own return value. For example, in Table 4 the patterns in ⟨1, 2⟩ are merged with each pattern in ⟨1, 3⟩, ⟨1, 4⟩, and ⟨1, 5⟩, respectively. The corresponding row sets are intersected to get the support row sets of all the length-3 patterns containing the column set ⟨1, 2⟩. Then the patterns in column set ⟨1, 3⟩ are merged with the patterns in ⟨1, 4⟩ and ⟨1, 5⟩ to get all the length-3 patterns containing the column set ⟨1, 3⟩ and their corresponding row sets. The same operation is done with the patterns in ⟨1, 4⟩ and the patterns in ⟨1, 5⟩ to get all the length-3 patterns containing the column set ⟨1, 4⟩ and their corresponding row sets. The results obtained are shown in Table 5.
Step 4. Repeat Step 3 until BS is empty or the number of columns of the biclusters reaches the number of columns of the data matrix.

According to Step 4, the follow-up results are shown in Tables 6 and 7.
Step 5. Filter BIC Set; that is, discard the repeated biclusters and the biclusters whose column numbers are less than the column threshold.

The flow chart of the BIS algorithm is shown in Figure 4. In the BIS algorithm, each column (each index) is clustered by k-means clustering; thus we do not need to set a specific threshold to measure the similarity among the indicators of different trading days. After merging, the intersection of the row sets ensures that the obtained biclusters are biclusters with constant values in columns; that is, those trading days share the same patterns in the corresponding indicator sets. It is detailed by the pseudo code in Algorithm 1.
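Steps 1 to 5 can be sketched in Python as a level-wise search. This is an illustrative reading of the procedure, not the authors' Algorithm 1: the function and parameter names are hypothetical, column 0 plays the role of the return column, and "beyond the threshold" is interpreted here as "at least `min_rows` rows", which is an assumption.

```python
from itertools import combinations


def mine_biclusters(matrix, min_rows, min_cols):
    """Find constant-column biclusters that include column 0 (the return).

    A pattern is a tuple of (column, value) pairs sorted by column;
    the result maps each pattern to the frozenset of its supporting rows.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Step 1: one-column seeds, one per (column, quantized value).
    seeds = {}
    for c in range(n_cols):
        groups = {}
        for r in range(n_rows):
            groups.setdefault(matrix[r][c], set()).add(r)
        for value, rows in groups.items():
            if len(rows) >= min_rows:
                seeds[((c, value),)] = frozenset(rows)
    found = dict(seeds)

    # Step 2: merge every seed of column 0 with every seed of another column.
    level = {}
    for p0, rows0 in seeds.items():
        if p0[0][0] != 0:
            continue
        for p1, rows1 in seeds.items():
            if p1[0][0] == 0:
                continue
            rows = rows0 & rows1
            if len(rows) >= min_rows:
                level[p0 + p1] = rows

    # Steps 3-4: merge patterns pairwise until no new pattern survives.
    while level:
        found.update(level)
        next_level = {}
        for (pa, ra), (pb, rb) in combinations(level.items(), 2):
            merged = dict(pa)
            if any(merged.get(c, v) != v for c, v in pb):
                continue  # the two patterns disagree on a shared column
            merged.update(pb)
            if len(merged) != len(pa) + 1:
                continue  # only grow by exactly one column per level
            rows = ra & rb
            if len(rows) >= min_rows:
                # Dict keys remove repeated biclusters automatically.
                next_level[tuple(sorted(merged.items()))] = rows
        level = next_level

    # Step 5: keep patterns with enough columns that contain the return column.
    return {
        p: rows for p, rows in found.items()
        if len(p) >= min_cols and p[0][0] == 0
    }


# Toy quantized matrix (hypothetical): column 0 is the return, the rest are indicators.
matrix = [
    [1, 0.2, 0.8, 0.5],
    [1, 0.2, 0.8, 0.1],
    [-1, 0.6, 0.8, 0.5],
    [1, 0.2, 0.3, 0.5],
    [-1, 0.6, 0.8, 0.9],
]
rules = mine_biclusters(matrix, min_rows=2, min_cols=3)
for pattern, rows in sorted(rules.items()):
    print(pattern, sorted(rows))
```

Intersecting the row sets at each merge keeps only the trading days that agree on every column of the merged pattern, so support can only shrink as patterns grow; this is what lets the search stop once no pattern meets the row threshold.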
(2) Generate the Trading Rules. After all the biclusters with constant values in columns that satisfy the row support threshold and contain the first column have been obtained, the biclusters meeting the column threshold are selected as the effective prediction models, as shown in Table 8.
The summary information of each satisfying bicluster is stored, including its column set, the corresponding values, and the row support. A bicluster in Table 8 is saved as shown in Table 9.
Since each bicluster corresponds to a candidate prediction model, its summary information can be regarded as an effective trading rule.
If the future return of a trading rule is 1, which indicates a rising trend, it is a cue to buy; if it is −1, which indicates a falling trend, it is a signal to sell. If the return is 0, no action should be taken.
After all the trading rules have been generated from the satisfying biclusters, all the data in the matrix is discarded except the data of the last week of this year, which is kept together with the data of the second year for updating the trading rules dynamically.
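The summary stored for each bicluster, and the signal it implies, can be represented compactly. In the sketch below, the dictionary layout, the function names, and the sample pattern are illustrative assumptions; a pattern is taken to be a sequence of (column, value) pairs in which column 0 is the return.

```python
def make_rule(pattern, support_rows):
    """Summarize a bicluster as a trading rule: return value,
    indicator values by column, and row support."""
    values = dict(pattern)
    rule_return = values.pop(0)  # column 0 holds the quantized future return
    return {
        "return": rule_return,
        "indicators": values,
        "support": len(support_rows),
    }


def signal(rule):
    """Translate the rule's return value into a trading action."""
    return {1: "buy", -1: "sell", 0: "hold"}[rule["return"]]


# Hypothetical bicluster: return -1 with two indicator columns, seen on 3 days.
rule = make_rule(((0, -1), (1, 0.6), (2, 0.8)), support_rows={2, 4, 7})
print(rule)
print(signal(rule))  # sell
```

Only this summary needs to be kept once the rules are generated, which is why the training matrix itself can be discarded.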

Testing.
Another part of the historical data of the stock is used as the test data. Taking the Hang Seng index (HSI) as an example, the data from May 2nd, 2006, to April 30th, 2007, is used as the test data. For each trading day, the trend of the stock price over the following week is forecasted by the previously generated trading rules. In addition, the new data is used to update the trading rules and weaken the influence of the historical data, as in stream mining. Next we take the data of May 2nd, 2006, as an example to predict the trend of the stock price over the following week and to update the trading rules with the new data.
(1) Predict. In order to forecast the trend of the stock price after the trading day, the daily data of each column is first normalized; then the matching degree with each trading rule is computed. The index values of the trading rule are compared with the corresponding index values of this day according to formula (7). The smaller the distance is, the higher the matching degree between the trading day and the trading rule. In formula (7), ST is the set of technical indices of the trading rule, and the distance is computed from the number of technical indices of the rule, the rule's value for each index, and the trading day's value for the corresponding index.
The trading rules whose distance is less than the distance threshold form the reference trading rule set of the trading day, and the trading decision is made as follows.
If there is only one trading rule, or all the matching trading rules have the same return, then whether to buy or sell the stock is decided by the return of the trading rule. When the return is 1, it suggests a purchase signal on the next trading day; if the return is −1, it suggests a signal to sell. If the return is 0, no action should be taken.
If more than one trading rule satisfies the distance threshold and the rules do not share the same return value, choosing the best trading rule is treated as a classification problem in which each trading rule is considered a class. The technical indices of a trading rule are assumed to follow a multivariate normal distribution, so the minimum-error-rate discriminant function of rule $i$ is

$$g_i(x) = -\frac{1}{2}(x-\mu_i)^{T}\Sigma_i^{-1}(x-\mu_i) - \frac{d}{2}\ln 2\pi - \frac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i), \tag{8}$$

where $d$ is the number of technical indices of the trading rule, $x$ is the $d$-dimensional column vector of the trading day's index values corresponding to the specific trading rule, $\mu_i$ is the vector of index values of rule $i$, $\Sigma_i$ is the $d \times d$ covariance matrix, taken here to be the identity matrix, and $P(\omega_i)$ is the prior probability, calculated as

$$P(\omega_i) = \frac{\mathrm{support}(i)}{N},$$

where $\mathrm{support}(i)$ is the support of trading rule $i$ and $N$ is the total number of rows in the training data. We select the trading rule whose $g_i(x)$ is the maximum and buy or sell the stock according to its return value. An example is given to explain how the trading decision is made.
(1) Two biclusters are detected in the matrix, as shown in Tables 10 and 11.
(2) Generate the trading rules, as shown in Tables 12 and 13.
(3) The values of the set of indicators corresponding to the two trading rules for a specific trading day in the testing period are shown in Tables 14, 15, and 16.
According to (7), it is obvious that the set of values matches both of the two trading rules well.
According to (8), $g_1(x) > g_2(x)$, so rule 1 is the best trading rule, and we will sell the stock on the next trading day.
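The decision procedure above can be sketched in Python. Two points are assumptions rather than statements from the paper: the matching distance is taken to be the mean absolute difference over the rule's indices (the exact form of formula (7) is not reproduced here), and the discriminant is the standard minimum-error-rate form for a normal density with identity covariance and prior support/N. The rule layout, names, and numbers are hypothetical.

```python
import math


def distance(rule, day):
    """Assumed matching distance: mean absolute difference over the rule's indices."""
    cols = rule["indicators"]
    return sum(abs(day[c] - v) for c, v in cols.items()) / len(cols)


def discriminant(rule, day, total_rows):
    """g_i(x) for a normal density with identity covariance (ln|I| = 0)."""
    cols = rule["indicators"]
    d = len(cols)
    squared = sum((day[c] - v) ** 2 for c, v in cols.items())
    prior = rule["support"] / total_rows
    return -0.5 * squared - 0.5 * d * math.log(2 * math.pi) + math.log(prior)


def decide(rules, day, total_rows, max_distance):
    """Return 1 (buy), -1 (sell), or 0 (no action) for one trading day."""
    candidates = [r for r in rules if distance(r, day) < max_distance]
    if not candidates:
        return 0
    returns = {r["return"] for r in candidates}
    if len(returns) == 1:
        return returns.pop()
    best = max(candidates, key=lambda r: discriminant(r, day, total_rows))
    return best["return"]


# Two hypothetical rules that both match the day but disagree on the return.
rule_1 = {"return": -1, "indicators": {1: 0.6, 2: 0.8}, "support": 10}
rule_2 = {"return": 1, "indicators": {1: 0.6, 3: 0.5}, "support": 4}
day = {1: 0.62, 2: 0.8, 3: 0.5}  # normalized indicator values of the trading day

print(decide([rule_1, rule_2], day, total_rows=50, max_distance=0.05))  # -1
```

With identity covariance and equal distances, the rule with the larger prior, that is, the larger support, wins, so frequently observed patterns are preferred when rules conflict.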
(2) Update the Trading Rules Dynamically. It is believed that the most recent data affects the future stock price more than older historical data, whose influence gradually decreases as time passes. Therefore, the data stream is used to update the trading rules dynamically and weaken the influence of historical data.
When the data of May 2nd, 2006, is acquired, the one-week return of the earliest retained day (April 24th, 2006) can be obtained. It is combined with the retained data into a complete record, as shown in Table 17. The complete record is used to update the trading rules.
The data is normalized and quantized; then the matching trading rules are found and the row supports of the rules that match are updated. For each trading rule, the matching degree is calculated according to formula (7). If the value is 0, the day's data exactly matches the trading rule, so the support of the trading rule increases by 1. If several trading rules match, all the corresponding supports increase by 1. Finally, the total number of rows in the training data also increases by 1.
After the trading rules have been updated and the trading decision has been made for May 2nd, 2006, the data of April 24th, 2006, is deleted and the trading data of May 2nd, 2006, is saved.
In order to weaken the influence of the historical data on the trading rules, the supports of all the trading rules are multiplied by an attenuation coefficient at the end of each month, after the rules have been updated with the month's new data. The trading rules whose row support is then lower than the row threshold are deleted, on the grounds that these rules are unlikely to recur and would therefore have little impact on future trading decisions. The process is repeated until the end of the test.
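The update and decay steps can be sketched as below. The rule layout and names are hypothetical. As an assumption, a completed record counts as an exact match only when both its quantized indicators and its quantized return (column 0) equal the rule's values; the decay factor and threshold values are chosen purely for illustration.

```python
def update_supports(rules, record, total_rows):
    """Add 1 to the support of every rule that the completed record matches exactly.
    Returns the new total number of rows."""
    for rule in rules:
        indicators_match = all(
            record[c] == v for c, v in rule["indicators"].items()
        )
        # Assumption: the record's return must also agree with the rule.
        if indicators_match and record[0] == rule["return"]:
            rule["support"] += 1
    return total_rows + 1


def monthly_decay(rules, decay, min_support):
    """Multiply all supports by the attenuation coefficient and drop weak rules."""
    kept = []
    for rule in rules:
        rule["support"] *= decay
        if rule["support"] >= min_support:
            kept.append(rule)
    return kept


rules = [
    {"return": 1, "indicators": {1: 0.2, 2: 0.8}, "support": 2},
    {"return": -1, "indicators": {1: 0.6, 2: 0.8}, "support": 2},
]
record = {0: 1, 1: 0.2, 2: 0.8}  # quantized record including its one-week return

total = update_supports(rules, record, total_rows=50)
rules = monthly_decay(rules, decay=0.9, min_support=2)
print(total, [(r["return"], r["support"]) for r in rules])
```

In this toy run the first rule is reinforced by the new record and survives the decay, while the second, unmatched rule falls below the threshold and is removed, which mirrors how stale rules fade out over time.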

Experimental Methods
In order to evaluate the performance of the BIS algorithm, we compare it with three popular existing methods: (1) buy-and-hold (BAH); (2) genetic programming (GP) [6]; (3) intelligent hybrid system (IHS) [10]. BAH is a classic and simple trading strategy: buy the shares on the first day and sell them at the end, without intermediate operations. GP extends classical genetic algorithms by allowing the processing of nonlinear structures. It provides a flexible framework for adjusting the trading rules to the current environment. IHS integrates fuzzy logic, GA, and NN techniques to increase the efficiency of stock market trading when using VAMA.
training dataset. The output of the BIS algorithm includes the profit rate and the number of transactions in testing. On the whole, the BIS algorithm outperforms the other three trading methods in the simulation experiment.
From Table 21 we can see that the BIS algorithm raises the profit rate by 6.99 percentage points over the BAH strategy on the HSI (27.19% by BIS versus 20.20% by BAH), which shows its advantage in profitability. From Table 22, the BIS method obtains more profit than GP on the four stocks. In terms of average profit, the BIS algorithm (25.80%) gains more than the GP algorithm (23.24%). The data in Table 23 show that the BIS method obtains better profits on the S&P 500 index; in particular, in the trending-down market, the profit of the BIS method (13.63%) is a significant improvement over that of the IHS method (−5.59%).
According to the above analysis of the simulation experiments, the proposed BIS algorithm has clear advantages over the other classical strategies in stock trading. Although there are negative returns for some stocks (such as TRP), generally speaking, the BIS method performs better in mining stock trading rules and assisting trading decisions. Figure 5 gives an example of guiding ABX stock trading by the BIS method from 03/14/2000 to 06/30/2000, which illustrates how the trading rules of the BIS method guide stock trading. The line chart describes the stock price fluctuations over time, while the red dots and the blue dots on it represent buying and selling points, respectively. From Figure 5, it can be seen that the trading rules identify times to buy near the bottoms and times to sell near the tops, indicating that the BIS algorithm can predict the right time points for stock trading relatively accurately.
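For reference, the buy-and-hold baseline and the percentage-point comparison quoted above reduce to simple arithmetic. The sketch below uses hypothetical function names and a made-up price series; it ignores transaction costs, which is an assumption.

```python
def buy_and_hold_return(prices):
    """Profit rate (%) of buying at the first price and selling at the last."""
    return (prices[-1] - prices[0]) * 100.0 / prices[0]


def advantage_in_points(strategy_pct, benchmark_pct):
    """Difference between two profit rates, in percentage points."""
    return strategy_pct - benchmark_pct


print(buy_and_hold_return([100.0, 90.0, 120.0]))    # 20.0
print(round(advantage_in_points(27.19, 20.20), 2))  # 6.99, the HSI figures quoted above
```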

Conclusions
In this paper, we propose a new advanced time series analysis method to find multiple patterns in the fluctuation of the stock market from the historical data. To overcome the disadvantage of most existing algorithms that rely on the predetermined technical indicators, Biclustering-Based Intelligent System (BIS) could find different patterns which contain a subset of technical indicators with different periodic parameters. The patterns found are regarded as the trading

Figure 1: Examples of different types of biclusters. (a) Constant values in (b) rows and (c) columns. (d) Additive and (e) multiplicative coherent values. (f) Overall coherent evolution in (g) rows and (h) columns. (i) Coherent evolution values in columns. (j) Coherent sign changes in rows and columns.

Figure 2: The historical data matrix of the stock market.

Figure 4: The flow chart of the BIS algorithm.
, in which the rows represent the trading days and the columns correspond to technical indicators, where the first column is the return and the remaining columns correspond to technical indicators with different time spans. In this paper, five popular technical

Table 1: (a) The matrix after normalization; (b) the matrix after quantization.

Table 3: Records of the support rows of each class in each column.

Table 4: Records of the support row sets of each class composed of 2 columns.

Table 5: Records of the support row sets of each class composed of 3 columns.

Table 6: Records of the support row sets of each class composed of 4 columns.

Table 7: Records of the support row sets of each class composed of 5 columns.

Table 19: Dataset of four Canadian companies' stocks.

Table 21: Comparison with the BAH on Hang Seng index. "Profits (%)" is excess return. "Trades" is the number of trades.

Table 22: Comparison with the GP on four Canadian stocks. "Profits (%)" is excess return. "Trades" is the number of trades.