Analyzing the Association between Pattern and Returns Using Goodman–Kruskal Prediction Error Reduction Index (λ)

The prediction error reduction index is a useful tool for selecting and interpreting the association between buy/neutral/sell patterns and high/moderate/low returns. It is operationally interpretable as the proportional reduction in the error of estimation. We first obtain the buy/sell pattern using an Optimal Band. The analysis of the association between patterns and returns is then based on the Goodman–Kruskal prediction error reduction index (λ). Empirical analysis suggests that predicting returns from patterns gives a larger reduction in prediction error than predicting patterns from returns. We demonstrate the index for NIFTY 50, BANK-NIFTY, and NIFTY-IT of the NSE (National Stock Exchange) for the period 2010–2020.


Introduction
Many studies of the stock market examine the relationship between two financial variables. Such associations matter because one known variable can be used to predict or estimate another, unknown variable. There is a substantial literature on the association between different kinds of financial variables, such as exchange rates, stock prices, returns, and volatility. Here, we discuss some of this literature.

Related Work.
The relationship between the market sentiment index and stock rates of return in the Brazilian market is explored in [1]. The relation between common stock returns, trading activity, and market value is explored in [2].
Relationships between seemingly unrelated variables have also been studied. Quantile relationships between oil and stock returns, with evidence from emerging and frontier stock markets, are presented in [3]. The relationship between music sentiment and stock returns is presented in [4], and the relationship between stock market returns and fatal car accidents, when the market drives investors crazy, is presented in [5]. Firm efficiency and stock returns during the COVID-19 crisis are discussed in [6]. Football sentiment and stock market returns are expounded in [7]. Apart from these, many more associations between investor sentiment and market returns have been reported, as in [8]. In [9], the authors discuss the relationship between firm size and international content of earnings, while in [10], the authors discuss the relationship between transaction cost and the small firm effect.

Motivation.
The literature above leaves a gap concerning the relation between patterns and returns. Based on a known pattern of a particular index, we predict the future returns of that index. In most back-testing processes, by contrast, returns are the known variable, and from them we predict a pattern, or a combination of patterns, that depends on the price series of the index. We test the pattern used in the current market by back-testing, which combines pattern and returns.
The relation between these two variables, across different price series, is the major research gap. In this article, we try to fill this gap, and this is the main motivation behind the study.

Objective.
In this paper, we discuss the association between pattern and returns using the prediction error reduction index in both directions of prediction: pattern from returns and returns from pattern. These two variables play a very important role in the stock market, and their association can be a helpful tool when developing an investment strategy and selecting indices or stocks for a portfolio. Various algorithms and models are available in the literature for predicting the pattern of a financial time series [11][12][13][14]. Currently, the most popular pattern prediction techniques are based on artificial intelligence, and there is a large body of research on pattern prediction using AI algorithms [15][16][17][18][19]. Each model has its own advantages and limitations, and each incurs some error when executed; for example, these models have prediction errors when predicting the buy/sell pattern. Appropriate preprocessing techniques can help reduce the prediction error significantly. We construct such a method using the Goodman-Kruskal index [20]. Goodman-Kruskal's lambda has been widely used in applications. Jaroszewicz et al. used lambda to construct a minimal classifier for cancer data [21,22]. Taha and Hadi used it to compare the performance of a new measure of association [23].
Here, we first construct a two-dimensional contingency table from the counts of the pattern categories (buy/neutral/sell) and the returns categories (high/moderate/low). The counts in the contingency table depend on the trading or investing strategy, and there are a number of strategies and trading styles from which a contingency table can be constructed. In this paper, the analysis uses the Optimal Band to classify the financial data into patterns and returns; for more details, see [24]. Some details of the Optimal Band and the construction of the contingency table are as follows:
(i) The construction of the Optimal Band is based on the global and local extrema of the given financial time series data.
(ii) This gives a two-dimensional contingency table consisting of two variables, returns and pattern.
(iii) The table can be constructed in two different ways:
(1) The Optimal Band divides the pattern data into three categories (sell, neutral, and buy) and then uses each of these categories for the prediction of returns (high, moderate, and low).
(2) The Optimal Band divides the returns data into three categories (high, moderate, and low) and then uses each of these categories for the prediction of patterns (sell, neutral, and buy).
From these tables, we find the prediction error of the data using the Goodman-Kruskal index of prediction proportion (λ) [25][26][27]. Based on the values of λ, we decide which direction is better, prediction of pattern from returns or prediction of returns from pattern; that is, whether to categorize the pattern data first or the returns data first.
Here, we propose a novel method, based on Goodman-Kruskal λ, to find the best pattern from the given returns in back-testing data. We then use the same pattern to find returns in the live market. We define the different kinds of pattern and returns as in the research article by Vijay and Paul [24]. We analyze the statistical significance of λ using the prediction errors defined in Section 2.
The remaining part of the paper is structured as follows: Section 2 contains the algorithm for the construction of a contingency table and uses it to obtain the prediction error reduction index (λ). The methodology is demonstrated with empirical analysis and results for NIFTY 50, BANK-NIFTY, and NIFTY-IT data from 2010 to 2020 in Sections 3-6. The conclusion of the work for all index data from 2010 to 2020 is provided in Section 7.

Proposed Methodology
Consider the daily close price time series $Y_1, Y_2, \ldots, Y_n$ of a stock. We define the process of constructing a contingency table for two variables, returns and pattern, of a financial time series. We divide the data into three pattern categories, sell, neutral, and buy, using the classifier Optimal Band [24]. In this section, a brief summary of the construction of an Optimal Band is given; for more details, see [24]:
Step 1: define the quantities given in equation (2).
Step 2: define the linear function.
Step 3: solve the optimization problem of equation (4) to estimate the parameters a, b, c, and d.
Step 4: define the bands, for $1 \le i \le n - 5$.
These bands are used to divide the pattern data into its three categories, sell, neutral, and buy, as shown in Table 1, where $Y_S$, $Y_N$, and $Y_B$ are, respectively, the sell, neutral, and buy categories.
Let us define new variables as follows:
$|Y_S|$ is the cardinality of the subset of sell
$|Y_N|$ is the cardinality of the subset of neutral
$|Y_B|$ is the cardinality of the subset of buy
Now, we find the prediction error of the single variable pattern [22]:
$$\epsilon_1 = 1 - \frac{\max\left(|Y_S|, |Y_N|, |Y_B|\right)}{|Y_S| + |Y_N| + |Y_B|}. \tag{6}$$
We further divide each of the sell, neutral, and buy categories of pattern into subcategories of returns (high, moderate, and low) using the same classification technique [24]. We construct Table 2, which is the two-dimensional contingency table [28].
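The construction of the two-dimensional table can be sketched in Python as below. This is only an illustrative sketch: it assumes that each trading day has already been labelled with one pattern category and one returns category (for instance by the Optimal Band), and the function name `contingency_table`, the row/column orientation, and the six sample days are hypothetical, not taken from the paper's data.

```python
from collections import Counter

PATTERNS = ("sell", "neutral", "buy")    # column order (assumed)
RETURNS = ("high", "moderate", "low")    # row order (assumed)


def contingency_table(pattern_labels, return_labels):
    """Count co-occurrences of (returns, pattern) labels.

    Rows follow RETURNS and columns follow PATTERNS, so table[0][0]
    is the count of days labelled high returns and sell pattern.
    """
    if len(pattern_labels) != len(return_labels):
        raise ValueError("label sequences must have the same length")
    counts = Counter(zip(return_labels, pattern_labels))
    # Counter returns 0 for pairs that never occur.
    return [[counts[(r, p)] for p in PATTERNS] for r in RETURNS]


# Hypothetical labels for six trading days (illustration only).
patterns = ["sell", "sell", "neutral", "buy", "buy", "sell"]
returns = ["moderate", "low", "moderate", "high", "high", "moderate"]

table = contingency_table(patterns, returns)
for name, row in zip(RETURNS, table):
    print(f"{name:>8}: {row}")
```

Keeping the category order in two fixed tuples makes the row and column meaning explicit, which matters later because λ is asymmetric in the two variables.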
In Table 2, $Y_{SH}$ is the cell corresponding to pattern (sell) and returns (high). The error of prediction for Table 2, where column I = C-I, maximum = M, sum = S, and total = T, is given by
$$\epsilon_2 = 1 - \frac{1}{T} \sum_{j} M_j, \tag{7}$$
where $M_j$ denotes the maximum cell count in the $j$-th column.
Goodman-Kruskal prediction error reduction index (λ). Goodman and Kruskal introduced the idea of proportional reduction in error (PRE) of prediction [25]. The value of λ measures the association of nominal variables in cross tabulations and depends upon the proportions of the constructed model. As a measure of association, λ represents the reduction in the prediction error of the dependent variable (pattern or returns) for a given value of the independent variable (returns or pattern). For any given data on a nominal independent variable and a dependent variable, λ indicates the extent to which the modal categories and frequencies for each value of the independent variable differ from the overall modal category and frequency. It can be calculated using the following equation:
$$\lambda = \frac{\epsilon_1 - \epsilon_2}{\epsilon_1},$$
where $\epsilon_1$ and $\epsilon_2$ are defined in equations (6) and (7), respectively. The range of λ is from 0 (zero association) to 1 (complete association).
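As a hedged illustration of how λ is obtained from a contingency table, the following Python sketch implements the standard textbook form of the Goodman-Kruskal λ. It assumes that rows are the categories of the predicted variable and columns are the categories of the predictor, so the baseline error comes from the largest row total. The function name and the example counts are hypothetical and are not the paper's tables; the worked example in the Experiments section takes its baseline from the totals of the variable that is classified first, so its values need not coincide with this form.

```python
def goodman_kruskal_lambda(table):
    """Textbook Goodman-Kruskal lambda for predicting the row variable
    from the column variable.

    table[i][j] is the count for row category i and column category j.
    """
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("contingency table is empty")
    modal_row_total = max(sum(row) for row in table)
    sum_col_maxima = sum(max(col) for col in zip(*table))
    baseline_errors = total - modal_row_total   # errors without the predictor
    refined_errors = total - sum_col_maxima     # errors using the predictor
    if baseline_errors == 0:
        # Only one row category occurs; there is no error to reduce.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return (baseline_errors - refined_errors) / baseline_errors


# Hypothetical counts: rows = returns (high, moderate, low),
# columns = pattern (sell, neutral, buy).
example = [
    [5, 2, 20],
    [30, 25, 5],
    [10, 3, 0],
]
transposed = [list(col) for col in zip(*example)]

lam_returns_from_pattern = goodman_kruskal_lambda(example)
lam_pattern_from_returns = goodman_kruskal_lambda(transposed)
print(lam_returns_from_pattern, lam_pattern_from_returns)
```

Working with error counts rather than proportions avoids rounding in the intermediate steps. Transposing the table swaps the roles of the two variables, and the two results generally differ, which is the asymmetry of λ.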

Experiments and Results
In this section, we implement the classification method, Optimal Band, and Goodman-Kruskal prediction error reduction index (λ), using the daily returns of Index NIFTY 50 for the year 2010. We use Optimal Band to classify the data into three categories of pattern (sell, neutral, and buy). We plot the data with Optimal Band to create the three categories of pattern as shown in Figure 1. For a detailed explanation of Figure 1, please refer to reference [24]. Each of these categories of pattern is further divided into three subcategories of returns (high, moderate, and low) using Optimal Band (Table 3). Table 4 is the table of counts of different categories of patterns constructed by using the algorithm given in Section 2.
The highest proportion corresponds to sell, which implies that the best prediction for a new instance of the NIFTY 50 data for 2010 is the sell category, as this category contains the largest number of items in the observed data set. Here, we assume the sample proportions to be an unbiased reflection of the general population of the data set.
The estimated probability of correct prediction is 146/247 = 0.5911, and the estimated probability of prediction error is
$$\epsilon_1 = 1 - 0.5911 = 0.4089.$$
Now, these categories of pattern are concurrently divided into three further categories of returns (high, moderate, and low). In this case, the prediction error is refined. Table 3 shows that if the data belong to the sell category of pattern, the best prediction of returns is moderate. Similarly, if the data belong to the neutral and buy categories, the respective best predictions of returns are moderate and high. The refined estimated probability of correct prediction is (42 + 115 + 29)/247 = 0.7530, and the estimated probability of error is
$$\epsilon_2 = 1 - 0.7530 = 0.2470.$$
The probability of prediction error is $\epsilon_1 = 0.4089$ as long as the association between pattern and returns is not established. Once the association is established, the error reduces to $\epsilon_2 = 0.2470$. The Goodman-Kruskal prediction error index measures the proportion by which the prediction error is reduced in the aforementioned situation [27]. The following equation gives the value of lambda ($\lambda_1$) for the case of predicting returns from pattern:
$$\lambda_1 = \frac{\epsilon_1 - \epsilon_2}{\epsilon_1} = \frac{0.4089 - 0.2470}{0.4089} \approx 0.396. \tag{11}$$
Lambda, as in equation (11), is asymmetric in nature [25]. We therefore turn things around and make categorical predictions of pattern from returns.
In the absence of information about pattern, our best bet would be moderate, the returns category with the largest number of instances (see Table 5). The initial estimated probability of error in this case would be 1 − (155/247) = 0.3725. Once we factor in the relationship between pattern and returns, we can refine the guesses by predicting low when the data are in the sell category, moderate when they are in the neutral category, and high when they are in the buy category. The estimated probability of correct prediction would now be (43 + 104 + 36)/247 = 0.7409, as shown in Table 6, the estimated probability of error would be 1 − 0.7409 = 0.2591, and the proportionate reduction in prediction error, lambda ($\lambda_2$), is
$$\lambda_2 = \frac{0.3725 - 0.2591}{0.3725} \approx 0.304.$$
We then extend the analysis of NIFTY 50 to the period 2010 to 2020 to find the value of λ for returns from pattern and for pattern from returns in each year, as shown in Table 7. We also extend the analysis to the other indices, BANK-NIFTY and NIFTY-IT, for the same period 2010-2020 (see Tables 8 and 9).
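The arithmetic of the NIFTY 50 example for 2010 can be re-derived from the counts quoted in the text. The Python sketch below is only a check of that arithmetic: the counts (146, 155, 42 + 115 + 29, 43 + 104 + 36, and the total of 247) are the ones reported above, the formula is the proportional reduction in error (ε1 − ε2)/ε1, and the function name `error_reduction` is a hypothetical helper introduced here.

```python
def error_reduction(modal_count, refined_correct, total):
    """Proportional reduction in prediction error.

    modal_count     -- size of the largest category used for the baseline guess
    refined_correct -- number of correct predictions once the association is used
    total           -- number of observations
    """
    eps1 = 1 - modal_count / total       # baseline prediction error
    eps2 = 1 - refined_correct / total   # refined prediction error
    return (eps1 - eps2) / eps1


TOTAL = 247  # observations for NIFTY 50 in 2010, as quoted in the text

# Returns from pattern: baseline 146 (sell), refined 42 + 115 + 29.
lambda_1 = error_reduction(146, 42 + 115 + 29, TOTAL)

# Pattern from returns: baseline 155 (moderate), refined 43 + 104 + 36.
lambda_2 = error_reduction(155, 43 + 104 + 36, TOTAL)

print(f"lambda_1 = {lambda_1:.3f}")  # lambda_1 = 0.396
print(f"lambda_2 = {lambda_2:.3f}")  # lambda_2 = 0.304
```

For this year the reduction is larger when returns are predicted from pattern than in the reverse direction.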
In Tables 7-9, the values of $\lambda_1$ and $\lambda_2$ are the prediction error reduction indices for NIFTY 50, BANK-NIFTY, and NIFTY-IT. Table 10 shows the average values of $\lambda_1$ and $\lambda_2$, that is, the average prediction error reduction index, for some stocks. Each table has columns giving the value of λ for returns from pattern and for pattern from returns. The value of λ is larger for returns from pattern than for pattern from returns. A larger λ means a greater reduction in prediction error and hence a more accurate prediction. In this sense, the reduction index indicates how far the error that occurs during the analysis of the data can be reduced.

Recession Periods
A financial crisis is any of a number of scenarios in which certain financial assets suddenly lose a significant portion of their nominal value. Numerous financial crises were coupled with banking panics throughout the 19th and early 20th centuries, and many recessions corresponded with these panics, as illustrated in Figure 2. Stock market collapses and the bursting of other financial bubbles, currency crises, and sovereign defaults are examples of circumstances that are commonly referred to as financial crises. However, there is no agreement, and financial crises of the sort described in the following continue to occur from time to time:
(i) Banking crisis
(ii) Currency crisis
(iii) Speculative bubbles and crashes
(iv) International financial crisis
Here, we will discuss major financial crises such as the Asian Financial Crisis of 1997 (2 July 1997) [...] resulting in a number of bank failures in Europe and sharp drops in the value of equities (stocks) and commodities worldwide. The most recent stock market catastrophe happened in 2020 (24 Feb 2020). This crash was part of a worldwide recession caused by the COVID-19 pandemic.

Table 9: Prediction error reduction index of Index NIFTY-IT for the period 2010-2020.
During the financial crises mentioned above, the mechanism of selecting a pattern does not vary. However, the pattern selected does vary, and it may be biased toward short or long patterns. In back-testing during these financial crises, the error of pattern from returns, shown in Table 11, is much lower than the error of returns from pattern. Table 11 shows the yearly average prediction error index for patterns in terms of $\lambda_1$ and $\lambda_2$ in the four financial crisis periods of 1997, 2000, 2008, and 2020 for NIFTY 50, NIFTY-IT, and BANK-NIFTY. In the case of NIFTY-IT, the value of $\lambda_2$ is higher than that of $\lambda_1$, which means that errors occur more in selecting patterns from returns, while all other patterns are selected smoothly.

Comparison with Related Work
Presently, there is much research on the association between two or more variables. Here, we define the association between returns and patterns based on back-testing and on live-trading prediction of returns. In back-testing, we have the returns of the data and try to find patterns with the Goodman-Kruskal prediction error index; in live trading, we have the back-tested patterns and predict the returns of future data. Most research concentrates on predicting future data patterns without knowing the accuracy of the patterns on back-testing data. Here, by contrast, we try to recommend strong back-testing patterns using the Goodman-Kruskal prediction error index.

Scalability to Economic Significance and Practical Implications
Stock markets are critical to the economy's functioning since they serve as the backbone of a contemporary nation's economic infrastructure. Companies can use stock markets to obtain funds to expand, recruit more qualified employees, and repair or replace equipment. Individuals can also invest in businesses through these platforms. When a company needs to raise money, it can sell shares of the company to the public. It accomplishes this by listing its shares on a stock exchange. Annual reports help investors analyze the performance of companies listed on an exchange.
Investors can purchase shares in public offerings, and the funds collected are deployed by the firm to expand operations, acquire another company, or hire extra employees. All of this contributes to an increase in economic activity, which serves to propel the economy forward. The banking sector, the information technology industry, the pharmaceutical business, and other manufacturing industries all contribute significantly to the country's economic growth. In this study, we look at three NSE (National Stock Exchange) indices, NIFTY 50, BANK-NIFTY, and NIFTY-IT, which cover vital business stocks that are a large part of our economy's growth, as shown in Figure 3. Investors can use our research to determine the optimum pattern for investing in indices (futures) or stocks. Table 12 displays the results of pattern selection based on returns, in terms of the average prediction error reduction index of the indices, from 2010 to 2020. Table 12 shows that, over this period, the error for pattern selection from returns decreased.

Conclusions
Here, we conclude from the whole analysis that the prediction of association between two variables is very important, but the way you predict the association is also very important. In economic analysis, any economic factor has two ways, top to bottom and bottom to top. In a similar manner, we try to find the best way to predict the association from one known variable to an unknown variable, which has less prediction error.
In the present analysis, we find the prediction error index of patterns of returns from seen (back-testing) data and of patterns of returns from unseen (future) data. The error reduction index of pattern from returns is lower, which helps to collect better patterns based on given returns; these patterns are then used on live data to predict returns.
Here, we use the classifier Optimal Band and the measure of association (λ) to find the Goodman-Kruskal prediction error reduction index, which works effectively to find the error in prediction. Using the Optimal Band, we classified the data for association in two ways: from returns to patterns and, in reverse, from patterns to returns. We observe that the prediction error reduction index of returns from patterns is larger than that of patterns from returns using the Goodman-Kruskal index (λ) for all data sets. Data for NIFTY 50, BANK-NIFTY, NIFTY-IT, and stocks for 2010-2020 were used; the larger the prediction error reduction index lambda, the smaller the error. The lambda that predicts returns from pattern is better than the one that predicts patterns from returns in all three indices. The constituents (stocks) of the indices also follow the same pattern, so the prediction of returns from pattern is better. In 2014, BANK-NIFTY had a lower prediction error reduction index, followed by NIFTY 50 in 2016 and NIFTY-IT in 2014. We also make a good selection of patterns in different financial crises for NIFTY 50, BANK-NIFTY, and NIFTY-IT.

Data Availability
Data will be made available on request to the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest.