Bankruptcy prediction is an important problem in financial decision support for a firm's stakeholders, including auditors, managers, shareholders, debtholders, and potential investors, as well as for academic researchers. The literature on financial distress forecasting has focused on developing discrete models to improve prediction. The aim of this paper is to develop a novel hybrid financial distress model that combines several statistical and machine learning methods. A multiple attribute decision making method is then used to choose the best model among those implemented. The proposed approach has been applied to Iranian companies on which previous models had been used, and the results indicate that prediction can be strengthened with the help of the hybrid approach.
Predicting the financial distress of listed companies is important to both the companies and investors. However, because of the uncertainty of the business environment and strong competition, even companies with sound operating mechanisms face the possibility of business failure and bankruptcy. Whether the financial distress of listed companies can be forecast effectively and in time therefore bears on the companies’ development, the interests of numerous investors, and the order of the capital market [
Most recent studies have adopted a multiple-variable approach to the prediction of financial distress by combining accounting and nonaccounting data in a variety of statistical formulas [
On the other hand, recent research has exploited multiple attribute decision making (MADM) methods in financial analysis to improve the final outputs [
This paper focuses on improving financial distress forecasting for companies listed on the Tehran Stock Exchange (TSE) of Iran by means of a hybrid approach that outperforms existing discrete models significantly. Because financial ratios are important for describing a company’s situation, factor analysis is applied to summarize the effect of the financial ratios in a small number of combinations. The extracted predictors are then used to forecast financial distress in a hybrid approach that draws on both traditional statistical distress models and machine learning methods for classifying businesses. Another important step in this analysis is homogenizing the businesses via a clustering method to improve the prediction models. A MADM method is also used to identify the best model on the basis of several classification performance measures. A comparison of the final results shows that the prediction of financial distress is considerably strengthened.
The paper is organized as follows: Section
The early prediction of distress is essential for companies and investors or lending institutions that wish to protect their financial investments. As a consequence, modeling, prediction, and classification of companies to determine whether they are potential candidates for financial distress have become key topics of debate and detailed research.
Corporate bankruptcy was first modeled, classified, and predicted by Beaver [
Furthermore, most recent studies have adopted a multiple-variable approach to the prediction of financial distress by combining accounting and nonaccounting data in a variety of statistical formulas. In the reviewed literature, 64% of the authors used statistical techniques, whose overall predictive accuracy was 84%; 25% used machine learning models, whose overall accuracy was 88%; and 11% used theoretical models, whose accuracy was calculated as 85% [
Some recent studies on financial distress forecasting.
Method(s)  Author

Applying support vector machines to bank bankruptcy analysis using practical steps [  Erdogan, 2013
Presenting particle swarm optimization techniques to obtain appropriate parameter settings for subtractive clustering and integrating the adaptive-network-based fuzzy inference system (ANFIS) [  Chen, 2013
Applying a simple hazard model to develop an early warning system of bank distress in the Gulf Cooperation Council countries [  Maghyereh and Awartani, 2014
Presenting a statistics-based wrapper for SVM-based financial distress identification by using statistical indices of ranking-order information from predictive performances on various parameters [  Li et al., 2014
Examining the effect of filter-based and wrapper-based feature selection methods and applying different classification techniques [  Liang et al., 2015
In general, studies on the value of accounting data in bankruptcy prediction show that accounting data are able to predict financial distress in companies. It should be noted, however, that there is no strong consensus on which financial ratios should be used to predict financial distress, and the results vary with the financial ratios and research methods chosen. In this research, ratios on which there is broad agreement are used [
In this section, the methods applied in our paper are briefly described. Factor analysis,
Factor analysis is a dimension reduction method of multivariate statistics, which explores the latent variables from manifest variables. Two methods for factor analysis are generally in use, principal component analysis and the maximum likelihood method. The main procedure of principal component analysis can be described in the following steps when applying factor analysis [
Find the correlation matrix (
Find the eigenvalues (
Consider the eigenvalue ordering (
According to Kaiser [
Name the factor referring to the combination of manifest variables.
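The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it starts from a correlation matrix that is assumed to be already computed, obtains the eigenvalues with a plain Jacobi rotation routine (the function names `jacobi_eigen` and `extract_factors` are hypothetical), and assumes that the truncated reference to Kaiser means the usual criterion of retaining factors whose eigenvalue exceeds 1. The example matrix is invented.

```python
import math

def jacobi_eigen(matrix, max_sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors of a symmetric matrix via Jacobi rotations."""
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q]; |theta| <= pi/4.
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    eigenvalues = [a[i][i] for i in range(n)]
    eigenvectors = [[v[k][i] for k in range(n)] for i in range(n)]
    return eigenvalues, eigenvectors

def extract_factors(corr):
    """Order the eigenvalues and keep factors with eigenvalue > 1 (assumed Kaiser rule)."""
    eigenvalues, eigenvectors = jacobi_eigen(corr)
    order = sorted(range(len(eigenvalues)), key=lambda i: eigenvalues[i], reverse=True)
    kept = [i for i in order if eigenvalues[i] > 1.0]
    # Loadings: eigenvector scaled by the square root of its eigenvalue.
    loadings = [[math.sqrt(eigenvalues[i]) * x for x in eigenvectors[i]] for i in kept]
    return [eigenvalues[i] for i in order], loadings

# Hypothetical correlation matrix of three ratios (not taken from the paper).
corr = [[1.0, 0.8, 0.1],
        [0.8, 1.0, 0.1],
        [0.1, 0.1, 1.0]]
ordered_eigenvalues, factor_loadings = extract_factors(corr)
print(ordered_eigenvalues)
print(factor_loadings)
```

Each retained factor would then be named by inspecting which manifest variables carry the largest loadings.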
This method clusters
Let
In practice, Fisher’s rule is typically not directly applicable, because the parameters are usually unknown and need to be estimated from the samples. Let
Binary responses, for example, success and failure, are the most common form of categorical data, and the most popular model for them is the logit model. For a binary response,
A decision tree (DT) is a machine learning technique used in classification, clustering, and prediction tasks. A well-known tree-growing algorithm for generating a DT is Quinlan’s ID3 [
The set attribute
In other words,
The decision tree contains leaves, which indicate the value of the classification variable, and decision nodes, which specify the test to be carried out. For each outcome of a test, a leaf or a decision node is assigned until all the branches end in the leaves of the tree [
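As a rough illustration of how a decision node's test can be chosen, the sketch below computes the entropy of a set of class labels and the information gain of splitting on an attribute, which is the selection measure commonly associated with ID3. The helper names, attribute names, and the four example companies are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """Attribute with the largest information gain; it becomes the next decision node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical, discretized company data.
rows = [
    {"liquidity": "low", "size": "small"},
    {"liquidity": "low", "size": "large"},
    {"liquidity": "high", "size": "small"},
    {"liquidity": "high", "size": "large"},
]
labels = ["distress", "distress", "no distress", "no distress"]
print(best_attribute(rows, labels, ["liquidity", "size"]))
```

In this toy data set, splitting on liquidity separates the two classes completely, so it would be selected for the root node, and both branches would end in leaves.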
A neural network is a technique that imitates the functionality of the human brain using a set of interconnected vertices. It is based on an artificial representation of the human brain, through a directed acyclic graph with nodes (neurons) organized into layers. In a typical feedforward architecture, there are a layer of input nodes, a layer of output nodes, and a series of intermediate layers. The input signals are multiplied by their corresponding weights to give the value of
The value calculated from (
TOPSIS (technique for order preference by similarity to an ideal solution) is a popular approach to MADM (multiple attribute decision making) that has been widely used in the literature. The method, presented by Hwang and Yoon, consists of the following steps [
The decision matrix is normalized through the application of
A weighted normalized decision matrix is obtained by multiplying the normalized matrix with the weights of the criteria,
Positive indicator score (
The distance of each alternative from
The closeness coefficient for each alternative (
At the end of the analysis, the ranking of alternatives is made possible by comparing the
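The TOPSIS steps listed above can be sketched as follows. Because the formulas are not reproduced in the text, this sketch makes two assumptions: vector normalization (each column divided by its Euclidean norm) and the standard closeness coefficient, the distance to the negative ideal divided by the sum of both distances. The function name `topsis`, the `benefit` flags, and the example matrix are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness coefficient of each alternative (one row per alternative).

    benefit[j] is True when larger values of criterion j are better
    and False when smaller values are better (a cost criterion).
    """
    n = len(matrix[0])
    # Steps 1 and 2: vector normalization, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Step 3: positive and negative ideal solutions.
    columns = list(zip(*v))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    # Steps 4 and 5: distances to both ideals and the closeness coefficient.
    scores = []
    for row in v:
        d_plus = math.sqrt(sum((x - p) ** 2 for x, p in zip(row, ideal)))
        d_minus = math.sqrt(sum((x - q) ** 2 for x, q in zip(row, anti)))
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Hypothetical example: three classifiers scored on accuracy (benefit)
# and error rate (cost), with equal weights.
matrix = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

The alternative with the largest coefficient is ranked first. The sketch does not guard against the degenerate case in which all alternatives are identical, where both distances are zero.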
Corporate bankruptcy forecasting plays a central role in academic finance research, business practice, and government regulation as a form of financial decision support. Consequently, accurate default probability prediction is extremely important.
The main purpose of this study is not only to improve prediction performance through a hybrid analysis approach but also to employ a multiple attribute decision making (MADM) method to choose the best alternative classification model. Figure
The flowchart of the optimization approach.
As shown in Figure
Another suggestion in this research for building more refined models is to homogenize the companies’ performance. The wide variety of businesses can make forecasting models inefficient. As a solution, this step clusters the businesses based on an influential ratio and then applies the forecasting methods. A comparison of the performance measures shows a remarkable improvement over recent financial distress models.
Subsequently, the predictors extracted by the factor analysis are used to forecast financial distress with traditional statistical distress models and machine learning methods in each cluster separately.
As the last step, the best model for the data set is chosen on the basis of the different classification performance measures, because so far no single classification method achieves the best score on every evaluation measure. Different multiple attribute decision making (MADM) methods often produce different outcomes when selecting or ranking a set of decision alternatives involving multiple attributes. TOPSIS, one of the best-known MADM methods, is used here to identify the best model, although other MADM methods could also be applied. The final results show that the prediction of financial distress is considerably strengthened.
In the following, empirical evidence for the proposed approach is presented. The population under study consists of the manufacturing companies listed on the Tehran Stock Exchange (TSE) for the year ended March 21, 2011. These companies were chosen because their financial information is available. There are 461 companies listed on the TSE in 37 industry groups, of which 412 are manufacturing companies and 49 are nonmanufacturing companies. The manufacturing companies outnumber the other listed companies and, because of their extensive activities, are granted more loans. A sample of 180 companies is chosen for this research.
On the Tehran Stock Exchange, the criterion for companies exiting the capital market is Article 141 of the Commercial Law. Under this article, a company is regarded as bankrupt when its retained losses exceed 50% of its capital. 58 companies are bankrupt under this law. The nonbankrupt companies were randomly selected from the remaining list.
In this research, ratios on which there is broad agreement were used for 180 manufacturing companies listed on the Tehran Stock Exchange for one year (ended March 21, 2011). The data required to calculate the ratios were gathered from the companies’ balance sheets and income statements. The financial ratios used in the prediction are listed in “The Definition of Variables Used.”
Much of the literature on financial distress prediction deals with the selection of important variables, because the number of available financial ratios is very large [
Estimated factor loadings.
Factor 1  Factor 2  Factor 3  Factor 4  

TA  −0.06  0.84  0.19  0.21 
CR  0.08  0.94  0.11  −0.07 
QR  0.06  0.96  −0.03  0.02 
DR  −1  −0.04  −0.03  −0.01 
DER  −0.75  −0.04  −0.26  −0.39 
SI  0.05  −0.03  −0.18  0.62 
TLDS  −0.09  0.02  −0.09  0.91 
SFA  0.02  0.11  0.82  0.02 
STA  0.14  0.01  0.86  −0.02 
ROS  1  0.03  0.03  0.02 
ROE  0.98  0.09  0.07  −0.01 
CLOE  −0.03  0.26  −0.33  0.68 
TLOE  0.03  −0.56  −0.1  0.88 
WCTA  1  0.05  0.04  0.01 
It is fairly clear that the first factor represents the debt and cash flow conditions (a strong combination of DR, DER, ROS, ROE, and WCTA). The second factor almost symbolizes the liquidity conditions (a strong combination of TA, CR, and QR), and the third one approximately denotes the operating performance of the listed companies (a strong combination of SFA, STA). The last factor is the investment conditions (a combination of SI, TLDS, CLOE, and TLOE). For example, we have
Because of differences in the size and type of the industries, the listed companies are not homogeneous, which makes the prediction models inefficient. As a solution, we clustered the listed companies by using the
Cluster means of the 3-means method.
Cluster  1  2  3 

CR  0.974  3.425  9.620 
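To make the clustering step concrete, the sketch below runs a plain k-means (Lloyd's algorithm) with three clusters on a single ratio. It is a simplified stand-in for whatever software the authors used: the function name `kmeans_1d`, the deterministic initialization, and the ten current-ratio values are all hypothetical, chosen only so that the cluster means fall in the same range as those reported in the table above.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Plain k-means on one-dimensional data; returns (centers, clusters)."""
    data = sorted(values)
    # Deterministic start: k evenly spaced order statistics (an assumption,
    # standard implementations usually start from random points).
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical current ratios for ten companies.
current_ratios = [0.8, 0.9, 1.1, 1.2, 3.0, 3.4, 3.8, 9.0, 9.5, 10.2]
centers, clusters = kmeans_1d(current_ratios, k=3)
print(centers)
print([len(c) for c in clusters])
```

The prediction models would then be fitted separately within each of the resulting groups.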
In the next step, the usual statistical distress prediction methods are applied. First of all, the logit models are
The next statistical method for bankruptcy prediction is discriminant analysis. Table
The linear discriminant functions for financial distress.
Predictor  Cluster 1: Distress  Cluster 1: No distress  Cluster 2: Distress  Cluster 2: No distress  Cluster 3: Distress  Cluster 3: No distress

Constant  −4.06  −1.61  −10.16  −6.26  −5.28  −3.43
Factor 1  6.81  −0.53  13.11  −0.23  14.07  3.43
Factor 2  0.69  −0.13  −0.3  0.36  −0.38  −0.38
Factor 3  5.04  4.79  10.38  6.73  2.27  2.69
Factor 4  3.21  2.11  5.07  3.67  0.68  −0.35
In addition, two machine learning methods, decision trees and a neural network, are applied. Here a separate analysis for each cluster is not necessary, because both methods can be used for classification as well as clustering. In this part, the important point is that all the ratios and the extracted factors are each used separately as predictors.
To classify the distressed companies with a decision tree, 100 randomly chosen companies were used to build the tree and the rest were used for testing (see Figure
A part of the decision tree built from the extracted factors.
It means that if, for a company,
Finally, a neural network with three hidden layers and a hyperbolic activation function is implemented. The final result shows an overall accuracy of 94 percent, which is 12 percent less than when all ratios are used directly.
There are thus four models for forecasting financial distress (in fact, more classification methods could be applied), and the best one for our data set must be selected. As mentioned above, the TOPSIS method is used to choose the best model based on the different classification performance measures. Table
Classification performance measures.
Method  TP  FP  TN  FN  Accuracy  Error rate  Precision  Sensitivity  Specificity  AUC

Logit analysis  0.64  0.12  0.16  0.08  0.80  0.20  …  0.89  0.57  0.69
Discriminant analysis  0.60  0.13  0.17  0.10  0.77  0.23  0.82  0.86  0.57  0.65
Decision tree  0.74  0.01  0.20  0.05  …  …  0.80  0.94  …  …
Neural network  0.73  0.02  0.21  0.04  …  …  0.78  …  0.91  0.83
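For reference, the measures in this table other than AUC follow from the four confusion-matrix proportions. The sketch below uses the standard textbook definitions, which are an assumption here because the paper's formulas are not shown. With these definitions, the discriminant analysis row is reproduced to two decimals. The function name is hypothetical, and AUC is omitted because it requires the classifier scores rather than the four counts.

```python
def classification_measures(tp, fp, tn, fn):
    """Standard measures derived from confusion-matrix counts or proportions."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Proportions taken from the discriminant analysis row of the table.
discriminant = classification_measures(tp=0.60, fp=0.13, tn=0.17, fn=0.10)
print({name: round(value, 2) for name, value in discriminant.items()})
```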
As mentioned, there are various methods for choosing an ideal alternative in MADM problems, and TOPSIS is one of them. In the TOPSIS approach, the best alternative is the one nearest to the ideal solution and farthest from the negative ideal solution. It is also assumed that all the criteria have identical weights and importance. Table
Calculation of the TOPSIS method.








CC  

Logit analysis  0.46  0.63  0.52  0.49  0.37  0.46  1.86  1.00 

Discriminant analysis  0.45  0.72  0.51  0.47  0.37  0.43  2.07  0.90 

Decision tree  0.54  0.19  0.49  0.52  0.62  0.56  1.08  1.85 

Neural network  0.54  0.19  0.48  0.52  0.59  0.55  1.11  1.81 

Based on the last column of Table
Enterprise bankruptcy forecasting has always been an important issue in business and financial decision support. In this research, a hybrid approach is proposed to improve prediction performance and give better-supported results.
First of all, factor analysis was used to determine and summarize some combinations of financial ratios correlated together. After that,
Then, to predict financial distress, multiple logistic regression analysis, multiple discriminant analysis, the decision tree, and the neural network, all well-known methods in this field, were applied. Finally, the best classifier was chosen with the help of a multiple attribute decision making (MADM) method, TOPSIS.
The proposed approach was applied to Iranian companies listed on the Tehran Stock Exchange, where earlier prediction models had previously been used, and the results indicate that prediction can be strengthened with the help of the hybrid analysis. The comparison among the methods clearly showed that the decision tree, followed by the neural network, performed remarkably well in comparison with the others.
The hybrid approach presented here provides insight into the complex interaction of the common bankruptcy prediction methods and suggests avenues for applying MADM methods in this area in future research.
TA: The total assets
CR: The ratio of the current assets to the current liabilities
QR: The ratio of cash and equivalents, short-term investments, and accounts receivable to the current liabilities
DR: The ratio of the total liabilities to the total assets
DER: The ratio of the total liabilities to the total owner’s equity
SI: The ratio of the sales to the inventories
TLDS: The ratio of the total liabilities to the daily sales
SFA: The ratio of the sales to the fixed assets
STA: The ratio of the sales to the total assets
ROS: The ratio of the net income to the sales
ROE: The ratio of the net income to the average owner’s equity
CLOE: The ratio of the current liabilities to the owner’s equity
TLOE: The ratio of the total liabilities to the owner’s equity
WCTA: The ratio of the working capital to the total assets
The authors declare that there is no conflict of interests regarding the publication of this paper.