A Novel Ensemble Learning Approach for Corporate Financial Distress Forecasting in Fashion and Textiles Supply Chains

This paper proposes a novel ensemble learning approach based on logistic regression (LR) and artificial intelligence tools, namely support vector machines (SVM) and back-propagation neural networks (BPNN), for corporate financial distress forecasting in fashion and textiles supply chains. Firstly, related concepts of LR, SVM, and BPNN are introduced. Then, the forecasting results of LR are fed into the SVM and BPNN techniques, which can recognize the errors in the LR fit. Moreover, an empirical analysis of Chinese listed companies in the fashion and textile sector is carried out to compare the methods, and some related issues are discussed. The results suggest that the proposed ensemble learning approach achieves higher forecasting performance than the individual models.


Introduction
In the fashion and textile industry, there is a great deal of change due to global sourcing and high levels of price competition. In addition, fashion and textiles have market characteristics such as short product lifecycles, high volatility, low predictability, and a high level of impulse purchasing [1][2][3]. Rigorous competition and rapidly changing demand expose corporations in a fashion and textile supply chain to financial risk [4][5][6][7][8].
As financial risk may spread from one corporation to another within the supply chain, the prediction of corporate financial distress is important to fashion and textile supply chain management.
To improve performance in fashion retailing, more and more research has focused on forecasting, including sales forecasting [9][10][11], fashion retail forecasting [12] and color trend forecasting [13][14]. However, in the fashion and textile sector, little attention has been paid to corporate financial distress forecasting, which is important to various stakeholders (e.g., management, investors, employees, shareholders and other interested parties) as it provides them with timely warnings. From a managerial perspective, financial distress forecasting tools allow managers to take timely strategic actions so that financial distress can be avoided [15]. Many traditional techniques have been presented to predict corporate financial distress, including univariate approaches [16], linear multiple discriminant approaches (MDA) [17,18], multiple regression [19,20], logistic regression [21] and factor analysis [22]. However, the strict assumptions of traditional statistics, such as linearity, normality and independence among predictor variables, limit their applications in the real world [23].
Due to the limitations of traditional statistical and econometric models, some nonlinear and artificial intelligence (AI) models, including neural networks [24], case-based reasoning [25,26] and support vector machines [27], have been used for corporate financial distress forecasting. However, individual forecasting methods have limited capability in describing financial characteristics. In particular, for some complex forecasting problems, the results may be biased when only an individual method is used [28]. A more appropriate approach for improving forecasting accuracy is the combination of individual methods, which always performs better than the worst individual model on predictions and can sometimes outperform the best individual model [29]. Some hybrid methods have been used for corporate financial distress forecasting [30]. In terms of experimental results, the AdaBoost ensemble with single attribute test (SAT) outperforms the AdaBoost ensemble with decision tree (DT), the single DT classifier and the single support vector machine (SVM) classifier. In conclusion, the choice of weak learner is crucial to the performance of the AdaBoost ensemble, and the AdaBoost ensemble with SAT is more suitable for corporate financial distress forecasting of Chinese listed companies [31]. Also, empirical results indicate that the integration of principal component analysis (PCA) with MDA can produce better performance in short-term financial distress forecasting of Chinese listed companies [32]. However, there is as yet no ensemble learning approach that improves the forecasting performance of one method by recognizing the Type I and Type II errors of another method. This paper proposes a novel ensemble learning approach based on logistic regression (LR) and artificial intelligence tools, i.e. support vector machines (SVM) and back-propagation neural networks (BPNN), for the prediction of corporate financial distress in fashion and textiles supply chains. Firstly, related concepts of LR, SVM and BPNN are introduced. Then, the forecasting results of LR are introduced into the SVM/BPNN technique as an ensemble approach. An empirical analysis is carried out to compare the methods, and some related issues are discussed.
The rest of this paper is organized as follows. The basic concepts of LR, SVM and BPNN are introduced in Section 2. Section 3 describes the proposed ensemble learning approach. Section 4 presents an empirical analysis to illustrate the proposed approach and discusses related issues, including performance comparison and analysis. Finally, Section 5 concludes and discusses future research.

Related concepts
In this section, concepts of LR, SVM and BPNN are introduced as follows.

Logistic regression model
In a logistic regression (LR) model, the dependent variable is always in categorical form and has two or more levels [33]. In this study, we consider the situation where we observe a binary outcome variable $y$ and a vector $x = (1, x_1, x_2, \dots, x_k)$ of covariates for each of $N$ individuals. We code the two classes via a 0/1 response $y_i$, where $y_i = 1$ is for the first class (financial distress) and $y_i = 0$ is for the second one (no financial distress). Let $p$ be the conditional probability associated with the first class. In a logistic regression model, the probability $p$ of the dichotomous outcome event is related to a set of explanatory variables $x$ as follows.

$$\ln\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k = \beta^{T} x \tag{1}$$

Let $D = \{(x_i, y_i) : i = 1, 2, \dots, N\}$ be the training data set, which is a set of independent and identically distributed random variables. The regression coefficient $\beta_j$ estimated from the data is interpretable as a log-odds ratio or, in terms of $\exp(\beta_j)$, as an odds ratio. The log-likelihood for $N$ observations is used for estimating the regression coefficients $\beta$ as follows.

$$l(\beta) = \sum_{i=1}^{N} \left[ y_i \beta^{T} x_i - \ln\left(1 + e^{\beta^{T} x_i}\right) \right] \tag{2}$$

where $\exp(\beta_j)$ gives the odds ratio, and this value reflects the effect of the corresponding indicator on financial distress.
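As a rough illustration of the estimation step described above, the following Python sketch maximises the logistic log-likelihood by plain gradient ascent and then converts the coefficients into odds ratios. The toy data, the learning rate, the iteration count and all function names are assumptions made for illustration only; they are not taken from the paper's experiments, and statistical packages typically use Newton-type solvers instead.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(beta, X, y):
    """Sum over samples of y_i * (beta . x_i) - ln(1 + exp(beta . x_i))."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        # ln(1 + exp(z)) computed in a numerically stable way
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += yi * z - softplus
    return total

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Maximise the log-likelihood with plain gradient ascent."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j, v in enumerate(xi):
                grad[j] += (yi - p) * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Invented toy data: the first column is the constant 1 for the intercept.
X = [[1.0, 0.2], [1.0, 0.4], [1.0, 0.5], [1.0, 0.6], [1.0, 0.8], [1.0, 0.9]]
y = [0, 0, 1, 0, 1, 1]

beta = fit_logistic(X, y)
odds_ratios = [math.exp(b) for b in beta]
```

An odds ratio above 1 for a covariate means that larger values of that covariate raise the estimated odds of the class coded as 1.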

Support vector machine
Support vector machine (SVM), proposed by Vapnik, has been shown to possess excellent classification capability [34]. The conventional SVM achieves classification by mapping the input vectors onto a high-dimensional feature space and then constructing a linear model that implements nonlinear class boundaries in the original space. The SVM employs an algorithm that finds a special kind of linear model, i.e. the optimal hyperplane, which refers to the maximum-margin hyperplane and yields the maximum separation between decision classes. Thus, the optimal hyperplane separates the training examples with the maximum distance from the separating hyperplane to the closest training data samples. The training examples closest to the maximum-margin hyperplane are called support vectors. All other training examples, other than the support vectors, are irrelevant for constructing the optimal hyperplane. As a result, SVM models can perform binary classification effectively with a small number of training samples [28].
For the linearly separable case, a hyperplane that separates the binary decision classes in the case of $n$ attributes can be represented as the following equation:

$$y = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \tag{3}$$

where $y$ is the outcome, $x_i$ is the attribute value ($i = 1, \dots, n$) and $w_i$ is the weight of $x_i$ learned by the algorithm. In Eq. (3), the weights are the parameters that determine the hyperplane. By using the support vectors, SVM models approximate the maximum-margin hyperplane as follows:

$$y = b + \sum_{i} \alpha_i y_i \, x(i) \cdot x \tag{4}$$

where $y_i$ is the class value of the training example $x(i)$, $x$ is the example to be classified, and the sum runs over the support vectors. The problem of finding the support vectors and the parameters $b$ and $\{\alpha_i\}$ can be transformed into a linearly constrained quadratic programming (QP) problem.
For the linearly separable case, we assume that all data are at least distance 1 from the hyperplane $w^{T} x + b = 0$. Then, given a training set of instance-label pairs $(x_i, y_i)$, where $x_i \in R^{n}$ and $y_i \in \{1, -1\}$, the data points will be correctly classified by

$$y_i (w^{T} x_i + b) \ge 1 \tag{5}$$

The SVM finds an optimal separating hyperplane with the maximum margin by solving the following quadratic optimization problem:

$$\min_{w, b} \; \frac{1}{2} w^{T} w \quad \text{subject to} \quad y_i (w^{T} x_i + b) \ge 1 \tag{6}$$

By adopting non-negative slack variables $\xi_i$, we can transform Eq. (6) into Eq. (7) as follows.

$$\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i} \xi_i \quad \text{subject to} \quad y_i (w^{T} x_i + b) + \xi_i \ge 1, \; \xi_i \ge 0 \tag{7}$$

Here $C$ is the penalty parameter on the slack variables. By solving Eq. (7), we can find the hyperplane that provides the minimum number of training errors. For the nonlinearly separable case, SVM models are able to undertake the classification by constructing a linear model that implements the nonlinear class boundaries by transforming the inputs into the high-dimensional feature space. In this case, Eq. (4) can be modified into a high-dimensional version as follows.

$$y = b + \sum_{i} \alpha_i y_i \, K(x(i), x) \tag{8}$$

The function $K(x(i), x)$ is the kernel function, which transforms the input vectors into a high-dimensional feature space. Usually, there are three types of kernel functions: the linear kernel, $K(x_i, x_j) = x_i^{T} x_j$; the polynomial kernel, $K(x_i, x_j) = (x_i^{T} x_j + 1)^{d}$, where $d$ is the degree of the polynomial kernel; and the Gaussian radial basis function, $K(x_i, x_j) = \exp\left(-\lVert x_i - x_j \rVert^{2} / \sigma^{2}\right)$, where $\sigma^{2}$ is the bandwidth of the kernel.
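To make the role of the kernel concrete, the sketch below implements the three kernel types named above and the kernelised decision value, i.e. the bias plus a weighted sum of kernel evaluations against the support vectors. The exact scaling of the Gaussian kernel (squared distance divided by the bandwidth) is an assumption, as are the function names, the two hand-picked support vectors and their multipliers; in practice the multipliers come from solving the QP problem, which is not shown here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, d=2):
    # d is the degree of the polynomial kernel
    return (dot(u, v) + 1.0) ** d

def rbf_kernel(u, v, sigma2=1.0):
    # Assumed parameterisation: exp(-||u - v||^2 / sigma2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / sigma2)

def decision_value(x, support_vectors, labels, alphas, b, kernel):
    """Bias plus the sum of alpha_i * y_i * K(x(i), x) over the support vectors."""
    return b + sum(
        a * yi * kernel(sv, x)
        for sv, yi, a in zip(support_vectors, labels, alphas)
    )

# Hand-constructed example: two support vectors on opposite sides of a
# linear boundary; with alpha = 0.25 each and b = -1 they sit on the margin.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [0.25, 0.25]
b = -1.0

on_positive_margin = decision_value([2.0, 2.0], support_vectors, labels, alphas, b, linear_kernel)
on_boundary = decision_value([1.0, 1.0], support_vectors, labels, alphas, b, linear_kernel)
```

The sign of the decision value gives the predicted class; swapping `linear_kernel` for `rbf_kernel` or `polynomial_kernel` changes the shape of the boundary without changing the rest of the computation.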

Back-propagation neural networks
In financial forecasting problems, the technique of back-propagation neural networks (BPNN) is often used as a benchmark model [35][36][37][38]. The procedure to set up a BPNN includes: (1) select input and output variables; (2) determine the layers and the number of neurons in the hidden layers; (3) learn or train from real data; (4) test; (5) recall. Let $q$ be the number of hidden nodes and $p$ be the dimension of the input vector (the lagged observations). The relationship between the output ($y_t$) and the inputs ($x_t$) has the following mathematical representation:

$$y_t = a_0 + \sum_{j=1}^{q} a_j \, g\left(b_{0j} + \sum_{i=1}^{p} b_{ij} x_{t,i}\right) + \varepsilon_t$$

where $a_j$ ($j = 0, 1, \dots, q$) is the connection weight between the $j$th hidden node and the output node, $b_{0j}$ is the bias of the $j$th hidden node, $b_{ij}$ ($i = 1, \dots, p$; $j = 1, \dots, q$) is the connection weight between the $i$th input node and the $j$th hidden node, and $\varepsilon_t$ is the error term. The logistic function is often used as the hidden layer transfer function as follows:

$$g(z) = \frac{1}{1 + \exp(-z)}$$

The BPNN model performs a nonlinear functional mapping from the inputs ($x_t$) to $y_t$ as follows.

$$y_t = f(x_{t,1}, x_{t,2}, \dots, x_{t,p}, w) + \varepsilon_t$$

where $w$ is a vector of all parameters and $f$ is a function determined by the network structure and connection weights. Thus, the neural network is equivalent to a nonlinear autoregressive model. However, individual methods for corporate financial distress forecasting are weak at error recognition. Therefore, in the following section, a novel ensemble learning approach based on logistic regression and SVM/BPNN is proposed for error recognition and the improvement of forecasting performance.
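The mapping performed by a single-hidden-layer network of this kind can be sketched as a forward pass: each hidden node applies the logistic transfer function to a weighted sum of the inputs, and the output node combines the hidden activations linearly. The function and argument names and the example weights below are hypothetical, and training by back-propagation is deliberately left out.

```python
import math

def logistic(z):
    """Hidden-layer transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def bpnn_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass for p inputs, q hidden nodes and one output node.

    hidden_weights[j][i] is the weight from input node i to hidden node j.
    """
    hidden = [
        logistic(bias + sum(w * xi for w, xi in zip(weights, x)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(a * h for a, h in zip(output_weights, hidden))

# Hypothetical network with p = 3 inputs and q = 2 hidden nodes.
x = [0.3, -1.2, 0.8]
hidden_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, 3.0]
output_bias = 0.5

# With all hidden weights at zero every hidden node outputs 0.5,
# so the result is 0.5 + 0.5 * (1.0 + 3.0) = 2.5.
y_hat = bpnn_forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
```

For a 0/1 distress label, the network output would typically be compared against a cutoff or passed through a further logistic unit; which of these the study uses is not stated in the text.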

A novel ensemble learning approach
Many hybrid methods have been used for economic forecasting [39][40][41], and risk analysis has been addressed in supply chain management [42][43][44][45][46][47][48]. When the methods of LR, SVM and BPNN are used for corporate financial distress forecasting, the financial ratio indices of companies are the input variables and the corresponding corporate financial state (0/1) is the output. The principle of the new ensemble learning approach based on LR and SVM/BPNN is that the LR forecasts of corporate financial distress are set as another input variable within the SVM/BPNN framework, with the corporate financial state as the output. In this way, both Type I errors (reject-true errors) and Type II errors (accept-false errors) in the LR analysis can be recognized by SVM/BPNN. In summary, the overall process of the ensemble learning approach (LR-SVM/BPNN), shown in Fig. 1, consists of the following three main steps:
(1) Stepwise regression analysis is used to remove independent variables that have no significant linear relationship with the dependent variable. In this way, the remaining independent variables are significant to the dependent variable and multicollinearity is removed.
(2) Logistic regression analysis is implemented on the training sample set. The critical value of the financial distress probability is set to 0.5. If the probability is more than 0.5, the value is 1 and the company is predicted to be in financial distress; otherwise, the value is 0 and the company is predicted not to be in financial distress.
(3) The forecasting results of LR are introduced into the SVM/BPNN technique as a new variable. The LR-SVM/BPNN approach then produces a new forecasting result.
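Steps (2) and (3) above can be sketched as follows: the LR probability is thresholded at 0.5, and the resulting 0/1 forecast is appended to each sample as an extra input column for the second-stage SVM or BPNN. The coefficient values, the two-ratio samples and the function names are hypothetical; stepwise variable selection and the second-stage training itself are not shown.

```python
import math

def lr_probability(beta, x):
    """Estimated distress probability for beta = (intercept, slopes...)."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def lr_forecast(beta, x, cutoff=0.5):
    """Step (2): forecast 1 (distress) only if the probability exceeds the cutoff."""
    return 1 if lr_probability(beta, x) > cutoff else 0

def augment_with_lr(X, beta):
    """Step (3): append the LR forecast to each sample as a new input variable."""
    return [list(x) + [lr_forecast(beta, x)] for x in X]

# Hypothetical fitted coefficients and samples (illustrative values only).
beta = [-1.0, 2.0, -3.0]
X = [[0.9, 0.1], [0.2, 0.6], [0.5, 0.0]]

X_aug = augment_with_lr(X, beta)
# The third sample has probability exactly 0.5, which is not "more than 0.5",
# so its LR forecast is 0.
```

The rows of `X_aug` would then be used, together with the observed financial states, to train the second-stage classifier, which can learn where the LR forecast tends to be wrong.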

Fig. 1 Framework of the proposed ensemble learning approach
On the whole, logistic regression is a linear model, while SVM and BPNN are nonlinear models. When the forecasting results of LR are introduced into the SVM/BPNN technique as an input variable, SVM/BPNN can recognize the errors in the LR fit. Therefore, the ensemble learning approach LR-SVM/BPNN is promising for achieving better forecasting performance.
We selected 15 companies that had once received special treatment (ST) and 45 non-ST companies as samples, where ST companies are regarded as being in financial distress in this study. We also selected 12 variables and categorized them into four major types: earning ability, operating ability, debt-repaying ability and growing ability. The indicators belonging to each type are listed in Table 1. The training set (first 40 samples in Table 2) and the testing set (last 20 samples in Table 2) used in the empirical analysis are described in Table 3. Here, the training set is used to estimate the parameters of the forecasting models, while the testing set is used to measure their forecasting performance.
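The sample split described above, together with the normalization of the input variables mentioned in the empirical analysis, might be organised as in the sketch below. The paper does not state which normalization is used, so the min-max scaling here, and the choice to compute its bounds on the training set only, are assumptions; the placeholder data and function names are likewise invented.

```python
def split_train_test(rows, n_train=40):
    """First n_train rows form the training set, the rest the testing set."""
    return rows[:n_train], rows[n_train:]

def fit_min_max(train_rows):
    """Column-wise (min, max) pairs computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*train_rows)]

def apply_min_max(rows, bounds):
    """Scale each column with the training bounds; constant columns map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]

# Placeholder data: 60 samples with 12 made-up indicator values each.
rows = [[float(i + j) for j in range(12)] for i in range(60)]

train, test = split_train_test(rows)
bounds = fit_min_max(train)
train_scaled = apply_min_max(train, bounds)
test_scaled = apply_min_max(test, bounds)
```

Because the bounds come from the training rows, scaled testing values can fall slightly outside the unit interval, which is expected with this design.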

Experiment results and analysis
In the empirical analysis of corporate financial distress forecasting, the financial state of listed companies in period T is the dependent variable (output) and the variables in period T-1 are the inputs. The variables in period T-1 are normalized, and the normalized values are listed in Table 4. The forecasting results are reported in Tables 5, 6 and 7. From the results in Table 5, the accuracy of BPNN and SVM is 80%, and that of LR-BPNN and LR-SVM is 85%, both much higher than the 50% accuracy of LR.
In addition, by comparing the results in Tables 5-7, we find that the total accuracy decreases as the gap between the input period and period T grows, i.e. the total accuracy with data from period T-1 is better than that from period T-2, which in turn is better than that from period T-3. Therefore, we can conclude that artificial intelligence techniques, i.e. SVM and BPNN, can achieve better forecasting performance than LR. In addition, as SVM/BPNN can recognize the errors in the LR fit, the ensemble learning approaches LR-BPNN and LR-SVM can achieve better forecasting performance than the individual methods when the forecasting results of LR are introduced into the SVM/BPNN technique as an input variable.
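The accuracy comparison above can be reproduced in form with a few lines of Python. The sketch computes total accuracy and the two kinds of misclassification from true and forecast labels. The label vectors are invented and do not come from the paper's tables. Assuming the accuracies are measured on the 20-sample testing set, 85% corresponds to 17 correct forecasts out of 20, 80% to 16 and 50% to 10. Which misclassification counts as a Type I or a Type II error depends on the null hypothesis adopted, so the code uses descriptive names instead.

```python
def confusion_counts(y_true, y_pred):
    """Counts of hits and misses, with 1 = financial distress and 0 = no distress."""
    pairs = list(zip(y_true, y_pred))
    distress_hit = sum(1 for t, p in pairs if t == 1 and p == 1)
    healthy_hit = sum(1 for t, p in pairs if t == 0 and p == 0)
    missed_distress = sum(1 for t, p in pairs if t == 1 and p == 0)
    false_alarm = sum(1 for t, p in pairs if t == 0 and p == 1)
    return distress_hit, healthy_hit, missed_distress, false_alarm

def accuracy(y_true, y_pred):
    """Share of samples whose forecast matches the observed state."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Invented labels for a 20-sample testing set: 5 distressed, 15 healthy.
y_true = [1] * 5 + [0] * 15
y_pred = list(y_true)
y_pred[0] = 0   # one distressed company forecast as healthy
y_pred[5] = 1   # two healthy companies forecast as distressed
y_pred[6] = 1

counts = confusion_counts(y_true, y_pred)
acc = accuracy(y_true, y_pred)   # 17 correct out of 20
```

Reporting the two misclassification counts alongside total accuracy matters here because the classes are unbalanced, so a model could reach a high total accuracy while missing most distressed companies.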

Conclusions and future work
In this study, the logistic regression (LR) model is integrated with artificial intelligence tools, i.e. support vector machines (SVM) and back-propagation neural networks (BPNN), for corporate financial distress forecasting in fashion and textiles supply chains. An empirical analysis of Chinese listed companies in the fashion and textile sector is carried out to compare the methods, and some related issues are discussed.
The contribution of this study is a novel ensemble learning approach for corporate financial distress forecasting in fashion and textiles supply chains. In the framework of the proposed approach, the forecasting results of LR are introduced into the SVM and BPNN techniques, which can recognize the errors in the LR fit. The results suggest that artificial intelligence tools perform better than LR and that the proposed ensemble learning approach achieves better forecasting performance than the individual models.
By using the proposed approach, managers in fashion and textiles companies can predict the financial state of their suppliers, manufacturers and retailers in advance and respond quickly to improve supply chain performance.
Future research would benefit from examining other methods for corporate financial distress forecasting and from using data from a wider sample of fashion and textiles companies.

Data description and experiment design
The fashion and textile sector is a traditional main industry in China. In recent years, China's exports in the fashion and textile sector have accounted for more than 25% of global trade. However, many fashion and textile companies have been hit hard since the financial crisis in 2008. When a fashion and textile company suffers from financial distress, other companies in its supply chains will be subjected to financial risk. Therefore, financial distress forecasting is important in fashion and textiles supply chains. In this study, the data for our experiment were collected from the Shanghai Stock Exchange and Shenzhen Stock Exchange databases in China. In 2012, there were 88 listed companies related to fashion and textiles, such as Shenzhen Victor Onward Textile Industrial Co., Ltd. (000018), Xinlong Holding (Group) Company Ltd. (000955), Hubei Maiya Co., Ltd. (000971), Shandong Demian Incorporated Company (002072), Lanzhou Sanmao Industrial Co., Ltd. (000779), Ningxia Zhongyin Cashmere Co. Ltd. (000982), Sichuan Langsha Holding Ltd. (600137), Hunan Huasheng Co., Ltd. (600156), Xinjiang Tianshan Wool Tex Stock Co., Ltd. (000813), Nanjing Textiles Import & Export Corp., Ltd. (600250), Henan Xinye Textile Co., Ltd. (002087) and so on, where the numbers in brackets are the stock codes of the corresponding listed companies.

Table 1 Definition of variables
The data set consisting of 60 samples is listed in Table 2, where the variable $y$ represents the financial distress state (0 means no financial distress, and 1 means financial distress) in period T and $x_i$ ($i = 1, 2, \dots, 12$) represents the value of the $i$th variable in period T-1. In this study, $x_i$ is the independent variable and $y$ is the dependent variable.

Table 2 Data of the sample set