The Construction and Empirical Analysis of the Company’s Financial Early Warning Model Based on Data Mining Algorithms

With the rapid advancement of informatization, enterprise information management has received more and more attention. In an increasingly complex and changeable social and economic environment, enterprise risk management has become steadily more difficult. Establishing an efficient risk management mechanism that gives early warning of corporate risks is a goal that companies pursue. Traditional statistical analysis can no longer cope with massive financial data. Therefore, how to extract information useful for financial risk early warning from the large amount of financial data generated by business activities is a problem that enterprises urgently need to solve. The continuous improvement and innovation of data mining technology and its good performance in analyzing massive data have linked the two closely. First, this study introduces the theories of financial risk early warning and data mining technology; second, it describes the research process of the financial risk early warning model and elaborates the three data mining techniques used in this study; then, in light of the actual situation of listed companies in China, it constructs a financial risk early warning index system; and finally, 77 listed manufacturing companies that first received special treatment (ST) in 2005-2007, together with their matching companies, are used as research samples, based on the financial data of the 2-4 years before being processed by ST and CXISP. It is found that the financial risk early warning model established by data mining technology has strong early warning capabilities.
Regarding the prediction capabilities of the three models, the closer the time to ST, the higher the prediction accuracy. For short-term early warning, all three models predict well; for long-term early warning, neural networks and decision trees predict better than the logistic regression of statistical analysis. Data mining techniques based on knowledge discovery are therefore suitable not only for short-term early warning but also for longer-term early warning, and data mining can be applied to financial risk early warning analysis to provide decision support.


Introduction
With the rapid development of the market economy and an increasingly complex and changeable social and economic environment, business activities are difficult to carry out effectively. The financial activities of enterprises are volatile and their results unpredictable, so that an enterprise may make a profit, suffer losses, or even go bankrupt. Over the years, improper management of corporate financial risks at home and abroad has pushed companies into financial crises or even bankruptcy, which has hindered the development and advancement of the economy and society as a whole.
In China, listed companies have not paid much attention to the management and control of financial risks, and the socialist market economy with Chinese characteristics is still being gradually improved. Companies lack experience and a foundation in this area, so the prevention and control of financial risks in Chinese companies is extremely fragile, and effective risk management and analysis methods are not yet mature. Compared with developed countries, there is still a big gap. Therefore, in the face of the ups and downs of the financial market and a rapidly changing external environment, corporate finance should take measures to prevent risks. This study reviews research on financial risk analysis and early warning for listed companies and finds that the field is still in a process of exploration and improvement. It is precisely because of these internal and external causes that many listed companies are relatively passive in the face of financial risks and cannot analyze and give early warning of risks through a sound financial risk analysis and early warning system [1][2][3][4][5].
Financial risk analysis and early warning is an early warning mechanism and real-time management control method established to prevent business operations from deviating from their normal trajectory. Early warning of a company's status takes a financial perspective: through analysis of the company's financial statements and related business activities, it predicts the company's risk status. Researchers at home and abroad have conducted long-term research on how to effectively analyze and give early warning of corporate risks [6][7][8][9][10].
Fitzpatrick first proposed the use of a single ratio to study and forecast financial risks. He selected 19 sample companies, analyzed their data according to the criteria of financial crisis, divided them into well-functioning companies and crisis companies, and analyzed the two financial indicators with the highest discriminating ability. The American scholar Beaver improved the univariate analysis model and proposed the use of five financial indicators to judge financial crisis. He selected 79 groups of companies matched by industry and assets for research and forecasting and finally identified the two indicators that best discriminate companies' financial condition. Altman, a professor at New York University in the United States, was the first to adopt multiple linear methods to predict financial crises. Starting from 22 financial indicators covering five aspects as candidate discriminant variables, he selected 33 bankrupt companies and 33 nonbankrupt companies as samples, applied multivariate statistical discriminant analysis, chose the five most useful indicators as discriminant variables, and on that basis established the multivariate linear Z-Score model. Since the establishment of the Z-Score model, financial statement standards have undergone many major changes. To adapt to the new standards, Altman improved the Z-Score model in 1977 and proposed the second-generation Zeta model. The improved model can predict companies that may experience financial crises within five years [11][12][13][14][15][16].
C. E. Bonafede used Bayesian networks to analyze risk assessments from different experts together with data obtained by scoring methods; the approach proved effective in analyzing enterprise risks and can be used to evaluate them. Halina Frydman used a Markov hybrid model to give early warning of the financial status of enterprises. Her research shows that macroeconomic cycles and industrial economic cycles have a good early warning effect on corporate financial risks. Joseph P. Herbert and others used rough set theory to find the relationship between the financial indicators of an enterprise and its financial risk, so as to predict that risk. Hyunchul Ahn et al. used the idea of genetics to predict the bankruptcy of enterprises. Marco Van der Burgt used wavelet analysis for financial risk early warning. His research shows that business cycles have a good early warning effect on corporate financial risks. Patrick L. Brockett used risk-driven optimization algorithms to analyze corporate capital structure and to manage and evaluate the risk of corporate business activities [17][18][19][20][21].
With the continuous development of computer technology and information technology, the concept of data mining was proposed in 1995 at the American Computer Conference (ACM), as shown in Figure 1. As a brand-new commercial information processing technology, data mining can extract implicit, previously unknown, and potentially useful information from databases. Common data mining techniques can be divided into statistical analysis and knowledge discovery. Statistical analysis data mining techniques mainly include regression analysis, cluster analysis, and correlation analysis. Knowledge discovery data mining technologies mainly include neural networks, decision trees, expert systems, rough set theory, and genetic algorithms. Since the 1980s, Western researchers have used knowledge discovery data mining techniques such as artificial neural networks, expert systems, and genetic algorithms to conduct early warning research on financial risks. The application of these methods in corporate financial analysis and rating shows their advantages and potential: they place no restriction on the sample distribution, apply to a wide range of problems, and offer a distinctive way of solving problems. Therefore, the use of knowledge discovery data mining technology is an important development trend in research methods for financial risk early warning. Lapedes and Farber used the neural network model for the first time to predict and analyze bank credit risk. Wilson and Sharda used a neural network system to predict company bankruptcy with an accuracy rate of 97% and demonstrated that a neural network-based system is more advantageous than the discriminant analysis method when predicting company bankruptcy [22][23][24][25][26].
The research of Sung, Chang, and Lee shows that a model based on the inductive rule method can predict bankruptcy more accurately than multiple discriminant analysis models under a changing economic environment. Myoung-Jong Kim and Ingoo Han compared and analyzed the advantages of using genetic algorithms in bankruptcy prediction. In addition to the above methods, Frydman used decision tree theory; Elmer and Borowski used expert systems; Dimitras used rough set theory; and Zopounidis used multicriteria decision theory to construct financial risk early warning models. Feng Yu Lin and Sally McClean used data mining methods for early warning of company financial risks. The verification results show that, under the same conditions, the mixed model performs better [27][28][29][30][31][32][33].
Traditional statistical analysis can no longer cope with massive financial data. Therefore, how to extract information useful for financial risk early warning from the large amount of financial data generated by business activities is a problem that enterprises urgently need to solve. The continuous improvement and innovation of data mining technology and its good performance in analyzing massive data have linked the two closely. Therefore, this study uses data mining technology to research and analyze corporate financial risk early warning, which has important theoretical and practical significance.

Financial Risk Analysis and Early Warning Theory
Financial risk analysis and early warning is an early warning mechanism and a real-time management control method established to prevent business operations from deviating from their normal trajectory. It consists of a series of management and control methods that enterprises establish in advance in order to respond to various financial risk conditions. Financial risks exist objectively and are predictable within a certain range, so companies can use early warning models to anticipate financial risks and the losses and impacts they would cause, and can prepare adequately to deal with them. Enterprise financial risk analysis and early warning covers the analysis, early warning, and standardized management of financial risks and mainly comprises four components: monitoring, identification, diagnosis, and evaluation.
(1) Monitoring. Through the establishment of enterprise monitoring information data, the financial activities of the enterprise are monitored using a scientific and reasonable indicator system and processed in accordance with the principles of standardization, scientific rigor, and proceduralization.
(2) Identification. Through analysis of the monitoring data, the enterprise determines in which link of its financial activities a problem has occurred or is about to occur.
(3) Diagnosis. Financial diagnosis mainly analyzes the monitoring data and uses enterprise diagnosis technology and financial analysis methods to make scientific judgments on whether the financial activities of the enterprise are good or bad.
(4) Evaluation. Financial activities diagnosed as being in poor condition are comprehensively assessed, and the losses they will bring to the enterprise and to society are analyzed.
Financial risk analysis and early warning methods can be divided into two categories: qualitative and quantitative.
(1) Qualitative analysis method is necessary.
(2) Standardized survey method: this method assumes that the problems raised by enterprises are universally applicable to all enterprises. Experts in the field comprehensively research and identify the problems encountered by the enterprise and produce a detailed report for the relevant personnel of the enterprise. However, for some specific enterprises or special problems, the report cannot give a comprehensive explanation, which exposes the limitations of this method.
(3) "Four-stage symptom" analysis method: a poor state of corporate financial activities leads to a crisis. This is a gradual process with specific symptoms, which can generally be divided into four stages; the status of each stage is shown in Figure 2. If an enterprise finds that similar symptoms appear, it should promptly investigate the problem and take corresponding measures to resolve the crisis, so that its financial activities return to normal.
(4) "Three-month capital turnover table" analysis method: this method judges the financial risk status of an enterprise from whether it can draw up a three-month capital turnover table. First of all, if the company cannot formulate the table, this itself indicates that the company has problems. The underlying idea is that when an enterprise's operating income is increasing, its disposable cash flow is increasing and its capital turnover is ample; otherwise, it is difficult for the enterprise to maintain normal operations.
(5) Management scoring method: this method treats a crisis in corporate financial activities as the result of a series of behaviors. Unsuitable financial operations will cause a financial crisis or even more serious financial problems. The theory divides the risk factors into three categories, namely, operating defects, operating errors, and bankruptcy symptoms. Because it relies excessively on executive judgment, the management scoring method still has some shortcomings.
The quantitative analysis methods can be further divided into univariate analysis and multivariate analysis. Univariate analysis uses a single variable, that is, one particular financial index, to predict the financial risk of a company. According to this method, when several financial indicators involved in a company's financial risk early warning model gradually deteriorate, it is usually a signal that the company is about to have a financial crisis. According to their predictive ability, the ratios used for financial crisis prediction can be divided into three systems of indicators. With these three indicator systems, financial indicator data are used to evaluate the company's own condition from time to time. If one or more indicators of an enterprise change markedly, the cause should be identified as soon as possible and measures taken to minimize the financial crisis.
The multivariate analysis predicts whether a company's financial status is in crisis from an overall, macro perspective. This analysis method studies multiple financial indicators, uses ratios and formula calculation results to measure the financial status of the enterprise, and predicts whether the enterprise has or will have a financial crisis.
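As an illustration of the multivariate approach, the sketch below computes the Z-Score mentioned in the Introduction. The coefficients and the cut-off values 1.81 and 2.99 are those commonly quoted for Altman's original model for listed manufacturing firms; they are not taken from this study, and the function names, argument names, and example figures are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Z-Score in its commonly quoted form (ratios as decimals, not percentages)."""
    x1 = working_capital / total_assets          # liquidity
    x2 = retained_earnings / total_assets        # accumulated profitability
    x3 = ebit / total_assets                     # operating profitability
    x4 = market_value_equity / total_liabilities # leverage
    x5 = sales / total_assets                    # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Map a Z-Score to the commonly quoted zones (assumed cut-offs)."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical company, figures in the same currency unit.
example_z = altman_z(working_capital=20, retained_earnings=30, ebit=10,
                     market_value_equity=80, sales=120,
                     total_assets=100, total_liabilities=50)
example_zone = z_zone(example_z)
```

The point of the sketch is only that a multivariate method combines several ratios into one score and compares it with thresholds, in contrast to the single-ratio judgment of univariate analysis.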

Data Mining Technology
The emergence and development of data mining technology allow people to use data to mine useful and hidden business and scientific information. Since data mining is a multidisciplinary research field, its concept contains rich connotations. This study gives the definition of data mining from a technical point of view and a business point of view.
The purpose of financial risk early warning for listed companies is to automatically derive a model of the relationship between data and financial risk from historical financial records, so as to give early warning of the future financial risks of listed companies. The early warning results fall into two classes, financial crisis companies and financially normal companies, which makes this a data mining classification problem. Therefore, establishing a financial risk early warning model requires selecting data mining classification techniques. Currently, the data mining techniques used in classification and prediction mainly include the following:
(1) The classification techniques in statistical methods mainly include regression analysis and discriminant analysis. Commonly used regression analysis can be divided into linear regression, logistic regression, and probit regression models according to the relationship between variables; the latter two are nonlinear regression models. Discriminant analysis is divided according to the criterion used to establish the discriminant function and mainly includes distance discrimination, Fisher discrimination, and Bayes discrimination.
(2) The classification techniques in the knowledge discovery category mainly include neural networks, decision trees, association rules, rough sets, genetic algorithms, and so on. Decision trees use entropy from information theory to select the attributes on which data are split; neural networks build network models to classify and predict data; genetic algorithms use the idea of biological evolution to search for an optimal classification; association rule classification is a newer data mining classification method built around the Apriori algorithm; and rough sets use rough set theory to classify data with discrete attribute values.
The various classification data mining technologies each have their own advantages and disadvantages. When choosing an algorithm, it is necessary to make the most of its strengths and avoid its weaknesses.
Therefore, this study intends to use a variety of data mining techniques to improve the efficiency of data mining.
The data mining process is shown in Figure 3.
The result interpretation stage interprets and evaluates the patterns and knowledge discovered by data mining according to the purpose and needs of the end user, extracts the most valuable information, and assimilates it into knowledge; specifically, it eliminates irrelevant and redundant patterns and filters the information to be presented to the user. Visualization technology is used to transform meaningful patterns into graphical or logical form and to convert them into language the user can understand.
The accuracy of prediction is still satisfactory, but the requirements for users are very high. With the continuous enhancement of computing power, some emerging technologies, such as neural networks and decision trees, have also achieved good results in the field of knowledge discovery. With enough data and computing power, they can complete many valuable functions almost automatically, without human attention. Data mining is an application of statistics and artificial intelligence technology. It encapsulates these sophisticated technologies so that people can perform the same functions without mastering them and can focus more on the problems they want to solve.

Construction of Early Warning Model of Enterprise Financial Risk
The design of the early warning model research process in this study is shown in Figure 4. The process includes the following steps:
(1) Selection of sample enterprises. Sample enterprises are selected according to certain criteria, that is, companies that have experienced a financial crisis and matching financially normal companies.
(2) Construction of the financial risk early warning indicator system. The main purpose is to determine the financial indicators that have an early warning effect on the financial risks of listed companies and to obtain financial indicator data for the 2-4 years before the financial crisis.
(3) t-test of the selected financial indicators. Through t-tests and mean comparison, the differences in means between ST companies in financial crisis and financially normal non-ST companies are analyzed, the causes of financial risk in ST companies are explored, and the indicators with significant differences are screened out as input variables of the early warning model.
(4) Factor analysis. The factor analysis model is used to further filter and reduce the input variables of the model. The purpose is to eliminate multicollinearity between the variables and improve the prediction accuracy of the model.
(5) Model building. The filtered financial indicator data are used to establish the logistic regression model, the neural network model, and the C5.0 decision tree model, respectively.
(6) Model evaluation. Model evaluation mainly assesses the predictive effect of the model. In this study, the empirical test results of the three early warning models are compared and analyzed using test samples. Numerical and graphical evaluation standards are used to make the evaluation more reasonable and objective.
The specific realization process of the model will be elaborated in the next chapter of empirical research. Figure 5 shows a line chart of the running time of the two algorithms on the financial index data.
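The indicator screening in step (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: it uses an independent-samples (Welch) t statistic, whereas a matched-pair design might call for a paired test, and it screens on a hypothetical threshold |t| > 2.0 instead of an exact p-value. All names and figures are invented for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


def screen_indicators(st_data, normal_data, t_threshold=2.0):
    """Keep indicators whose group means differ markedly (assumed threshold)."""
    selected = []
    for name in st_data:
        t_value = welch_t(st_data[name], normal_data[name])
        if abs(t_value) > t_threshold:
            selected.append(name)
    return selected


# Hypothetical indicator values for four ST and four matched non-ST companies.
st_companies = {
    "current_ratio": [0.6, 0.8, 0.7, 0.5],
    "inventory_turnover": [1.0, 1.2, 0.9, 1.1],
}
normal_companies = {
    "current_ratio": [1.5, 1.6, 1.4, 1.7],
    "inventory_turnover": [1.1, 0.9, 1.0, 1.2],
}
selected_indicators = screen_indicators(st_companies, normal_companies)
```

In this toy data only the current ratio separates the two groups, so it alone would be passed on to the factor analysis step.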

Factor Analysis.
The basic principle of factor analysis is to use a linear transformation to form a new set of mutually uncorrelated comprehensive indicators that replace the original indicators, so as to avoid collinearity among variables without losing the main information and to make the problem more convenient for further analysis. It is a generalization and extension of principal component analysis, and it is also an important method of dimensionality reduction. The factor analysis model can be expressed as follows: It is written in matrix form as follows: Then, it becomes the following: If the components of Z are not related to each other, a special form of factor analysis is formed, which is called principal component analysis. The mathematical model can be written as follows: It is written in matrix form as follows: The proportion of the total variance accounted for by the principal components decreases successively.
Theoretically speaking, m = p, that is, there are as many principal components as there are variables.
... crisis at Y = 0, and P_1 = 1 - P represents the probability of a listed company not having a financial crisis at Y = 1. The comparison of the three methods is shown in Figure 6.
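The two properties stated in this section, that there are as many principal components as variables and that their shares of the total variance decrease successively, can be checked numerically. The sketch below is an assumption-laden illustration of principal component analysis only (not of a full factor analysis with rotation): it extracts the eigenvalues of the correlation matrix by power iteration with deflation, using the standard library alone. The function names and the data are hypothetical.

```python
import math


def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (observations x variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def principal_components(matrix, n_iter=1000):
    """Eigenvalues and eigenvectors, largest first, by power iteration and deflation."""
    p = len(matrix)
    work = [row[:] for row in matrix]
    eigenvalues, eigenvectors = [], []
    for _ in range(p):
        v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start vector
        for _ in range(n_iter):
            w = [sum(work[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged direction.
        lam = sum(v[a] * work[a][b] * v[b] for a in range(p) for b in range(p))
        eigenvalues.append(lam)
        eigenvectors.append(v)
        # Deflation: remove the component just found before the next pass.
        work = [[work[a][b] - lam * v[a] * v[b] for b in range(p)]
                for a in range(p)]
    return eigenvalues, eigenvectors


# Hypothetical data: six companies, three financial indicators.
example_rows = [
    [2.0, 1.0, 0.5], [3.0, 2.5, 0.1], [4.0, 2.0, 0.9],
    [5.0, 4.5, 0.4], [6.0, 4.0, 1.2], [7.0, 6.5, 0.3],
]
eigenvalues, components = principal_components(correlation_matrix(example_rows))
explained = [lam / sum(eigenvalues) for lam in eigenvalues]
```

Here `explained` holds the share of total variance carried by each component; in practice only the first few components, which carry most of the variance, would be kept as model inputs.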

Logistic Regression
Then, x_k is the independent variable, β_k is the regression coefficient corresponding to that independent variable, and α is the intercept. The intercept and regression coefficients are estimated by the maximum likelihood method.
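To make the roles of α, β_k, and maximum likelihood concrete, the following sketch fits a logistic model by plain gradient ascent on the log-likelihood. It assumes the usual logistic form P = 1 / (1 + exp(-(α + Σ β_k x_k))) and, purely for illustration, codes Y = 1 as a company in financial crisis; the learning rate, iteration count, function names, and data are all hypothetical, and statistical packages typically use faster Newton-type solvers.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(alpha, beta, x):
    """P(Y = 1 | x) under the logistic model."""
    return sigmoid(alpha + sum(b * xk for b, xk in zip(beta, x)))


def log_likelihood(alpha, beta, X, y):
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(alpha, beta, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(X, y, learning_rate=0.5, n_iter=5000):
    """Maximum likelihood estimates of alpha and beta by gradient ascent."""
    n, k = len(X), len(X[0])
    alpha, beta = 0.0, [0.0] * k
    for _ in range(n_iter):
        grad_alpha, grad_beta = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            error = yi - predict_proba(alpha, beta, xi)
            grad_alpha += error
            for j in range(k):
                grad_beta[j] += error * xi[j]
        alpha += learning_rate * grad_alpha / n
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad_beta)]
    return alpha, beta


# Hypothetical data: one indicator (an asset-liability ratio) per company.
X_example = [[0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9]]
y_example = [0, 0, 0, 1, 0, 1, 1, 1]
alpha_hat, beta_hat = fit_logistic(X_example, y_example)
```

A company would then be flagged when `predict_proba(alpha_hat, beta_hat, x)` exceeds a chosen cut-off such as 0.5, which is itself a modelling choice.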
The leaf node gives the conclusion. Decision tree construction can be carried out in two steps, namely, decision tree generation and decision tree pruning. Common decision tree algorithms include ID3, C4.5, and C5.0. Since Clementine provides C5.0 as its algorithm for decision tree modeling, the process of generating the C5.0 decision tree is introduced below.
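As a rough illustration of how trees in the ID3/C4.5 family choose a splitting attribute, the sketch below computes category entropy, conditional entropy, information gain, and gain ratio for discrete attributes. It follows the textbook definitions used by C4.5; the exact internals of C5.0 are not reproduced here, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(values, labels):
    """Entropy of the labels after splitting on the attribute values."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(values, labels):
    """Reduction in label entropy achieved by the split (mutual information)."""
    return entropy(labels) - conditional_entropy(values, labels)


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy(values)
    if split_info == 0.0:
        return 0.0
    return information_gain(values, labels) / split_info


# Toy sample: three ST and three normal companies, two discretised indicators.
status = ["ST", "ST", "ST", "normal", "normal", "normal"]
profit_level = ["low", "low", "low", "high", "high", "high"]
board_size = ["small", "large", "small", "large", "small", "large"]
best_attribute = max(
    {"profit_level": profit_level, "board_size": board_size}.items(),
    key=lambda item: gain_ratio(item[1], status),
)[0]
```

The attribute with the highest gain ratio is chosen for the split, and the procedure is repeated on each resulting subset until the leaves are sufficiently pure.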
The forecast accuracy of the three models is compared in Figure 7.
The methods for dividing decision tree attributes rely on the following quantities: the information entropy of the category; the category conditional entropy, obtained after dividing the set T according to the attribute V; the information gain (Gain), namely, the mutual information; and the information gain rate.
Based on the above principles, this study selects 24 indicators that reflect corporate profitability, growth ability, debt solvency, asset management ability, and cash flow status to construct an early warning indicator system.
(1) Indicators reflecting the profitability of a company. Profitability is the ability of a company to increase its capital. It is a prerequisite for the survival and development of a listed company and a comprehensive manifestation of its financial structure and operating performance. If the company's profitability is stable, it will have enough surplus to face various possible financial risks, and the possibility of a financial crisis will be lower. Therefore, financial indicators related to the profitability of listed companies are also used as important indicators for establishing financial risk early warning models. This study selects five indicators, namely, earnings per share, return on net assets, return on total assets, net profit rate, and main business profit rate, to reflect the profitability of listed companies. Different data are compared in Figure 8.
(2) Indicators reflecting the growth ability of an enterprise. Financial indicators that reflect a company's ability to develop are also among the important indicators for investigating the financial risks of listed companies. This study selects five indicators, namely, net asset growth rate, total asset growth rate, operating profit growth rate, main business income growth rate, and net profit growth rate, to reflect the growth ability of listed companies.
(3) Indicators reflecting the solvency of an enterprise. Debt solvency is the ability of an enterprise to repay its debts as they fall due. Regardless of the level of debt, repaying debts at maturity is the basic prerequisite for the normal operation of listed companies. Once the solvency of a listed company drops significantly, or the company even becomes insolvent, financial risks or even bankruptcy may follow. Therefore, financial indicators related to solvency are often used as important indicators to examine the financial risks of listed companies. This study selects five indicators, namely, current ratio, quick ratio, shareholder's equity ratio, equity ratio, and asset-liability ratio, to reflect the solvency of listed companies. The predicted value is shown in Figure 9.
(4) Indicators reflecting corporate asset management capability. Asset management capability reflects how well corporate assets are utilized, and it is closely related to the financial status of the enterprise. Listed companies with good business performance should have better asset management capabilities. Therefore, financial indicators that reflect asset management capabilities are also important indicators for investigating the financial risks of listed companies. This study selects four indicators, namely, inventory turnover rate, accounts receivable turnover rate, total asset turnover rate, and fixed asset turnover rate, to reflect the company's asset management capabilities.
(5) Indicators reflecting corporate cash flow. There are few domestic research results on the use of cash flow indicators for financial risk early warning. However, the cash flow status of listed companies has received more and more attention, and poor cash flow is one of the important causes of corporate financial risk. Therefore, this study adds cash flow indicators to the selection. It selects five indicators, namely, cash flow debt ratio, cash debt ratio, cash flow from operating activities per share, cash sales ratio, and cash recovery rate of total assets, to reflect the cash flow status of listed companies.
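As a small worked example of how such indicators are derived from statement items, the sketch below computes the five solvency indicators named in item (3). The formulas are common textbook definitions and are an assumption here, since this section does not state the exact definitions used; in particular, "equity ratio" is taken to mean total liabilities divided by shareholders' equity. The function name and the figures are hypothetical.

```python
def solvency_ratios(current_assets, inventory, current_liabilities,
                    total_assets, total_liabilities):
    """Five solvency indicators under assumed textbook definitions."""
    equity = total_assets - total_liabilities
    return {
        "current_ratio": current_assets / current_liabilities,
        "quick_ratio": (current_assets - inventory) / current_liabilities,
        "shareholders_equity_ratio": equity / total_assets,
        "equity_ratio": total_liabilities / equity,
        "asset_liability_ratio": total_liabilities / total_assets,
    }


# Hypothetical balance sheet items, all in the same currency unit.
example_ratios = solvency_ratios(current_assets=500, inventory=200,
                                 current_liabilities=250,
                                 total_assets=1000, total_liabilities=400)
```

Each company-year in the sample would yield one such set of values per indicator category, giving the 24-column input table on which the screening and modeling steps operate.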