Analysis of Financial Risk Early Warning Systems of High-Tech Enterprises under Big Data Framework

With the further development of China's market economy, the competition that companies face has become more intense, and many companies have difficulty coping with pressure and risk. Among the many types of enterprises, high-tech enterprises are the riskiest. The emergence of big data technologies and concepts in recent years has provided new opportunities for financial crisis early warning. Through in-depth study of the theoretical feasibility and practical value of big data indicators, using such indicators to develop an early warning system for financial crises has important theoretical value for breaking through the current stagnation in financial crisis early warning research. Against this background, this research focuses on the influence of big data on the financial crisis early warning model, selects and quantifies the big data indicators and financial indicators, designs the financial crisis early warning model, and verifies its accuracy. The specific research design includes the following: (1) We make preliminary preparations for model construction. Training samples and early warning indicators are preliminarily determined and screened, the samples needed to build the model and the early warning indicator system are fixed, and the principles of the modeling methods are briefly described. We then perform a significance analysis of the financial indicators and screen out the early warning indicators that clearly distinguish financial crisis companies from nonfinancial crisis companies. (2) We analyze the sentiment tendency of stock bar comment data to obtain big data indicators. Then, we establish a logistic model based on financial indicators alone and a logistic model that introduces big data indicators. Finally, the two models are tested and compared, the change in early warning effect before and after the introduction of big data indicators is analyzed, and the optimization effect of big data indicators on financial crisis early warning is tested.


Introduction
With the further development of China's market economy, the competition faced by companies in the market has become more intense, and many companies have difficulty facing pressure and risks. If a corporate crisis fails to be effectively controlled and adjusted, it will not only affect a single company, but also affect other players in the market, such as creditors, investors, securities, and the banking system. Therefore, the financial crisis of enterprises is a problem that affects the stability of the capital market [1,2].
Among the many types of enterprises, high-tech enterprises [3,4] are the riskiest. One of the characteristics of this type of enterprise is that high risks bring high returns. In international competition, China's high-tech enterprises have to face the impact of foreign enterprises' new technologies and strong capital, and their instability is greater than that of ordinary traditional enterprises. However, high-tech enterprises form an industry that any country must vigorously promote and support. This industry represents more advanced technology and development trends and is more in line with the themes of the times, while traditional enterprises with high pollution and high energy consumption will gradually be squeezed out of the competition [5,6].
Based on a review and summary of the theoretical methods of financial crisis early warning systems both domestically and internationally [7][8][9], this paper puts forward the reasons for the financial crises of high-tech enterprises in the Chinese market and then establishes an early warning model for this type of enterprise based on the logistic regression method. The significance of this article's topic selection is as follows: (1) Through modeling and analysis, we put forward the research ideas for establishing a model of China's high-tech enterprises. Foreign research on financial risk early warning is extensive and mature, but it is difficult to directly apply foreign early warning mechanisms to the Chinese market, because the economic environment and the companies themselves are different. There is also domestic research in this field, but this article focuses on high-tech enterprises in order to obtain a more targeted financial early warning system and operating mechanism. (2) We establish a targeted early warning model to improve the ability of high-tech enterprises to deal with financial crises. Under the market economy [10], business operations are full of risks. While enjoying the high returns brought by risk, companies are also exposed to the corresponding crises. Securing a place in market competition without falling into financial distress requires a practical crisis early warning system. The financial crisis early warning system not only gives signals to companies before a crisis but also helps companies strengthen their own management and operations, actively discover abnormal financial phenomena in daily operations, and propose solutions to deal with risks or crises. (3) It is conducive to the supervision and management of enterprises by relevant state departments and organizations. 
First of all, a large number of high-tech enterprises are state-owned enterprises. These enterprises use state-owned capital, so the state as the owner has the right to supervise and manage their operating status and results. The financial early warning system is an effective tool that gives enterprises the pressure and sense of mission to maintain and increase the value of state-owned capital. Secondly, the early warning system is also a new means of supervision for the China Securities Regulatory Commission (CSRC). In the current securities market, one of the evaluation and management methods for listed companies [11,12] is to carry out special treatment (such as ST); the CSRC could also use the financial early warning model as a company evaluation standard to measure the crises and risks threatening a company.
Key contributions of the proposed study include the following:
(i) An accurate early financial risk warning system is proposed for high-tech enterprises.
(ii) The key influential big data indicators are identified for accurate modeling.
(iii) Regression modeling of the key indicators is used.
The rest of the paper is structured as follows: Section 2 describes the related work in the subject area. Section 3 describes the proposed methodology adopted for achieving the objectives. Section 4 validates the proposed methodology using experimentation. Section 5 summarizes the work done.

Related Work
The research of big data in the field of financial crisis warning originated from the research of big data in the financial market. Extensive earlier research in this area focused on the impact of big data on the financial market. The big data generated by network information can be quantified to predict financial indicators. The large number of studies on big data in the financial field has aroused some scholars' interest, but comparatively little research worldwide has addressed big data in financial crisis early warning specifically, although there have been studies on the feasibility of introducing big data into this field. From the perspective of the application of big data in the financial market, existing research generally collects information on network platforms and discovers the emotional tendencies and development trends of netizens from the information. Ranco et al.'s research shows that many seemingly irregular phenomena are displayed on the Internet platform, and through collection and analysis, it is found that a large number of Internet users' interactions are closely related to changes in the financial market. The research verifies that sentiment analysis of Twitter platform information can accurately determine the accumulation [16]. Based on this, Zheludev et al. also analyzed the Twitter platform information statistics and reached new conclusions. The indicators derived from the sentiment judgment of netizens' comments improved the prediction accuracy of the Standard and Poor's 500 [17]. Zhang and others (from MIT) also selected the most active Twitter platform as the indicator collection source. After judging the sentiment index of netizens, they found that a rise in negative sentiment would lead to a decline in the Dow Jones Index, S&P 500 Index, and Nasdaq Index [18]. Tobias Preis et al. used the search behavior of netizens to predict the volume of investors' buying and selling. The method adopted was to count the retrieval volume of corresponding words in a certain period of time, which is also a type of big data indicator application [19]. Domestic scholars have done similar research in this area. Li Jinhai and others collect online comment information, analyze the emotional tendencies of netizens through text mining, and design an early warning system for corporate Internet word-of-mouth to help companies understand the current status and future trends of online public opinion [20,21]. In summary, researchers attach great importance to the impact of big data in the financial field. Bian Hairong believes that online information big data affects investor sentiment and is reflected in stock transactions because the information it contains is timely, comprehensive, and diverse [22,23]. Network information big data contains many sources of information, including reports and data disclosed by listed companies themselves; authoritative media reports on listed companies; and opinions, evaluations, and trend forecasts from the media and industry experts. Among the many network platforms, the stock bar platform is the most active and valuable for research. The stock bar platform is an important medium for information transmission between companies, media, institutional experts, and investors. The big data it generates is extremely valuable for studying changes in the stock price and financial status of companies. While the impact of big data on financial position has been studied, there is a lack of evidence to support it. Whether it can help improve the early warning effect is a very valuable research topic [24]. Therefore, from the standpoint of feasibility, the causes of a crisis are inseparable from the interactions of corporate stakeholders on various channels or platforms. The traces of people's behavior on the Internet will continue to form big data that is of great value to academic research. 
The financial status of an enterprise can be reflected through network information. The stakeholders of an enterprise express their opinions through the Internet, and the emotional tendency formed by a large number of opinions represents the management and financial status of the enterprise. Through related sociological analysis, it can be considered that network information big data has strong objectivity. The results obtained from aggregate statistical analysis should reflect behavior that is not consciously directed. Such objective and scientific data can help financial crisis early warning. Big data is not only closely related to the financial status of enterprises; it is also quantified through computer natural language processing technology, so the results are more objective. Therefore, the indicators formed by the quantitative processing of big data can address the problem that past nonfinancial indicators were one-sided, subjective, and difficult to quantify, which is of great significance to the study of financial crisis early warning. In summary, the information on social platforms is closely related to social and economic activities. Big data on network information provides new ideas and methods for exploring the status and trends of the social economy. The introduction of big data can improve the selection of nonfinancial indicators. Although relevant research is still in its infancy, big data can make up for the limitations of nonfinancial indicators.
In terms of big data acquisition technology, network information data has various forms and large capacity. In the traditional mode, a search engine can significantly expand the collection's geographic reach, but the relevance and value of the information are low, and filtering information with low value density requires manual supervision and repeated screening, which is extremely inefficient. Scholar DuAn once pointed out that precise positioning of data can help to mine highly relevant and valuable data [25]. Therefore, intelligently locking onto the subject and then collecting is an effective way to obtain big data. Web crawlers can crawl relevant topic information in the network and can filter out irrelevant information according to requirements [26]. Compared with the traditional model, the web crawler covers the complete range of information, locks onto the target subject, and can collect and organize information while ensuring relevance and value, greatly improving efficiency. In terms of the quantitative analysis technology of big data, existing research mainly quantifies the big data generated by network information into two indicators: information volume and emotional tendency. The quantification of information volume is relatively simple: the number of texts of network information is counted. The analysis of emotional tendency is more difficult. The text collected on the Internet can generally be divided into three forms according to its content: vocabulary, sentence, and document. Sentiment tendency analysis requires different algorithms according to the form of the corpus. Big data of network information has various language styles, grammatical forms, and emotional characteristics. For this kind of corpus, scholar Tang Xiaobo pointed out that the research can be carried out by expanding the emotional dictionary. First, the emotional tendency of the words in the text is judged through dictionary comparison, and then the emotional tendency of the entire sentence or the entire document is calculated by the algorithm [27]. This paper studies the introduction of big data indicators into the field of financial crisis early warning and collects a large amount of relevant information on the Internet for early warning. Collection efficiency is an important prerequisite. With limited resources, it is necessary to obtain relevant information on corporate topics as efficiently as possible to meet the indicator requirements of the financial crisis early warning model, and web crawlers are the best choice for this. In addition, the collected network information spans three vocabularies: basic vocabulary, financial professional vocabulary, and network terms. Following previous research, the basic emotion dictionary is expanded to cover them.

Method
There are many existing research methods: univariate and multivariate discriminant models, multiple logistic regression models (logistic regression), and artificial neural network models. A comparison shows that although the univariate discriminant model is widely applicable, its accuracy is not high. The multivariate discriminant model requires its independent variables to be normally distributed, and the variables can be converted into categorical variables for statistical processing. Its accuracy is much higher than that of the univariate linear discriminant, but its scope of application is limited. The multiple logistic regression model does not require the independent variables to be normally distributed and has a wider range of applications. Its assumptions and preconditions are also looser than those of the multivariate discriminant model, and its accuracy is higher. However, some approximations are needed in the calculation process. The artificial neural network model is more complicated: its calculation is difficult and its stability remains to be improved. After comparison, this article chooses the more stable and rigorous logistic regression model as the modeling method. The basic form of the logistic regression model is as follows:

$$\ln\left(\frac{P}{1-P}\right) = a + \sum_{j=1}^{m} b_j x_j$$

In the model, $P$ is the probability of a company having a financial crisis under the variables $x = (x_1, x_2, \ldots, x_m)$, where $x_j$ is the $j$-th factor that affects financial risk prediction, and $a$ and $b_j$ ($j = 1, 2, 3, \ldots, m$) are estimated parameters. The basic model can be rearranged to give the probability of a financial crisis $P$:

$$P = \frac{1}{1 + e^{-\left(a + \sum_{j=1}^{m} b_j x_j\right)}}$$

This article selects $P = 0.5$ as the cutoff point of the model. When the calculated model value $P > 0.5$, the possibility of being specially treated is relatively high, and the company is considered to be a financial crisis company. The assumptions of the logistic regression model are as follows: (1) the dependent variable $y$ is a dichotomous variable; (2) sample selection is random; (3) $y$ is nonlinear with the independent variables $x_j$; (4) there is no collinearity between the independent variables.
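The probability formula and the 0.5 cutoff can be illustrated with a minimal Python sketch. It assumes the standard logistic form P = 1 / (1 + exp(-(a + sum of b_j x_j))); the function names and the coefficient values are hypothetical and are not taken from the fitted model in this article.

```python
import math

def crisis_probability(x, a, b):
    """Probability of financial crisis under the standard logistic form."""
    z = a + sum(bj * xj for bj, xj in zip(b, x))
    # Evaluate the sigmoid in a way that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def classify(x, a, b, cutoff=0.5):
    """Return 1 (financial crisis company) when P > cutoff, otherwise 0."""
    return 1 if crisis_probability(x, a, b) > cutoff else 0

# Hypothetical intercept and coefficients, for illustration only.
a = -0.4
b = [1.2, -0.8]

print(classify([2.0, 0.5], a, b))  # z = 1.6, so P > 0.5
print(classify([0.0, 1.0], a, b))  # z = -1.2, so P < 0.5
```

Because the rule is a strict inequality, a company with P exactly equal to 0.5 is treated as financially normal in this sketch.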
This article takes high-tech companies as the research object, uses the A-share listed companies of this type as the sample, and defines companies that receive ST or *ST treatment for the first time in the forecast year as financially distressed companies, while non-ST companies are defined as financially normal companies. This article first constructs a financial early warning indicator system by selecting financial indicators and then uses statistical methods to build a logistic regression financial risk early warning model. Finally, the accuracy of the model is tested.
This article studies the improvement of the financial crisis early warning model through the introduction of big data indicators. It is necessary to quantify the indicators reasonably and to analyze the forecast effect of the model after the new indicators are introduced. The specific research design includes 6 parts. First, select training samples. The training samples for establishing the model are selected from the listed high-tech companies in China's A-share market. The remaining listed companies in the high-tech industry are used as test samples after the model is established to test its prediction accuracy. Second, select early warning indicators. The early warning indicators studied in this paper include two parts: financial indicators and big data indicators. The selection of financial indicators refers to the existing literature and makes a preliminary selection from the financial indicator analysis database of listed companies in China, paying attention to the comprehensiveness and systematicness of the indicators. The big data indicators are selected on the basis of the above theoretical analysis and collected from a stock discussion community, paying attention to their relevance and availability. Third, select the model method. The research in this paper introduces an early warning indicator system and uses statistical analysis methods to determine whether a company is a financial crisis company or a normal company. This is a typical binary classification problem. Defining a critical cutoff point determines the type of company, which matches the research of this article. Therefore, the logistic model analysis method is selected to establish the early warning model. Fourth, process the financial indicators. SPSS software is used as the analysis tool to test the significance of the preliminarily selected financial indicators. After the SW normal distribution test, followed by either the T parameter test or the Mann-Whitney U nonparametric test, the main indicators are selected from 66 financial indicators as early warning variables that clearly distinguish the two sample groups of financial crisis companies and nonfinancial crisis companies. Fifth, process the big data indicators. Web crawler software is used to build a stock bar review corpus, Python is applied to analyze the sentiment orientation of the collected stock bar review information, and the sentiment orientation and opinion enthusiasm are extracted from millions of comments. Sixth, construct the financial crisis early warning model. A logistic regression model is constructed with the processed financial indicators as variables and recorded as the original model. Then, big data indicators and financial indicators are introduced together as variables to build a logistic regression model, which is recorded as the improved model. The two models are compared and analyzed to verify the optimization effect of the improved model. The overall research and design ideas are also shown in Figure 1.
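The sixth step, comparing an original model with an improved model, can be sketched as follows. This is only an illustration under stated assumptions: a plain gradient-descent fit stands in for the SPSS estimation used in this article, and the toy data (one hypothetical financial indicator and one hypothetical sentiment indicator for four companies) are invented for the example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit intercept a and coefficients b by batch gradient descent."""
    n, m = len(X), len(X[0])
    a, b = 0.0, [0.0] * m
    for _ in range(epochs):
        grad_a, grad_b = 0.0, [0.0] * m
        for xi, yi in zip(X, y):
            err = sigmoid(a + sum(bj * xj for bj, xj in zip(b, xi))) - yi
            grad_a += err
            for j in range(m):
                grad_b[j] += err * xi[j]
        a -= lr * grad_a / n
        b = [bj - lr * g / n for bj, g in zip(b, grad_b)]
    return a, b

def accuracy(X, y, a, b, cutoff=0.5):
    """Share of samples whose predicted class (P > cutoff) matches y."""
    hits = 0
    for xi, yi in zip(X, y):
        p = sigmoid(a + sum(bj * xj for bj, xj in zip(b, xi)))
        hits += int((1 if p > cutoff else 0) == yi)
    return hits / len(y)

# Toy data, for illustration only: 1 = ST company, 0 = non-ST company.
y = [0, 1, 0, 1]
X_financial = [[0.2], [0.4], [0.6], [0.8]]                    # original model
X_combined = [[0.2, 1.0], [0.4, -1.0], [0.6, 1.0], [0.8, -1.0]]  # plus sentiment

acc_original = accuracy(X_financial, y, *fit_logistic(X_financial, y))
acc_improved = accuracy(X_combined, y, *fit_logistic(X_combined, y))
print(acc_original, acc_improved)
```

In this toy setting the financial indicator alone cannot separate the two groups, while the added sentiment indicator can, which mirrors the kind of comparison the two models are designed to make.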

Normal Distribution Test.
In this paper, SPSS software is used for data processing. The Shapiro-Wilk method (SW test) in the software can judge whether data follow a normal distribution and is suitable for samples that are not large. In this article it is used to judge the normality of the 66 financial indicators. The basic principles are as follows: first, arrange index data with normal distribution characteristics in order of category, and calculate the theoretical cumulative probability of each layer of the distribution; second, arrange the index data to be tested in order of category, and obtain the empirical cumulative probability; third, compare the two sets of data, calculate the maximum deviation value, and confirm whether the data belong to the normal distribution according to the confidence level. Under the null hypothesis that all indicators obey a normal distribution, the significance level is set at 0.05, and the SW value and Sig value of all indicators are obtained by running the software. When the Sig value is less than 0.05, the null hypothesis is rejected and the indicator is considered to have failed the normal distribution test. Conversely, if it is greater than 0.05, the indicator is considered to pass the normal distribution test. From the analysis and test results of the 66 financial indicators, we draw the following conclusions. First, using the significance level of 0.05 as the standard, a total of 12 financial indicators (X2, X3, X5, X20, X28, X32, X34, X35, X36, X38, X52, X56) have Sig values greater than 0.05; these 12 indicators pass the test and generally conform to the normal distribution, so the T parameter test method can be used for their significance test. Second, with the significance level of 0.05 as the standard, the remaining 54 financial indicators, such as X1, X4, X6, X7, X8, X9, X10, X11, X12, have Sig values less than 0.05 and fail the test, which means that they do not obey the normal distribution, and the Mann-Whitney U nonparametric test method should be used for their significance test.
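The routing rule described above (normal indicators go to the T parameter test, the others to the Mann-Whitney U test) can be sketched in a few lines of Python. The Sig values below are hypothetical and are not the results reported in this article; treating a Sig value of exactly 0.05 as passing is also an assumption, since the text only specifies the cases below and above 0.05.

```python
def split_by_normality(sw_sig, alpha=0.05):
    """Split indicators into (normal, non_normal) by Shapiro-Wilk Sig value."""
    normal = sorted(k for k, p in sw_sig.items() if p >= alpha)
    non_normal = sorted(k for k, p in sw_sig.items() if p < alpha)
    return normal, non_normal

# Hypothetical Sig values, for illustration only.
sw_sig = {"X2": 0.21, "X3": 0.09, "X1": 0.001, "X4": 0.03}

t_test_candidates, mann_whitney_candidates = split_by_normality(sw_sig)
print(t_test_candidates)        # ['X2', 'X3']
print(mann_whitney_candidates)  # ['X1', 'X4']
```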

T Parameter Inspection.
The indicators that pass the normal distribution test are tested for significance using the T parameter test. From the results of the previous section, 12 indicators suitable for the T-test are obtained. Testing these indicators identifies those that clearly distinguish nonfinancial crisis samples from financial crisis samples. First, the null hypothesis is established that the indicator mean values of the two samples of financial crisis companies and nonfinancial crisis companies are not significantly different from one another, and the significance level is set at 0.05. After the software is run, the T value and Sig value are collected. If the Sig value is less than 0.05, the null hypothesis is rejected, and the indicator differs clearly between the sample groups. If the Sig value is greater than or equal to 0.05, the null hypothesis is accepted, and the indicator shows no obvious difference between the sample groups. Insignificant indicators are not suitable for judging the financial crisis and should be removed. The test results are shown in Table 1. The analysis and test results lead to the following conclusions. First, with the significance level of 0.05 as the standard, the test values of X2, X3, X5, X32, X34, X35, a total of 6 indicators, are below the judgment standard value (<0.05). These 6 indicators pass the test, which shows that they can fully distinguish the two types of samples and can be used in the model. Second, taking the significance level of 0.05 as the standard, the test values of the remaining 6 indicators, X20, X28, X36, X38, X52, X56, are above the judgment standard value (≥0.05). These 6 indicators do not pass the test; they are not sufficient to distinguish the two types of samples and cannot be used in the model.
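As a rough illustration of the statistic behind this test, the sketch below computes the pooled-variance two-sample t statistic and its degrees of freedom using only the Python standard library. It assumes equal variances in the two groups; the Sig value that SPSS reports would come from the t distribution with these degrees of freedom and is not computed here. The sample values are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a two-sample pooled-variance t-test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    sp2 = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    se = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return (mean(group_a) - mean(group_b)) / se, df

# Hypothetical values of one indicator in the two sample groups.
crisis = [1.0, 2.0, 3.0]
normal = [4.0, 5.0, 6.0]

t_value, df = pooled_t_statistic(crisis, normal)
print(round(t_value, 4), df)  # -3.6742 4
```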

Mann-Whitney U Test.
For indicators that did not pass the normal distribution test, the Mann-Whitney U nonparametric analysis was used for the significance test. From the above processing results, 54 indicators suitable for nonparametric testing are obtained. Some of these indicators significantly affect the occurrence of corporate financial crises, while others have a less significant impact. To prevent insignificant indicators from weakening the early warning effect of the model, it is necessary to conduct a significance analysis on the two samples of financial crisis companies and nonfinancial crisis companies and eliminate insignificant indicators. First, the null hypothesis is made that there is no significant difference between the index mean values of the two samples, and the significance level is set at 0.05. After the software runs, the MW value and Sig value are collected. If the Sig value is less than 0.05, the null hypothesis is rejected and there is a significant difference in the indicator between the sample groups; if it is greater than or equal to 0.05, the null hypothesis is accepted and there is no significant difference in the indicator between the sample groups, so such indicators should be removed. Analyzing the test results, we draw the following conclusions. First, using the significance level of 0.05 as the standard, the test values of X1, X4, X7, X12, X14, X15, X16, X17, X18, X19, X21, X24, X25, X29, X37, X42, X45, X54, X55, X59, X60, X63, and others, a total of 24 indicators, are below the judgment standard value (<0.05). These 24 indicators pass the test, which shows that they can fully distinguish the two types of samples and can be used in the model. Second, taking the significance level of 0.05 as the standard, the test values of the remaining 32 indicators, such as X5, X6, X8, X9, X10, X11, X13, are above the judgment standard value (≥0.05). These 32 indicators have not passed the test, which shows that they cannot fully distinguish the two types of samples and cannot be used in the model.
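The Mann-Whitney U statistic itself is straightforward to compute from ranks. The following is a minimal standard-library sketch, assuming a two-sided test with the normal approximation and no tie or continuity correction; statistical packages such as SPSS may apply corrections or exact small-sample methods, so the p-values can differ. The sample values are hypothetical.

```python
import math

def rank_with_ties(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical values of one indicator in the two sample groups.
u_sep, p_sep = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
u_mix, p_mix = mann_whitney_u([1, 3, 5], [2, 4, 6])
print(u_sep, p_sep < 0.05)  # fully separated groups
print(u_mix, p_mix < 0.05)  # interleaved groups
```

With the 0.05 standard used in this article, the first hypothetical indicator would be kept and the second removed.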

Construction of Sentiment Dictionary for Big Data Indicators.
The dictionary is composed of the following parts: first, the basic dictionary is mainly based on HowNet's emotional dictionary; second, the domain dictionary is mainly based on securities operation vocabulary and a ready-made emotional dictionary for the financial field; third, the network term dictionary is mainly based on the BosonNLP sentiment dictionary and the SnowNLP sentiment dictionary, combined with other stock sentiment dictionaries, and is derived from the annotation analysis of millions of articles in forums, post bars, and online media, as well as online comments. Through the above expansion, an emotional dictionary that meets the needs of quantifying big data indicators is obtained. In addition to the basic negation adverbs and degree adverbs, the dictionary contains more than 500,000 emotional words. Examples of emotional words are shown in Table 2.
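How such a dictionary, together with negation and degree adverbs, can be turned into a sentence score is sketched below. This is an assumption-laden toy: the word lists, weights, and the rule that modifiers apply to the next sentiment word are illustrative choices, not the scoring rule used in this article, and English tokens stand in for Chinese comments that would first need word segmentation.

```python
# Tiny hypothetical dictionaries, for illustration only.
POSITIVE = {"rise": 1.0, "profit": 1.0, "strong": 1.0}
NEGATIVE = {"loss": -1.0, "fall": -1.0, "weak": -1.0}
NEGATIONS = {"not", "no"}
DEGREES = {"very": 2.0, "slightly": 0.5}

def sentence_score(tokens):
    """Sum word polarities, applying pending negation and degree modifiers."""
    score = 0.0
    sign, weight = 1.0, 1.0  # modifiers waiting for the next sentiment word
    for tok in tokens:
        if tok in NEGATIONS:
            sign = -sign
        elif tok in DEGREES:
            weight *= DEGREES[tok]
        else:
            polarity = POSITIVE.get(tok, NEGATIVE.get(tok))
            if polarity is not None:
                score += sign * weight * polarity
                sign, weight = 1.0, 1.0  # modifiers are consumed
    return score

print(sentence_score(["very", "strong", "profit"]))  # 3.0
print(sentence_score(["not", "strong"]))             # -1.0
print(sentence_score(["slightly", "weak"]))          # -0.5
```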

Big Data Indicator Processing Results.
Big data indicators come from the stock bar reviews and comprise two indicators: the sentiment value indicator and the information popularity indicator. Following the above analysis, this article considers the big data indicators of years T-3 and T-2 at the same time. Therefore, there are 4 subindicators, namely, the sentiment value index of year T-3, the information popularity index of year T-3, the sentiment value index of year T-2, and the information popularity index of year T-2. The calculation of an enterprise's information popularity index is relatively simple: it is the count of review posts in the year, with the logarithm taken because the raw values are too large. The quantitative processing of the corporate sentiment value indicator is more complicated and includes text segmentation, word frequency statistics, dictionary comparison, and sentiment judgment. Both indicators reflect, to varying degrees, the social network's attitude toward and evaluation of the overall status of the enterprise and reflect its management and financial status from another perspective, which is of great significance for supplementing the shortcomings of financial indicators. After the above analysis and processing, the two big data indicators are obtained as shown in Table 3.
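The four subindicators can be assembled as in the sketch below. Two points are assumptions made for illustration: the natural logarithm is used for the popularity index (the article does not state the base), and the annual sentiment value is taken as the mean of per-comment scores (the article does not specify the aggregation). The comment scores are hypothetical.

```python
import math

def popularity_index(comment_count):
    """Log-scaled count of review posts in the year (natural log assumed)."""
    return math.log(comment_count) if comment_count > 0 else 0.0

def sentiment_index(comment_scores):
    """Mean per-comment sentiment score for the year (aggregation assumed)."""
    return sum(comment_scores) / len(comment_scores) if comment_scores else 0.0

def big_data_indicators(scores_by_year):
    """Build the four subindicators from per-comment scores keyed by year."""
    out = {}
    for year in ("T-3", "T-2"):
        scores = scores_by_year.get(year, [])
        out[f"sentiment_{year}"] = sentiment_index(scores)
        out[f"popularity_{year}"] = popularity_index(len(scores))
    return out

# Hypothetical per-comment sentiment scores for one company.
indicators = big_data_indicators({
    "T-3": [1.0, -1.0, 2.0, 0.0],
    "T-2": [-1.0, -3.0],
})
print(indicators)
```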

Effect Test and Comparative Analysis of the Two Models.
The big data indicators obtained through the above quantitative processing can be used to improve financial crisis early warning. According to the analysis of the concept of big data indicators in Section 2, the big data indicators in this article are quantified from a large volume of stock bar comments, in which the behaviors and emotions of the enterprise's many stakeholders converge on a stable value through a large number of repeated interactions, reflecting the financial status and development trend of the enterprise. The core of financial crisis early warning is the early warning function. Whether big data indicators are a cause of financial crises is still uncertain, but they cover a wide range of information and act as symptoms that support the early warning of corporate financial crises. The comprehensive use of big data indicators and financial indicators can reflect the state of the enterprise to the greatest extent and can make up for and correct the shortcomings of using financial indicators alone. In the following, we use the logistic method to analyze the early warning effect of the model before and after the introduction of big data indicators on the test sample data. First, we establish a logistic financial crisis early warning model that uses financial indicators only, and then we integrate big data indicators and financial indicators to establish a logistic model that introduces big data indicators. Finally, the early warning effects of the two models are tested and compared to verify the role of big data indicators in financial crisis early warning.

The test sample contains 36 companies: 12 ST samples and 24 non-ST samples. The improved model that introduces big data indicators correctly judges 10 ST samples and 22 non-ST samples, so the accuracy rates for ST and non-ST companies are 83.33% and 91.67%, respectively, and the overall prediction accuracy rate is 88.89%. In summary, comparing the test results of the two models, the overall prediction accuracy of the improved model with big data indicators is 11.11 percentage points higher than that of the original model without big data indicators. For ST samples specifically, the model's prediction accuracy increased by 25 percentage points after the introduction of big data indicators. The test results show that the forecast accuracy of the financial crisis early warning model has indeed been improved by the introduction of big data indicators.
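The reported accuracy rates of the improved model follow directly from the counts given above (10 of 12 ST samples and 22 of 24 non-ST samples judged correctly). A short Python check, with a hypothetical helper name:

```python
def accuracy_report(st_total, st_correct, non_st_total, non_st_correct):
    """Return per-group and overall accuracy in percent, rounded to 2 places."""
    total = st_total + non_st_total
    correct = st_correct + non_st_correct
    return {
        "st": round(100 * st_correct / st_total, 2),
        "non_st": round(100 * non_st_correct / non_st_total, 2),
        "overall": round(100 * correct / total, 2),
    }

# Counts for the improved model, as stated in the text.
improved = accuracy_report(12, 10, 24, 22)
print(improved)  # {'st': 83.33, 'non_st': 91.67, 'overall': 88.89}
```

For scale, the reported gains of 11.11 and 25 are numerically equal to 4 of the 36 test samples and 3 of the 12 ST samples, respectively.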

Conclusion
With the rapid development of big data thinking and technology, big data has penetrated almost all fields, affecting the development of all walks of life, and has hence become a focus of attention in academia. This article builds on the industry effect of high-tech enterprises and the enthusiasm of netizens' exchanges, which together allow early warning indicators to reflect the situation of enterprises well. Therefore, we take China's high-tech listed companies as the training samples of the model, select and quantify the big data indicators and financial indicators, design the financial crisis early warning model, and verify its accuracy. The following are the main work and research conclusions of this article. First, the introduction of big data indicators is an effective way to improve the effect of financial crisis early warning. Financial indicator information is too narrow, and the existing nonfinancial indicators are too subjective, which seriously impairs the effect of early warning. The development of big data has enabled more and more information to be collected and processed. Existing results show that stakeholders' review information on a company has an important impact on the value of the company. The company is in an increasingly complex market environment and is involved in social activities. Financial crisis early warning applications should pay attention to the development and changes of network interactive information. It is difficult to obtain the complete and true financial situation of a company only from the information required by the regulatory authorities. The quantified big data indicator information can reflect the status of the company more fully and objectively, which is an important way to improve the effect of the financial early warning model. Second, we build a financial crisis early warning model that introduces big data indicators and find that the predictive effect of the financial crisis early warning model is improved after the introduction of big data indicators. We design enterprise-related topics to collect network information big data and judge the emotional tendency of the review text through the emotional dictionary to obtain the emotional value, that is, the big data indicator. Combined with financial indicators, the logistic model is used to establish financial crisis early warning models with and without big data indicators. Through the effect test on the test samples, the results show that the financial early warning model with big data indicators can improve the accuracy of financial crisis early warning.
Data Availability
The datasets used are available from the author on reasonable request.
Scientific Programming