Borderless Fusion Financial Management Innovation Based on Speech Recognition Technology

In recent years, the rapid development of information technology has affected the operation of the world economy, and e-commerce has emerged alongside it. The development of modern information technology, especially Internet technology, provides a solid foundation for e-commerce. The time and space gap between economic operators is shortened, and resources are shared as widely as possible. However, current e-commerce financial management still uses traditional accounting methods, which are inefficient and cannot be integrated with new technologies. Based on a comprehensive and in-depth study of speech recognition technology, this paper proposes a design scheme for a speaker-independent, small-vocabulary speech recognition system based on the SPCE061A processor. This integrated system can be combined with data acquisition and speech input technology to form a data acquisition and input system, which is used to collect and enter data from instruments and meters in the workplace to improve work efficiency. This paper proposes algorithms based on time-domain energy and zero-crossing rate detection, detection based on the entropy function method, and detection based on the double-threshold method. The speech recognition technology is analyzed in detail, and a new model of financial management is constructed. Accounting information disclosure is a virtuous circle of “promoting capital management with appropriate decentralization.” The experimental results show that, through model construction, the calculated WER is 0 dm, 7 dm, 15 dm, and 11 dm, respectively, and it is concluded that the system exhibits good antinoise performance.


Background.
Language is one of the important ways for people to communicate, and it is the result of useful information that people retain in the long-term evolutionary process. With the advent of networked technology, in addition to communicating with people using language, it has become a dream for people to allow machines to understand human language.
This problem, which was previously considered impossible to solve, has now become tractable. With the advent of the information age, the use of voice instead of keyboard input has become one of the trends of future social development, and voice recognition technology has emerged at this historic moment and become a research hotspot. With the continuous development of computer and artificial intelligence technology, smart machines participate in human production and social activities. This raises the question of how to improve the relationship between people and these machines and make the machines easier to manage. Language is the best way for humans and machines to communicate. Speech processing is an interdisciplinary field between linguistics and digital signal processing, and it is an important method of computer intelligence and human-computer interaction. The main technology for interaction between humans and computers is speech recognition. Therefore, speech recognition technology has become a research focus in countries all over the world. The importance of computers in social life is becoming more and more prominent. Various applications of voice technology have been developed, including voice telephone exchanges, voice dialing systems, and industrial control systems; these have penetrated all sectors of society and have good application prospects. Boundaryless management was first aimed at the overall management of an enterprise. In this paper, the borderless theory is applied to the field of financial management and, combined with the characteristics of integrated finance, a borderless integrated financial management model is proposed.

Significance.
Speech recognition is an interdisciplinary subject that draws on acoustics, computer science, signal processing, mathematical statistics, human intelligence, and engineering. The ultimate goal of speech recognition is to build a technology product that can communicate with people normally. Such a product converts the received sound information into the corresponding text and executes the corresponding action, thereby realizing human-computer interaction (HCI). At present, speech recognition in a quiet environment is quite satisfactory, and the recognition rate is very high. However, in real life, the speaker's voice is easily affected by the surrounding environment and by various superimposed noises, so the ideal recognition effect cannot be achieved. Therefore, improving speech recognition under noisy conditions and extracting accurate sound information has become a focus of researchers. In a society where the scope of information technology is gradually expanding, research on speech recognition in noisy environments is of great significance to improving people's living standards and quality of life.
Boundless integrated financial management is guided by corporate strategy. It emphasizes that finance breaks through the existing work framework and mode with a borderless, active management awareness, communicates and transmits financial concepts in all aspects of the value chain, and integrates finance with the other functions of the enterprise. The finance department implements this management model to promote the continuous growth of the overall value of the enterprise.

Related Work.
Human society increasingly shows the characteristics of an information society. A large amount of information needs to be exchanged at any time, not only between people but also between people and machines. Voice is the most direct, convenient, and effective tool for information exchange between people, and it is also an important means of communication between people and machines. One study uses the concept of big data to optimize the industrial structure and improve work efficiency. It first outlines the main content and development of financial management and analyzes the latest research progress in digital financial budget management and cost management. Then, combined with actual work, it summarizes and improves the main content and workflow of the financial management system. Finally, it proposes an intelligent framework for modern hospital financial management, including budget management, cost management, and performance management [1]. Dagnew D K uses a parallel mixed research method, collecting data through questionnaire surveys and interviews. Quantitative data are analyzed with descriptive statistics and econometrics, and interview data are analyzed narratively. The lack of guidance and consulting services, the high cost of training and consulting services, the complexity of most standards, and the lack of time at regulatory agencies are found to be the most important challenges, while the international comparability, transparency, and quality of the financial reporting system are among the prospects for the implementation of IFRS. However, his data is not clear enough [2]. Maria Ruiz-Jimenez J notes that there is an ethical debate about the impact of gender diversity in the top management team (TMT) on the organization. The research aims to contribute to this debate by analyzing the influence of TMT gender diversity on the relationship between knowledge combination ability and organizational innovation performance. A sample of 205 small and medium enterprises (SMEs) belonging to the Spanish technology-based firms (TBF) sector was used. The results show that gender diversity positively moderates the relationship between knowledge combination ability and innovation performance. The authors discuss the implications for theory and practice, including how to promote a more equal gender distribution and the benefits of gender diversity in senior management positions [3]. Although the analysis is thorough, some of the theory has little practical value. These studies show that, for financial management, most scholars still concentrate on improving the efficiency of traditional methods and do not propose essential innovations that draw on current technology.

Innovation.
The innovations of this article are as follows: (1) The first is the innovation in topic selection. At present, there is little research on the integration of speech recognition, boundless fusion, and financial management, so this article is of exploratory significance. (2) The second is the innovation in research methods. The proposed detection based on time-domain energy and zero-crossing rate, detection based on the entropy function method, and related methods have theoretical value.
(3) The third is the innovation in project practice. Boundless integrated financial management carries the financial concept into all aspects of production and operation, so that information communication can break through the barriers between departments and professions, improve the transmission, diffusion, and penetration of information throughout the organization, and achieve the symmetrical distribution and sharing of information, experience, and skills. This stimulates innovation, improves work efficiency, and realizes the optimal allocation of enterprise resources and the maximization of value creation. The project is of great significance to improving people's living standards and quality of life.

Based on Time Domain Energy and Zero-Crossing Rate Detection.
Speech endpoint detection is generally used to identify the presence and absence of speech in an audio signal. Short-term energy analysis and zero-crossing rate analysis are the most basic methods in time-domain analysis of speech signals, and they are widely used, especially in endpoint detection. Because the two methods are usually applied independently, important information is easily missed during endpoint detection. Because the speech signal is non-stationary [4], its energy changes over time, and the energy of non-speech sounds differs greatly from that of the human voice. Therefore, by analyzing the short-term energy and the short-term zero-crossing rate, speech segments can be distinguished from non-speech segments. The short-term energy of the speech sampling sequence $g(i)$ is defined in equation (1), where $W_i$ represents the short-term energy of the $i$-th speech frame and $h(i) = g(i)^2$. With the windowed speech signal denoted $f_w(i)$ and the window length denoted $N$, the short-term energy [5] is expressed as in equation (2). The endpoints of the speech signal are obtained by setting and jointly using two threshold levels. The principle is simple, and the real-time performance and precision are relatively high, so the method is widely applied. The zero-crossing rate of a speech signal is the number of times the signal crosses the zero axis in one second. For a continuous signal, the zero-crossing rate can be calculated as the number of times the waveform crosses the time axis. For a discrete signal, the number of times the signs of two adjacent sampling points change in a unit time can be used to represent the short-term zero-crossing rate [6]. It is defined in formula (3), where the frame length is $L$ and $\mathrm{sgn}(x)$ is the sign function.
In practical applications, to suppress low-frequency interference [7], a threshold value $T$ is usually set, and the improved zero-crossing rate counts the number of times the signal crosses the threshold $T$ instead of the number of zero crossings.
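As a minimal sketch of the two time-domain features described above, the following Python code frames a signal and computes the short-term energy and a thresholded zero-crossing count per frame. The function names, the rectangular default window, and the specific thresholding rule (a sign change counts only if the jump between adjacent samples exceeds `threshold`) are assumptions chosen for illustration; they are one common variant and not necessarily the exact form of the paper's equations.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into frames of frame_len samples, hop samples apart."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames, window=None):
    """Sum of squared windowed samples in each frame (rectangular window by default)."""
    if window is None:
        window = np.ones(frames.shape[1])
    return np.sum((frames * window) ** 2, axis=1)

def zero_crossing_rate(frames, threshold=0.0):
    """Count sign changes between adjacent samples in each frame.

    Assumed variant: a sign change is counted only when the jump between
    the two samples is larger than `threshold`, which ignores small
    fluctuations around zero.
    """
    a, b = frames[:, :-1], frames[:, 1:]
    crossings = (a * b < 0) & (np.abs(a - b) > threshold)
    return crossings.sum(axis=1)

if __name__ == "__main__":
    x = np.array([1.0, -1.0] * 8)          # strongly alternating signal
    frames = frame_signal(x, frame_len=8, hop=8)
    print(short_term_energy(frames))        # energy per frame
    print(zero_crossing_rate(frames))       # crossings per frame
```

With a nonzero `threshold`, a low-amplitude signal that merely jitters around zero produces no counted crossings, which is the behavior the improved zero-crossing rate is meant to provide.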

Detection Based on Entropy Function Method.
Entropy originates from statistical thermodynamics in physics [8], where it characterizes the degree of disorder. With the cross-development of the sciences, entropy has found important applications in cybernetics, probability theory, information theory, and the life sciences. In speech signal processing, the concept of entropy is widely used for endpoint detection [9], for example as spectral entropy and cepstral distance entropy.
Assume that $h_i(m)$ is the noisy speech signal in the $i$-th frame after windowing and that, after the fast Fourier transform (FFT), the spectral probability density of the $k$-th spectral line is $h_p(k)$. The spectral entropy of each frame is then defined over these probability densities, where $N$ is the FFT length. The probability space of the discrete speech signal $X$ and the entropy function of $X$ are defined in the same way. From the definition of spectral entropy, it can be seen that speech entropy reflects the "disorder" of the amplitude distribution of the source in the frequency domain. The idea of sub-band spectral entropy is as follows: first divide a frame of the speech signal into several sub-bands, and then find the spectral entropy of each sub-band. Suppose the energy probability of the $m$-th sub-band of the $i$-th frame is $P(h, i)$ and the sub-band energy is $E(h, i)$; the sub-band power spectrum is then obtained from these quantities. The difference between the two detection methods is that the original spectral entropy method finds the probability density function of each power point and then the spectral entropy, whereas the sub-band spectral entropy method finds the power density function of each sub-band of the speech frame [10] and then calculates the spectral entropy. In this way, the result is protected from background noise that interferes with the amplitude of a single spectral line.
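The following sketch illustrates spectral entropy and a sub-band variant for a single frame. It uses the usual textbook form, in which the power spectrum is normalized to a probability distribution and the Shannon entropy is taken; the function names, the use of the one-sided FFT, the natural logarithm, and the equal-width split into `n_bands` sub-bands are assumptions for illustration rather than the paper's exact definitions.

```python
import numpy as np

def spectral_entropy(frame, n_fft=None, eps=1e-12):
    """Shannon entropy of the normalized power spectrum of one frame."""
    power = np.abs(np.fft.rfft(np.asarray(frame, dtype=float), n=n_fft)) ** 2
    p = power / (power.sum() + eps)          # spectral probability density
    return float(-np.sum(p * np.log(p + eps)))

def subband_spectral_entropy(frame, n_bands=4, n_fft=None, eps=1e-12):
    """Entropy computed over sub-band energies instead of single spectral lines."""
    power = np.abs(np.fft.rfft(np.asarray(frame, dtype=float), n=n_fft)) ** 2
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    p = energy / (energy.sum() + eps)        # energy probability of each sub-band
    return float(-np.sum(p * np.log(p + eps)))

if __name__ == "__main__":
    n = np.arange(128)
    tone = np.sin(2 * np.pi * 8 * n / 128)   # energy concentrated in one line
    impulse = np.zeros(128)
    impulse[0] = 1.0                          # flat spectrum
    print(spectral_entropy(tone), spectral_entropy(impulse))
    print(subband_spectral_entropy(tone), subband_spectral_entropy(impulse))
```

A signal whose energy sits in a single spectral line gives an entropy close to zero, while a flat spectrum gives the maximum value, which is why low entropy is commonly taken as a cue for voiced speech. Summing over sub-bands before normalizing makes the measure less sensitive to noise on any one spectral line.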

Detection Based on Double-Threshold Method.
The dual-threshold method was originally proposed based on short-term energy and short-term average zero-crossing rate [11].

Scientific Programming
Its principle is that finals carry high energy, so short-term energy can be used to locate them, while initials have a high frequency, so the short-term average zero-crossing rate can be used to locate them. The dual-threshold method uses two levels of judgment to realize voice endpoint detection. Set $Q_H$ as the high threshold and $Q_F$ as the low threshold.
(1) First-level judgment: made according to the difference in short-term average energy. (2) Second-level judgment: made after (1), according to the difference in the short-term average zero-crossing rate, so that the starting point and ending point of the speech can be determined and the desired endpoint detection result obtained.
The specific process is as follows: (1) Starting from the silent segment, if the short-term energy [12] and the short-term zero-crossing rate of a speech frame are both greater than $Q_F$, the frame may be the starting point of the speech, and the system starts to track it, but the starting point is not yet confirmed. This stage is defined as a transitional section.
(2) Continue to track the short-term energy and the short-term zero-crossing rate of the next frame. If both parameters are less than $Q_F$, the frame is definitely not a voice endpoint, and the system returns to the silent segment.
(3) If one of the parameters is greater than the high threshold $Q_H$, and this state lasts longer than the minimum speech segment length, the frame is marked as the starting point of the speech segment.
(4) After entering the speech segment, if the short-term energy or the short-term zero-crossing rate of the detected speech frame is less than $Q_F$, the system enters the end-point marking stage, which is still a transitional state. If both characteristic parameters are higher than $Q_F$, the signal is considered to be still in the speech segment. If both parameter values are less than $Q_H$ and remain so for a period of time, the frame is marked as the end point of the speech signal [13].
The overall flow chart of dual-threshold endpoint detection is shown in Figure 1. The above endpoint detection method is suitable only for speech in a clean environment and does not apply when noise is added [14], because the zero-crossing rate of noise is much larger than that of initials and finals, so the method needs to be improved. In fact, the short-term zero-crossing rate of the silent section in a noisy environment is greater than that of the voiced section [15], which is just the opposite of the situation in a clean environment. Therefore, when looking for the starting point, the original condition that the zero-crossing rate be greater than $Q_H$ only needs to be changed to less than $Q_H$; the condition for finding the end point is reversed in the same way, and the short-term energy conditions remain unchanged. The improved double-threshold method was used to detect the starting point and ending point of a noisy segment of Chinese speech, and the detection results are shown in Figures 2 and 3. Although the improved dual-threshold method has some antinoise ability [16], the false alarm rate is still very high in a noisy environment. This is because the improved method cannot completely screen out the noise in the speech, and noise at different frequencies leads to false alarms.
In the past ten years, in the analysis of voice signals, a feature based on the entropy function has emerged for voice endpoint detection, and it has a certain degree of robustness in complex environments.
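The four-step procedure for the clean-speech case can be sketched as a small state machine over per-frame energy and zero-crossing values. This is a simplified illustration under stated assumptions: the function name, the separate low and high thresholds for energy (`e_low`, `e_high`) and for zero-crossing rate (`z_low`, `z_high`), and the `min_speech` and `max_silence` frame counts are all hypothetical parameters, and the end-point rule is reduced to "both features stay below the low thresholds for more than `max_silence` frames".

```python
def detect_endpoints(energy, zcr, e_low, e_high, z_low, z_high,
                     min_speech=3, max_silence=2):
    """Return a list of (start_frame, end_frame) speech segments."""
    SILENCE, TRANSITION, SPEECH = range(3)
    state, start, high_run, quiet = SILENCE, 0, 0, 0
    segments = []
    for i, (e, z) in enumerate(zip(energy, zcr)):
        below_low = e < e_low and z < z_low      # both features look like silence
        above_high = e > e_high or z > z_high    # at least one feature is strong
        if state == SILENCE:
            # Step (1): both features exceed the low threshold -> candidate start.
            if e > e_low and z > z_low:
                state, start = TRANSITION, i
                high_run = 1 if above_high else 0
        elif state == TRANSITION:
            if below_low:
                # Step (2): fell back below the low threshold -> silence again.
                state = SILENCE
            else:
                # Step (3): confirm speech once the high threshold has been
                # exceeded for min_speech consecutive frames.
                high_run = high_run + 1 if above_high else 0
                if high_run >= min_speech:
                    state, quiet = SPEECH, 0
        else:
            # Step (4): in speech; count consecutive quiet frames.
            if below_low:
                quiet += 1
                if quiet > max_silence:
                    segments.append((start, i - quiet))
                    state = SILENCE
            else:
                quiet = 0
    if state == SPEECH:                           # speech runs to the end of input
        segments.append((start, len(energy) - 1 - quiet))
    return segments

if __name__ == "__main__":
    energy = [0, 0, 0.3, 1, 1, 1, 1, 0.3, 0, 0, 0, 0]
    zcr = [0, 0, 5, 20, 20, 20, 20, 5, 0, 0, 0, 0]
    print(detect_endpoints(energy, zcr, 0.2, 0.8, 3, 15))
```

The transitional state is what keeps a single loud frame from being reported as speech: a candidate start is discarded unless the high threshold is sustained. For the noisy case described above, the zero-crossing comparisons would be reversed while the energy comparisons stay the same.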

The Establishment of the "CIO" Model.
This paper conducts a normative analysis by reading the literature and sorting out the related concepts and theoretical foundations [17]. The operating conditions and operating models of each enterprise are analyzed comprehensively, and their overall scores are compared. The results show that the existing financial management models of most domestic SMEs have serious problems: weak capital management capabilities lead to low corporate returns and poor performance, poor business performance leads to unrealistic accounting information disclosure, and the high concentration of financial power leads to unscientific capital management decision-making. The capital management [18], accounting information disclosure, and financial organization of small and medium-sized enterprises have become severely disconnected [19].
The current financial management model does not adapt to the new e-commerce environment and cannot survive and grow under the new economic competition. It is necessary to discuss how to regulate the accounting information disclosure of listed companies under current market economy conditions and to propose effective strategies to solve the problem, in the expectation that the relevant laws and regulations on accounting information disclosure will become more complete.
Starting from these problems, this article selects three typical submodes according to the standard financial management modules, namely the Capital Management Mode, the Accounting Information Disclosure Mode, and the Financial Organization Mode, as the research object and content, as shown in Figure 4.
This model focuses on the relationship between corporate capital management, accounting information disclosure, and financial organization and aims to combine the three organically so that they restrict, promote, and coordinate with each other, achieving a virtuous circle of "capital management promoting accounting information disclosure" and "promoting capital management with appropriate decentralization," thereby improving the level of corporate financial management [20]. Accounting information disclosure means that an enterprise provides, in the form of public reports, important accounting information that directly or indirectly affects users' decision-making, with fairness among the recipients. The financial model classifies, organizes, and links the various information of the enterprise along the main line of value creation, so as to analyze, predict, and evaluate the financial performance of the enterprise. The following uses empirical analysis to study the interaction between these three and proposes the "Capital-Information-Organization" integration model, referred to as the "CIO" model. All in all, through the reorganization of the vertical, horizontal, external, and geographical boundaries of financial management, financial management changes from the traditional model, with strict upper and lower levels, a horizontal division between left and right, and obvious internal and external boundaries, to a model with a flat organizational structure, convenient communication between top and bottom, mutually integrated departments, and no boundary between internal and external management.

Chip Selection.
The speech recognition system is built on a hardware platform. The commonly used types of speech recognition chip are a dedicated speech recognition IC built around a single-chip microcomputer (MCU) [21], a speech recognition system built around a digital signal processor (DSP), and a speech recognition system-level chip, or SoC (System on Chip). The SPCE061A chip was selected as the core component of this system after the following comparison of these types of chips.
(1) An application-specific integrated circuit composed of multiple band-pass filters and linear matching circuits [22]. It is a product of the early 1980s and the first integrated circuit dedicated to speech recognition. It uses a set of band-pass filters to form a feature extraction circuit and then uses a linear matching circuit for pattern matching. This circuit has poor speech recognition performance.
(2) The on-chip system integrates circuits such as a single-chip microcomputer or DSP [23], A/D, D/A, RAM, ROM, host computer [24], power amplifier, and so on. As long as peripheral circuits such as the power supply are added, functions such as speech recognition, speech synthesis, and speech playback can be realized. This is the most advanced type of speech recognition chip to have emerged in the past two years, with high performance and low power consumption, but the price is relatively high.
(3) The SPCE061A is a highly integrated single-chip device. It integrates the microcontroller, A/D, D/A, RAM, and ROM on one chip. It also has 16 bit × 16 bit inner product and multiplication functions, and the highest CPU clock can reach 49 MHz. Therefore, it is comparable to a DSP for complex digital signal processing, but its price is lower than that of dedicated DSP chips, and it has strong interrupt handling capability. The system supports 10 interrupt vectors and more than 10 interrupt sources, which makes it suitable for real-time voice processing. It has a dual-channel 10-bit DAC audio output function, and its microphone input is configured with automatic gain control, which provides great convenience for voice processing.
The system hardware circuit is composed of the SPCE061A, the SPR module, the LED keyboard module, and the power module. The hardware structure block diagram of the system is shown in Figure 5. Figure 5 shows the input circuit diagram. As mentioned above, the A/D converter of the SPCE061A has eight channels, one of which is the MIC-IN input, which is used to sample the audio signal. The sound signal is converted to a MIC signal and then enters the amplifier in the gain control circuit of the SPCE061A. The AGC circuit monitors the signal at all times and acts when the input signal increases. The purpose of the AGC (automatic gain control) circuit is to control the amplification gain for an input whose signal amplitude changes greatly. For example, in audio design, AGC is added so that the speaker outputs sound at an appropriate volume regardless of the input level (otherwise the volume knob would have to be adjusted constantly). AGC can also be used in image acquisition design to achieve stable image signal detection under strong and weak light conditions. When the input signal is reduced, the AGC circuit automatically increases the gain to an optimal level, which helps reduce corrective actions. Therefore, the input circuit, which consists of a resistor-capacitor DC filter, is very simple, as shown in Figure 6. Figure 7 shows the communication module circuit. The data in the single-chip microcomputer shown in the figure are converted to RS-232 level by the MAX232 through the serial port and

e Establishment of the Corpus.
A corpus is a large-scale electronic text library that has been scientifically sampled and processed, and it stores language material that has actually appeared in real use of the language.
There are two types of corpora here: the keyword corpus and the experimental test corpus. The keyword corpus contains a total of 64 commonly used keywords in three languages (Chinese, English, and Japanese), recorded by 6 people with pronunciation as standard as possible. To account for gender differences, each keyword is recorded by 1 male and 1 female speaker. Regional pronunciations and dialects are not yet included in the corpus because of the large amount of data involved; in practical applications, the key sounds collected from various common dialects can later be cut out and added to the keyword corpus. The experimental test corpus is divided into telephone speech, actual scene recordings, and read-aloud recordings. Owing to environmental limits, the telephone speech and actual scene recordings include only Mandarin Chinese, while the read-aloud speech includes three languages: Chinese, English, and Japanese. The telephone speech consists of 50 minutes of recordings of ordinary calls. The actual scene recordings were made by placing a voice recorder in the school laboratory, at home, and in the office for a long time, and 1 hour of the more concentrated speech was extracted from each. The read-aloud speech includes 20 minutes of news speech in each of the 3 languages and 20 minutes of reading material from each of 3 speakers. All voice data are converted to an 8 kHz sampling rate in WAV format and stored in two separate folders. Keyword files are named after the content of the keyword, and test files are named after the recording method and a label.

Acoustic Model.
In speech recognition, the phonemes and syllables of a language are commonly used as the primitives for acoustic modeling. This places heavy demands on the training data, because sufficient data are required to train a robust model. This article targets several different languages, for which full training is difficult, so each keyword as a whole is used as a primitive. The problem with using words as primitives is that recognition is limited by the content of the vocabulary: words that are not in the vocabulary cannot be recognized, which reduces the recognition rate. In the situation described in this article, however, the keywords to be recognized are known before recognition and have already been added to the keyword corpus, so using keywords as primitives best meets the needs. After the primitives are determined, the average short-term amplitude, peak size, and peak spacing are extracted as time-domain feature parameters. The 12-dimensional MFCC coefficients, their 12-dimensional first-order difference, and their 12-dimensional second-order difference are used as the frequency-domain feature parameters of the speech signal. After further analysis of these 3 groups of 12-dimensional parameters, 27-dimensional frequency-domain feature parameters are obtained by weighted reconstruction. Without repeated training, the time-domain and frequency-domain parameters of a keyword are used directly as its time-domain and frequency-domain acoustic models, that is, as the matching templates in the recognition process. After one-step recognition is completed, it is generally necessary to confirm the keywords through a language model.
With a well-chosen language model, the identified keywords can be checked against language habits and prosodic features, which can reduce the false alarm rate of the system. However, since this article does not distinguish between languages and recognizes each keyword as a whole, it does not establish a language model. Instead, it uses a joint time-domain and frequency-domain search in which the two results confirm each other, which can also reduce the false alarm rate.
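The construction of the three groups of 12-dimensional frequency-domain parameters can be sketched as follows. The code assumes that a matrix of 12 MFCC coefficients per frame is already available (its computation is not shown) and uses the common regression formula for the difference features; the function names, the regression half-width `N`, and the edge padding are assumptions for illustration. The weighted reconstruction to 27 dimensions is not specified in the text and is therefore not attempted here.

```python
import numpy as np

def delta(features, N=2):
    """Regression-based difference of a (frames x dims) feature matrix.

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    with the first and last frames repeated at the edges.
    """
    features = np.asarray(features, dtype=float)
    T = features.shape[0]
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = np.zeros_like(features)
    for n in range(1, N + 1):
        out += n * (padded[N + n:N + n + T] - padded[N - n:N - n + T])
    return out / denom

def stack_mfcc_deltas(mfcc, N=2):
    """Concatenate MFCCs with their first- and second-order differences."""
    d1 = delta(mfcc, N)
    d2 = delta(d1, N)
    return np.hstack([np.asarray(mfcc, dtype=float), d1, d2])

if __name__ == "__main__":
    # Hypothetical input: 10 frames of 12 coefficients that grow linearly in time.
    mfcc = np.tile(np.arange(10.0)[:, None], (1, 12))
    feats = stack_mfcc_deltas(mfcc)
    print(feats.shape)   # 12 static + 12 first-order + 12 second-order columns
```

For a keyword template, each row of the stacked matrix would describe one frame of the recorded keyword.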

Performance Test.
For the existing corpus, a 20-minute reading recording from the Chinese read-aloud speech was selected for the recognition test. There are 4 keywords: management, college, university, and education, which appear 16, 22, 39, and 12 times, respectively. The sampling frequency is 8000 Hz, and the frame length is 128. The recognition results obtained are shown in Table 1. The system also shows the time at which each keyword appears and saves the results, so that they can be queried easily.

Selection of the Number of Transformation Matrix Clusters.
When generating the transformation matrix of the feature space, how to share training data and how to choose the number of clusters require comprehensive consideration. The granularity of the acoustic modeling units and the amount of training data directly affect the choice of the number of clusters. In the Aurora2 experiment, since there are only 180 emission states and the noisy training data contain only 1760 sentences, it is not appropriate to use too many classes. When modeling the parameter trajectory of the feature space transformation matrix separately, we tried 2, 4, 8, and 16 matrix classes, and the results of the recognition system are shown in Figure 8.
When the GVP-HMM method is used to cascade the transformation matrix and the model parameters, the number of transformation matrix clusters needs to be reconsidered. At the training end, the transformation matrix parameter trajectory is generated on top of the polynomial trajectory of the Gaussian model parameters. The polynomial coefficient model is obtained by training under multiple conditions and already contains the information of the noisy speech at the training end. At the decoding end, the transformation matrix also needs to be applied to the acoustic model, which at this point is more consistent with the test background noise. If too many clusters are then used to refine the impact of noise on each modeling unit, an overtraining effect results. As can be seen from the figure, in the GVP-HMM cascade system containing the variable Gaussian model and the feature space transformation matrix, the lowest word error rate is obtained by the two-class "GVP-mv-ftran" system (WER 9.79%), where the spoken words and the silence models sil and sp are assigned to two separate classification matrices. The same result also appeared in the "GVP-mv-tran" system containing a variable Gaussian model and the mean space transformation matrix. The lowest is 0 and the highest is 16.
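Since the comparison above is stated in terms of word error rate, a short sketch of the usual WER computation may help. It follows the standard definition (substitutions, deletions, and insertions from a word-level edit distance, divided by the number of reference words); the function name and the digit-string example are illustrative assumptions and are not taken from the paper's experiments.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

if __name__ == "__main__":
    # One deletion ("two") and one insertion ("five") against 4 reference words.
    print(word_error_rate("one two three four", "one three four five"))
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.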

Recognition Results of Various Parametric Trajectory
Modeling Systems. In the experiment, we used every 1001 sentences of 0 dB, 5 dB, 10 dB, and 20 dB under 4 signal-to-noise ratio conditions to decode the initial pure model, the mcond reference model, and the variable parameter system under the GVP-HMM framework.
(1) When using the GVP-HMM method to model the parameter trajectory of the Gaussian model, we chose the mean value and the mean value simultaneous parameter trajectory modeling system as a reference. In the experiment, it is simplified and marked as "mean" and "mv." (2) When modeling the trajectory of the mean transformation matrix in the model space, we selected 2 and 8 categories of mean transformation matrices "tran2″ and "tran8″, respectively. At the same time, the cascade system "mv-tran2″ with variable Gaussian model parameters is used as the reference system. (3) When modeling the trajectory of the feature space transformation matrix, we also use "ftran2″ and "ftran8″ to represent the transformation matrix of the corresponding number of categories and use "mv-ftran" to represent the cascade system. e variable parameter modeling types represented by the various systems of GVP-HMM are shown in Table 2.
At the decoding end, the signal-to-noise ratio information corresponding to the test environment is loaded onto the polynomial coefficient trajectory, and the corresponding parameters are generated for decoding. The recognition results of the GVP-HMM system based on the above-mentioned traditional methods are given in Table 3.
The third chapter of this article focuses on extending the GVP-HMM method to antinoise recognition in the feature space. Therefore, in this experiment, the variable parameter feature transformation matrix, its cascade with the Gaussian model parameter trajectory system, and its cascade with the variable parameter mean transformation matrix are used to verify the antinoise ability of the GVP-HMM system in the feature domain. The characteristic equation of a matrix, Ax = λx, shows that a matrix can be regarded as a transformation. The left side of the equation moves the vector x to another position; the right side stretches the vector x by the factor lambda. Its meaning is then clear: it expresses a property of matrix A, namely that A lengthens (or shortens) the vector x by a factor of lambda.
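The stretching interpretation of the characteristic equation can be checked numerically. The sketch below uses a hypothetical 2x2 symmetric matrix chosen only for illustration; the helper names `mat_vec` and `is_eigenpair` are assumptions and do not come from the experimental system.

```python
def mat_vec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def is_eigenpair(a, x, lam, tol=1e-9):
    """Return True if a @ x equals lam * x within the tolerance tol."""
    left = mat_vec(a, x)             # the matrix moves x to a new position
    right = [lam * x_i for x_i in x]  # x stretched by the factor lam
    return all(abs(l - r) <= tol for l, r in zip(left, right))


# Hypothetical example matrix with eigenvalues 3 and 1.
A = [[2.0, 1.0],
     [1.0, 2.0]]

stretched = is_eigenpair(A, [1.0, 1.0], 3.0)    # x is lengthened three times
unchanged = is_eigenpair(A, [1.0, -1.0], 1.0)   # x keeps its length
not_eigen = is_eigenpair(A, [1.0, 0.0], 2.0)    # direction changes, so no pair
```

Only vectors whose direction is preserved by A satisfy the equation; for any other vector, the left and right sides point in different directions.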

Independent Modeling of Variable Parameter Feature Transformation Matrix. On the pure model, we also train on data from four different signal-to-noise ratios to generate a transformation matrix trajectory in the feature domain, expressed as polynomial coefficients, instead of a static transformation matrix. At the decoding end, the signal-to-noise ratio information of the test environment is read in to generate the corresponding feature transformation matrix. The recognition results of the variable feature space transformation matrix based on the GVP-HMM method, compared with the reference system, are shown in Table 4.
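The idea of storing a transformation matrix as polynomial coefficients in the signal-to-noise ratio, then regenerating it for the test condition, can be sketched as follows. This is not the training procedure of the paper: it swaps in a plain least-squares fit per matrix element, and the 2x2 matrices, the polynomial degree, the scaling constant `SNR_SCALE`, and all function names are hypothetical choices made for illustration.

```python
SNR_SCALE = 20.0  # assumption: map dB values to [0, 1] for better conditioning


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via normal equations."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def fit_matrix_trajectory(snrs_db, matrices, degree):
    """Fit one polynomial in (scaled) SNR for every matrix element."""
    xs = [s / SNR_SCALE for s in snrs_db]
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[fit_polynomial(xs, [m[r][c] for m in matrices], degree)
             for c in range(cols)] for r in range(rows)]


def matrix_at(trajectory, snr_db):
    """Generate the transformation matrix for a given test SNR."""
    x = snr_db / SNR_SCALE
    return [[sum(coef * x ** k for k, coef in enumerate(coeffs))
             for coeffs in row] for row in trajectory]


# Hypothetical per-condition transforms at the four training SNRs.
train_snrs = [0.0, 5.0, 10.0, 20.0]
train_mats = [[[1.0 + 0.5 * (s / SNR_SCALE), 0.1 * (s / SNR_SCALE)],
               [0.0, 1.0 - 0.2 * (s / SNR_SCALE)]] for s in train_snrs]

trajectory = fit_matrix_trajectory(train_snrs, train_mats, degree=1)
test_matrix = matrix_at(trajectory, 15.0)  # matrix for an unseen 15 dB condition
```

Only the polynomial coefficients need to be stored, which is consistent with the compact storage the text attributes to the variable parameter approach; a matrix for any intermediate test condition is generated on demand.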
A series of experiments on the Aurora 2 data set supports the extension of the GVP-HMM method to the feature space proposed in this paper. The experiments verify that the variable feature space transformation matrix shows good antinoise performance whether it is used alone, cascaded with the variable Gaussian model parameter system, or cascaded with the variable model domain transformation matrix. At the same time, thanks to its compact storage and strong portability, it can become an effective means of achieving robust speech recognition in noisy environments. The comparison is shown in Figure 9.

Conclusions
In today's era of big data, all departments of an enterprise must make adjustments or even reforms in response to changes in the new environment. Whether an enterprise chooses strategic, lean, or information-based finance, its financial management reflects, without exception, the characteristics of multidepartment, multidimensional, cross-field, and multidisciplinary integration.
Speech recognition is an important artificial intelligence technology that has played an important role in computer applications, office automation, computer networks, and many other areas, such as communication, robots, and intelligent human-machine interfaces. Speech recognition technology has advanced by leaps and bounds, effective recognition techniques such as standard matching and HMM have been formed, and some successful speech recognition systems have emerged. Speech recognition technology has gradually moved from the laboratory to commercialization, and many large-scale speech recognition products have appeared as a result. In daily work and life, what people need most is cheap voice recognition equipment installed on specific products. In the past two years, this kind of small, integrated voice detector has become an important direction for technological development.
This article uses a combination of normative analysis and empirical analysis to examine the current situation of financial management in our country's small and medium-sized enterprises from three aspects under the new situation of e-commerce: financial organization and management, accounting, and capital information disclosure. Based on the problems and deficiencies of the existing financial management model, it establishes a new e-commerce-based financial management model for small and medium-sized enterprises, the CIO environment interface model, and conducts a basic structural equation analysis.

Data Availability
No data were used to support this study.

Conflicts of Interest
There are no potential conflicts of interest in this study.