Multisource Heterogeneous Data-Based Behavior Excavation of Investors and Financial Risk Prediction

The investment decision-making behavior of investors refers to the capital purchases that investors make in order to achieve a certain purpose or goal. As individual decision-makers, investors are affected by various factors when making investment decisions, and these influencing factors may come from outside or be generated by the investors themselves. The focus of this paper is to analyze the impact of investors' past returns on their decision-making based on multisource heterogeneous data. Multisource heterogeneous data can show the results of investors' behavior evaluation at multiple scales and multiple levels. Using conceptual deduction and empirical research methods, and building on the analysis of investors' behavior, financial risk prediction results are proposed. Financial risk prediction mainly analyzes the impact of investment returns on investors' decision-making from two aspects: individual returns and overall returns. The measurement indicators of investment decisions are investors' transaction frequency and asset size.


Introduction
With the continuous development of China's economy, personal wealth has shown an increasing trend, and investment is becoming more common and is gradually becoming a part of daily life. Faced with so many types of investments and risk assets, making scientific investment decisions is a crucial step for investors. People are often influenced by various factors when making decisions, such as the influence of people around them, income and expenditure, the current popularity of investment targets, and the personal characteristics of investors (including the income from investors' previous investments and the cognitive level of investors). Investors' past investment experience (this paper focuses on past investment income) has an impact on investors' preferences. The past investment income of investors includes the income from investing in stocks, bonds, gold, real estate, bank financing, trust plans, and other products. The experience of investment income has a great impact on investors. Especially in the face of so many investment varieties, ordinary investors are usually more inclined to pay attention to and invest in the products from which they have already benefited. However, once an investment has led to huge or continuous losses, investors will be discouraged and may choose to withdraw from that investment variety. Based on the income gap between investment varieties, the author analyzes and elaborates on the impact of investors' past earnings on investors' preferences. Multisource heterogeneous simply means that a whole is composed of components from many different sources, including mixed data and discrete data.
The Internet is a typical heterogeneous network, and the fusion propagation matrix is a typical multisource heterogeneous data network.
By studying the relevant data of investors' investment returns in the securities market, this paper analyzes the impact of investors' past investment returns on their investment preference behavior (the preference is measured by the amount of assets invested and the trading frequency of investors). The author predicts that investment returns are positively correlated with the increase in assets, and that the trading frequency is also positively correlated with the increase in assets. At the same time, the daily average return, daily trading amount, and daily trading volume data of the Shanghai Stock Exchange Index are retrieved through Wind, and the VAR model analysis of the data is carried out using EViews software. The normative definition of value at risk (VaR) is the maximum loss that a portfolio is expected to incur over a given holding period under normal market conditions and at a given confidence level; equivalently, under normal market conditions and over a given period of time, the probability that the portfolio's loss exceeds the VaR is only a given probability level.
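Read as value at risk, the definition above can be illustrated with a historical-simulation estimate. The following is a minimal sketch under stated assumptions, not the procedure used in this paper: the function name `historical_var`, the empirical-quantile rule, and the sample returns are all illustrative.

```python
import math


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR (illustrative; not the paper's procedure).

    Returns the smallest loss level L such that at least a `confidence`
    share of the observed periods had a loss no larger than L.
    Losses are expressed as positive numbers (loss = -return).
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    losses = sorted(-r for r in returns)  # ascending: smallest loss first
    # Rounding guards against floating-point noise in confidence * n.
    rank = math.ceil(round(confidence * len(losses), 10))
    return losses[rank - 1]


# Hypothetical daily returns, made up for illustration only.
sample_returns = [-0.05, -0.03, -0.02, -0.01, 0.0, 0.01, 0.01, 0.02, 0.03, 0.04]
var_90 = historical_var(sample_returns, confidence=0.90)
```

With these ten sample days, `var_90` is the loss level that was exceeded on only one day, which matches the second form of the definition: the probability of a loss larger than the VaR is limited to the chosen level.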

Related Work
Since behavioral finance scholars regard investors as different people with incomplete rationality, and their decision-making will be affected by their own psychological characteristics, habits, and cognition, their emotions naturally become the focus of observation and research.
Although investors in the financial market differ in psychology, judgment, and decision-making, under the influence of the market and even of society as a whole, their emotions and behaviors can often reach a striking consistency in certain periods and within a certain range. As a result, their influence cannot be ignored as it is in modern finance; it must be taken seriously and may even surpass other internal factors. Research shows that during the alternation of bull and bear markets, at the peak of a bull market or the trough of a bear market, the emotions and behaviors of many investors tend to be consistent; that is, they tend to be more bullish (at the peak of the bull market) or more bearish (at the trough of the bear market). At such times, the fundamental quality of the company that the stock should represent is reduced to a secondary factor. Some scholars have also calculated that the impact of investor sentiment on the stock price in such periods can account for 60%, which exceeds the sum of all other factors [1].
In China, one scholar argues that emotion management should have at least two aspects [2]. First, we should understand our own emotions; that is, we should be able to notice, continuously track, and control our current emotions as a whole. Then, we need to understand other people's emotions; that is, we should be able to understand other people's inner feelings from their situations and have the ability to deal with other people's emotions. Acharjya and Natarajan [3] believe that emotion management is our own rational control of emotions, and that we can try to keep a positive attitude in various ways. Based on the above research, this paper defines investor sentiment management as identifying investor sentiment and taking effective methods or approaches to correct the deviation in investor sentiment.
With the continuous development of the financial market, more and more people are investing, and investing has become increasingly common. Some earlier investment theories can no longer explain certain strange phenomena in the market. For example, many investors like to buy stocks that have risen sharply in the short term, although the subsequent performance of such stocks is often poor. There are other phenomena worth noting. For example, the share prices of large-cap blue chips are low or even below net asset value, while the share prices on the small and medium-sized boards are extremely high, and some of them have PE ratios of more than 100 or even several hundred. How can the phenomenon of people trading more frequently be explained? And why is herding so familiar? Liu and Li [4] were unable to use modern financial theory to explain these strange phenomena, and the familiar capital asset pricing model and rational person hypothesis could not give a reasonable explanation. Some financial experts and researchers have systematically analyzed the impact of the market on investors [5,6]. Through observation and research, it is found that people's investment decisions are often irrational. Investors are affected by many factors when making investment decisions, such as their emotional state, their age, the comments of the people around them, and even the weather. These factors affect investors' decisions and may even affect the development of the market.
Traditionally, the methods of financial risk prediction are variance method or coefficient method, but because these two methods can neither reflect the direction of income deviation from the mean nor accurately reflect the exact size of the loss, people have long hoped to find a new method to measure downside risk, that is, to focus on the left side of the probability distribution of return. At present, there are still many defects and deficiencies in the development of behavioral finance. As a different branch of standard finance, behavioral finance tries to explain some phenomena that cannot be explained by standard theory in the modern financial market, hoping to establish a complete set of systematic new theories. Behavioral finance is closely combined with the actual market and hopes to be tested in practice.
Thus, the status of behavioral finance has been further consolidated. Many domestic researchers have studied behavioral finance in depth and analyzed it in combination with the characteristics of China's financial market [7][8][9][10][11]. Compared with mature foreign markets, China's securities market is still at the development stage, and a certain gap remains between them. There have been many phenomena that cannot be explained by standard finance, which is both an opportunity and a challenge for academic researchers in China.

Analysis Framework Based on Multisource Heterogeneous Data.
In order to study the prediction of investors' behavior and financial risk based on multisource heterogeneous data more systematically, the factors that can affect investors' investment behavior are first analyzed, and the information is fused and sorted. Multisource heterogeneous data analysis establishes the relationship between shareholders' behavior and financial risk by integrating information and data resources, applying modern information technology, and aggregating market behavior. The specific process is shown in Figure 1 [1]. The analysis methods for multisource heterogeneous data are summarized, and a financial risk prediction framework based on multisource heterogeneous data is proposed, as shown in Figure 2. The technical steps, task formalization, and evaluation indicators of the framework are analyzed in detail [12]. The model is established based on the results of multisource heterogeneous data analysis. This paper constructs a VAR model to predict financial risks, as shown in Figure 3 [13]. All the analysis data in this paper are based on the statistical integration of market research questionnaires, and the main research objects are relatively active shareholders in China's stock market.
In the technical framework of this paper, the acquisition and processing of basic data can be regarded as a supervised binary classification task. The purpose of the financial risk prediction task is to calculate and analyze the multisource input data related to the target stock and to financial risk by constructing an appropriate VAR model, so as to obtain the probability that the stock rises or falls at the next time point [14]. The key to accomplishing this task is to build and train a deep learning model with good prediction results. The related concepts are introduced below to better explain the task of financial risk prediction.
(1) Data input:

$$X_i(t) = \{x_i^1(t), x_i^2(t), \ldots, x_i^m(t)\},$$

where $x_i^j(t)$, $j = 1, 2, \ldots, m$, represents the $j$-th characteristic value related to stock $i$ in the data source at time node $t$. The data source can be the historical price and trading data of the stock, or the investors' comments on the stock after data representation.

(2) Result output:

$$y_i(t) = F(X_i(t)),$$

where $y_i(t)$ represents the probability that the closing price of stock $i$ will rise or fall at the next time node $t + 1$ and $F(\cdot)$ is the specific calculation process of the neural network, which is related to the structure of the neural network.

(3) Data label:

$$l_i(t) = \begin{cases} [0, 1], & p_i(t+1) > p_i(t), \\ [1, 0], & \text{otherwise}, \end{cases}$$

where $l_i(t)$ is the rise or fall of the price of stock $i$ at the next time node $t + 1$ and $p_i(t)$ denotes the closing price of stock $i$ at time node $t$. If the closing price of the stock at time node $t + 1$ is higher than the closing price at time node $t$, the sample is marked as [0, 1]; otherwise it is marked as [1, 0].

(4) Model objective function: $N$ represents the number of samples in the training set and $t$ represents the number of data points in the time node. The loss function measures the difference between the probability distribution predicted by the model and the actual values, and the model is trained by minimizing this loss.

(5) Model evaluation indicators: The evaluation indicators for model prediction generally include accuracy (ACC), precision, recall, $F_1$ value, and Matthews correlation coefficient (MCC), which are defined as

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN},$$

$$F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \qquad \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$

where $TP$, $TN$, $FP$, and $FN$ are the numbers of true positives, true negatives, false positives, and false negatives, respectively.

Data Acquisition.
Investigating the impact of different investment returns on investors through questionnaires is subjective. Investors are often clear-headed when filling in investment questionnaires, and their answers tend to be one-sided. Therefore, questionnaire information is one-sided and subjective for the purposes of this paper. Moreover, the respondents may have the following problems when filling in the questionnaire: (1) they have not invested or are not interested in investment; (2) they may be emotionally unstable when filling in the questionnaire, which makes their answers subjective; (3) the questionnaire cannot reflect the mentality of investors at the moment of decision well, because the environment is different [15]. Therefore, the validity of the data obtained is poor, and the results may even be completely opposite to investors' actual behavior when making decisions. Selecting the actual trading of investors as the research example is a better choice. Therefore, this paper selects the trading records of investors in the securities market as the research object. First of all, the trading records of investors are objective, and the income of investors is objective.
Moreover, the Shanghai Stock Exchange stipulates that each ID card can open only one personal account, so each account is unique; we draw conclusions by analyzing the relationship between an investor's transactions and income. Secondly, the differences in investors' income are obvious, and stocks are easy to realize when investors invest in securities. Investors often show a certain degree of continuity when they invest in securities, which is conducive to research. At the same time, investors' trading data are also easy to obtain. The changes in the other two groups of data are conducive to studying the impact of investment income on decision-making. When analyzing the overall sample, the data are collected through Wind software, including the Shanghai Stock Exchange Index return, trading frequency, trading volume, and other data.

Data Processing and Filtering.
We collected data from January 1, 2008, to December 31, 2013. In order to make the data more meaningful, we set a node every half year. For example, for the period from January 1, 2008, to June 30, 2008, we computed each investor's income, trading frequency (the investor's total transaction volume during the period divided by the investor's daily average asset value within the half year), and daily average asset amount. Half a year was chosen as the node mainly for the following reasons: (1) investors may have short-term emotional fluctuations that affect their decision-making; (2) investors may have a gambling mentality in a particular investment; (3) investors may make a risky investment decision upon hearing some news. Taking half a year as the node helps to eliminate the impact of such effects [16].
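The half-year node and the trading-frequency ratio described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `2008H1` label format, and the sample figures are hypothetical and do not come from the paper.

```python
from datetime import date


def half_year_node(day):
    """Label a calendar date with its half-year node, e.g. '2008H1'."""
    half = 1 if day.month <= 6 else 2
    return f"{day.year}H{half}"


def trading_frequency(total_transaction_volume, daily_average_assets):
    """Trading frequency as described in the text: the total transaction
    volume in a half-year divided by the daily average asset value in
    that half-year."""
    if daily_average_assets <= 0:
        raise ValueError("daily average assets must be positive")
    return total_transaction_volume / daily_average_assets


# Hypothetical figures for one investor in one half-year.
node = half_year_node(date(2008, 6, 30))
frequency = trading_frequency(500_000.0, 100_000.0)
```

In this made-up case the investor turned over five times the average asset value during the first half of 2008.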
In screening the data, we select only the data from the second half of 2008 to the first half of 2013 as the research object. When selecting the data, we found that the number of investors increases every year. Therefore, we took the first half of 2008 as the node and selected the investors who had entered the market before January 1, 2008, as the research object; the data of a total of 19027 investors who entered the market later were excluded. In addition, because we need to analyze the information of customers who traded during this period, only investors whose total transaction amount falls within a certain range are extracted, and investors whose transaction amount is lower than a certain amount are excluded. The selected data cover income from the second half of 2008 to the second half of 2012, and transaction frequency and daily average asset information from January 1, 2009, to June 30, 2013. From the second half of 2008 to the second half of 2012, there are a total of 9 sets of data. Some extreme values are observed in the return rate and transaction frequency, and the data with obvious errors are eliminated. After analysis, the wrong data may be caused by systematic statistical errors. After processing, the valid data of 6999 investors are finally selected. The obtained data are calculated as follows [17]:

$$\mathrm{CIA} = \mathrm{CA} - \mathrm{OA},$$

where CIA is the change in assets; CA is the closing assets; and OA is the opening assets.
$$\mathrm{TPL} = \mathrm{CIA} - \mathrm{NDW} - \mathrm{TIMV} + \mathrm{TOMV},$$

where TPL is the trading profit and loss; NDW is the net deposit and withdrawal; TIMV is the transfer-in market value; and TOMV is the transfer-out market value. Investors' transfers in and out include not only capital but also stock transfers. The stock-transfer part is very small, and its impact on this study is ignored.
$$Y = \frac{\mathrm{TPL}}{\mathrm{ADA}},$$

where Y is the yield and ADA is the average daily assets, that is, the sum of daily assets divided by the number of days.
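The quantities defined above can be chained together as in the following sketch. It rests on assumptions that should be read as such: the record layout and function names are hypothetical, and the trading profit and loss is taken to be the change in assets net of deposits and of stock transferred in, plus stock transferred out, which is an accounting reading of the definitions rather than a formula confirmed by the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HalfYearRecord:
    """One investor's figures for one half-year (field names are hypothetical)."""
    opening_assets: float             # OA
    closing_assets: float             # CA
    net_deposit_withdrawal: float     # NDW
    transfer_in_market_value: float   # market value of stock transferred in
    transfer_out_market_value: float  # TOMV
    daily_assets: List[float]         # asset value on each day of the period


def change_in_assets(rec):
    """CIA: closing assets minus opening assets."""
    return rec.closing_assets - rec.opening_assets


def trading_profit_and_loss(rec):
    """TPL under the assumption stated in the lead-in:
    CIA minus net deposits and transfers in, plus transfers out."""
    return (change_in_assets(rec)
            - rec.net_deposit_withdrawal
            - rec.transfer_in_market_value
            + rec.transfer_out_market_value)


def average_daily_assets(rec):
    """ADA: sum of daily assets divided by the number of days."""
    return sum(rec.daily_assets) / len(rec.daily_assets)


def yield_rate(rec):
    """Y: trading profit and loss divided by average daily assets."""
    return trading_profit_and_loss(rec) / average_daily_assets(rec)


# Hypothetical investor: assets grow by 12000, of which 5000 was deposited.
example = HalfYearRecord(
    opening_assets=100_000.0,
    closing_assets=112_000.0,
    net_deposit_withdrawal=5_000.0,
    transfer_in_market_value=0.0,
    transfer_out_market_value=0.0,
    daily_assets=[100_000.0, 104_000.0, 108_000.0, 112_000.0],
)
```

Separating the deposit from the change in assets is the point of the calculation: in this made-up case only 7000 of the 12000 increase counts as trading profit, so the yield is computed on that amount.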
When analyzing the relationship between the overall Shanghai Stock Index return and investment decisions (measured by transaction amount and transaction volume), we selected the data from May 5, 2008, to November 27, 2009, in order to make the sample as representative as possible. By observing the trend chart of the Shanghai Stock Index, we found that this period covers a relatively complete historical cycle from a bear market to a bull market.

Result Analysis and Discussion
The analysis of previous studies shows that the VAR model can be used to study the relationship between stock market returns and investor sentiment in China, and Wang Yu concluded that the positive relationship between the sentiment index and returns with a lag period is significant. Khalfaoui et al. [18] made an empirical study of the relationship between investor sentiment and A-share returns by taking the trading frequency, closed-end fund discount, IPO return, PE ratio, and PB ratio as indicators in the VAR model. Nikou et al. [19] used the vector autoregressive model and the GARCH model to conduct empirical research on stock market returns. Because the data used in this paper are of a particular kind and differ from those of previous studies, their methods were not used for verification. Instead, the author analyzes these data in a different way. The data processing and analysis of this paper are described below.

Individual Sample Analysis of the Impact of Yield on Investment Decisions.
We analyze and process the data in MATLAB, and the statistical results are shown in Table 1. Figure 4 shows that the counts for the second case are large in the first four comparisons, which is in line with the special situation of the two years from 2008 to 2009. The securities market was highly volatile: the Shanghai Stock Index fell from more than 5000 points in early 2008 to 1660 points in late 2008, and then rose to 3500 points in 2009. The fluctuation amplitude reached as much as 200%. Figure 5 shows that such a large amplitude had a certain impact on investors' decision-making. Investors' emotions in such a period are quite tense, so the transaction frequency in the period also increases considerably. As the statistics in this paper are compiled on a half-year basis, they are delayed, which has a certain impact on the results. The reason for the negative correlation may be that when the recent volatility of the stock market is relatively large, investors' enthusiasm for participation is high and they become more cautious. Therefore, investors are more sensitive to the rate of return. As a result, investors watch the market constantly and trade frequently, and funds move in and out quickly. As the data in this paper take half a year as the node, the second case will occur when stocks fluctuate severely in the short term. In fact, the second case can be analyzed in the overall sample, because in the short term investors do not move their funds elsewhere but keep watching the market and move in and out quickly. In this case, investors prefer short-term operations. During the analysis, we also found that if trading days are taken as the time interval, the returns of the last few trading days also show a positive correlation with the trading frequency.
Mathematical Problems in Engineering
Table 2 shows the impact of investors' past returns on daily average assets clearly. According to the statistics, the sum of the results of the first case and the third case reaches 75%, so the return obviously has a positive correlation with daily average assets. Figure 6 shows that, in line with the earlier reasoning, when an investor's past return rate increases, the investor's daily average assets in the next period also increase correspondingly, and the increase is greater than the investor's return rate multiplied by the previous daily average assets. This indicates that the investor's past return affects the investor's decision-making behavior and has a positive impact on the investor's investment amount.
This is consistent with the earlier theory of investors' psychological decision-making. When the market is good, investors increase their investment funds. Similarly, when the market is bad, income declines, and investors temporarily withdraw some or even all of their funds from the securities market.
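The share of each case can be recomputed directly from the counts in Table 2. The sketch below is only an arithmetic check: the counts are copied from rows 1 to 8 of the table (the last row of the table equals their column sums), while the function and variable names are illustrative.

```python
# Counts copied from rows 1 to 8 of Table 2 (Case 1, Case 2, Case 3).
TABLE_2_ROWS = [
    (5878, 863, 258),
    (5729, 997, 273),
    (3928, 2631, 440),
    (4377, 2178, 444),
    (4741, 1693, 565),
    (4355, 1957, 687),
    (3339, 2790, 870),
    (4043, 1995, 961),
]


def case_shares(rows):
    """Return each case's share of all investor-period observations."""
    totals = [sum(column) for column in zip(*rows)]
    grand_total = sum(totals)
    return [total / grand_total for total in totals]


shares = case_shares(TABLE_2_ROWS)
case_1_and_3 = shares[0] + shares[2]
```

With the counts as printed, Case 1 and Case 3 together come to roughly 73% of the 55992 investor-period observations, of the same order as the figure quoted above.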

Table 1: Statistics of past earnings and transaction frequency.

Number   Case 1   Case 2   Case 3
1          1359     4620     1020
2          1831     4280      888
3          2713     3271     1015
4          1414     4220     1365
5          2818     2115     2066
6          2543     1785     2671
7          1844     1892     3263
8          2286     1497     3216
9         16808    23680    15504

Table 2: Statistics of past earnings and daily average assets.

Number   Case 1   Case 2   Case 3
1          5878      863      258
2          5729      997      273
3          3928     2631      440
4          4377     2178      444
5          4741     1693      565
6          4355     1957      687
7          3339     2790      870
8          4043     1995      961
9         36390    15104     4498

Overall Research on the Impact of Yield Rate on Financial Risk.
The example above is based on statistics of individual sample data taken from securities companies, that is, the investment situation of single individuals. In order to find out the relationship between past returns and investment decisions more rigorously, the relationship between the overall stock return of the market and investment decisions is analyzed. The yield can directly affect investors' investment decisions, but its effect on financial risk is indirect, because investors' investment decisions are the main factor that affects overall financial risk. First, the sample data are obtained. The overall rate of return is represented by the trend of the Shanghai Stock Exchange Index, while investment decision-making is described by the daily trading volume and daily transaction amount. In order to obtain a complete stock index trend, we take the period from May 5, 2008, to November 27, 2009 (the trading day is taken as the analysis interval here, and the trend of the Shanghai Stock Exchange Index during this period is relatively complete, covering both a bear market and a bull market). The daily rise and fall, daily trading amount, and daily trading volume of the Shanghai Stock Index are exported through Wind. The empirical test of the VAR model is carried out on the return, trading volume, and trading amount of the Shanghai Stock Exchange Index. The empirical results in Table 3 are obtained by importing the data into EViews 6.0. The characteristic roots of the VAR model are then found. Figure 7 shows the distribution of the characteristic roots of the VAR model.
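A characteristic-root plot of this kind is read against the unit circle: a vector autoregression is stable when every eigenvalue of its coefficient (companion) matrix has modulus below 1. The sketch below illustrates the check for a two-variable VAR(1), for which the eigenvalues have a closed form. It is not the paper's model, which has three variables and was estimated in EViews; the coefficient values and function names are hypothetical.

```python
import cmath


def var1_roots(a11, a12, a21, a22):
    """Eigenvalues of the 2x2 coefficient matrix A of a VAR(1)
    y_t = A y_{t-1} + e_t, from the characteristic polynomial
    lambda^2 - trace(A) * lambda + det(A) = 0."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(trace * trace - 4.0 * det)  # complex-safe square root
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)


def is_stable(roots):
    """A VAR is stable when all roots lie strictly inside the unit circle."""
    return all(abs(root) < 1.0 for root in roots)


# Hypothetical coefficient matrices, chosen only to show both outcomes.
stable_roots = var1_roots(0.5, 0.1, 0.2, 0.3)
explosive_roots = var1_roots(1.1, 0.0, 0.0, 0.5)
```

Using `cmath.sqrt` keeps the check valid when the roots are complex conjugates, which is why such plots are drawn in the complex plane rather than on a line.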
Figure 7 shows that the characteristic roots of the VAR model all lie within the unit circle, which indicates that the model is stable. The characteristic root distribution map is drawn against a circle with a radius of 1. The fact that all characteristic roots fall inside this circle indicates that the data are stable, which the author also reads as evidence of a strong correlation between the series.
Figure 8 shows that the indicators selected in this paper are the actual transactions and income of investors, which excludes the influence of subjective factors. Therefore, a distinguishing feature of this paper is that it is supported by rarely available actual data. In addition, this paper uses two samples to support the study. The individual sample uses the transaction records of 6999 individual investors to study the impact of individual investors' past investment returns on investment decisions. The study of the Shanghai Stock Exchange Index (000001.SH) considers the problem at the level of the overall sample. Analyzing long-term and short-term impacts from the individual sample to the overall sample helps to improve the credibility of the study.
Figure 9 shows that, although the data are actual customer data, the author can only consider investors' decision-making in the securities market and cannot observe their investments in other markets. Therefore, the study has certain limitations. At the same time, it cannot exclude cases in which an individual transfers funds out of the stock market because of an urgent need for funds or for other reasons. Therefore, the research method has certain defects. In addition, since the system that produced the data was launched at the end of 2007, the author could only obtain data from 2008 onward, and the time span of the data is not large enough.
In particular, it is a pity that the three years of data from 2006 to 2008, which span a pronounced bull and bear market, could not be obtained. In the process of data processing, owing to the author's limited ability and the characteristics of the data, no general model could be used to analyze the individual sample data and draw a conclusion. Instead, the very large data set was sorted, classified, and counted to test the characteristics of the data, and methods of probability and statistics were used for the analysis.
A healthy securities market should function as a barometer of the economy and develop steadily. It should not be a place for investors to gamble. When the market rises and falls steeply, investors are left largely passive. At present, China's economy is still developing; its overall volume is relatively large, but it is still in a rising stage. At the same time, regulators should formulate policies suited to market development in order to keep up with the needs of the economy and the capital market, and should standardize the investment market, so that the capital market, especially the securities market, can serve the real economy well, rather than becoming an ATM for a few people and an investment graveyard for the vast majority of ordinary investors. A healthy capital market should benefit the country and the people. Only by mastering the trend of the securities market, analyzing the psychology of investors, and formulating more market-oriented rules can the securities market develop healthily over the long term. In addition, the regulatory authorities need to do a good job in supervision and management, strengthen investors' awareness of investment risk to protect the interests of investors, and strengthen the management of listed companies [20]. The coping strategies for financial risks are mainly reflected in the psychological adjustment of investors themselves. In addition, the government should supervise the financial market well and support the majority of shareholders.

Conclusions
This paper analyzes the characteristics of investors' behavioral psychology based on multisource heterogeneous data, studies behavioral finance in combination with the characteristics of human behavioral psychology, examines the characteristics of investment decision-making behavior and financial risk prediction, and finally obtains the following conclusions through empirical analysis:
(1) According to the analysis of the individual samples and the overall sample, investors' past earnings have a positive impact on their investment amount, and more recent earnings (especially those of the last one or two trading days) have a greater impact on investment decision-making, which is consistent with the characteristics of China's securities market and of investors' investment behavior. Investors are more active in short-term arbitrage, which differs from the low trading frequency in foreign securities markets. This is because individual investors in China's securities market are more active and account for a large proportion.
(2) This paper analyzes the behavior of individual investors and of market groups, as well as the characteristics of long-term and short-term investment behavior. It is concluded that investors' past returns, whether short term or long term, have a positive impact on investment decisions. This paper analyzes the data of individual samples and of the overall sample, on a half-yearly basis for the former and on each trading day of the Shanghai Stock Exchange for the latter, and thus makes a more comprehensive data analysis with examples.
(3) When studying investors' decision-making behavior, this paper finds through examples that the stock market return is the Granger cause of investors' decision-making trend, which is similar to the conclusion of many domestic scholars that returns have a positive impact on investors' sentiment. The empirical research shows a positive correlation in the prediction of financial risk; that is, the analysis can be extended to the whole investment market.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The author declares that there are no conflicts of interest.