Small and micro enterprises play an important role in economic growth, technological innovation, employment, and social stability. Because they lack credible financial statements and reliable business records, they face financing difficulties, which has become an important factor hindering their development. This paper therefore proposes a credit risk measurement model based on an integrated algorithm that combines an improved GSO (Glowworm Swarm Optimization) with ELM (Extreme Learning Machine). First, according to the growth and development characteristics of small and micro enterprises in the big data environment, the formation mechanism of their credit risk is analyzed from the perspectives of granularity scaling, cross-border association, and global view driven by big data, and an index system for comprehensive credit measurement is established by summarizing and analyzing the factors that affect the credit evaluation indexes. Second, a new algorithm is built on the parallel integration of the good point set adaptive glowworm swarm optimization algorithm and the extreme learning machine. Finally, the integrated algorithm based on improved GSO and ELM is applied to credit risk measurement modeling, sample data of small and micro enterprises in China are collected, and simulation experiments are carried out in MATLAB. The experimental results show that the model is effective, feasible, and accurate. The results provide a reference for solving the credit risk measurement problem of small and micro enterprises and lay a foundation for theoretical research on credit risk management.
In recent years, China’s economy has maintained a good momentum of development. The number of domestic enterprises has grown steadily, especially small and micro enterprises, which have become a large and dynamic group among the main participants in the market economy and an important part of China’s economy. However, small and micro enterprises also face severe financing difficulties in the process of survival and development. Because of their own weaknesses and political, economic, legal, and other external factors, they have few financing channels and high financing costs. From a macro perspective, China’s market economy system is still imperfect and the credit system is defective, so small and micro enterprises cannot obtain a comprehensive and objective credit evaluation, and the financial investment industry has not paid enough attention to them, resulting in narrow financing channels. From a micro perspective, small and micro enterprises have light assets, small scale, and opaque financial situations, so their credit risk is difficult to assess reasonably. In addition, private lending carries high interest rates and costs, which makes financing a difficult problem.
The measurement of credit risk of small and micro enterprises is a research hotspot of scholars at home and abroad. In China, Man et al. [
In conclusion, current research on credit risk measurement methods for small and micro enterprises has been discussed from two perspectives: constructing the credit evaluation index system and building the risk measurement model. Foreign scholars emphasize quantitative research, mining the credit data of small- and medium-sized enterprises for research and evaluation. Domestic scholars pay more attention to qualitative research on the construction of the indicator system and to quantitative research on credit risk measurement. In addition, big data provides more comprehensive and accurate digital management for many subject areas. Chen and Wu [
First of all, the characteristics of small and micro enterprises and the impact of big data on small and micro enterprises’ credit evaluation are analyzed in depth from the perspective of granularity scaling, cross-border association, and global view driven by big data, and the mechanism of small and micro enterprises’ credit risk formation is explored. On this basis, credit risk measurement indicators are selected to establish a credit risk measurement indicator system. Secondly, due to the non-linear inherent relationship of credit risk data, adaptive learning characteristics of neural network have become the most common and relatively accurate classifier in credit risk measurement [
Therefore, the ELM feedforward neural network is employed to solve the problem of credit risk measurement. Because the number of hidden layer nodes has a great influence on the classification accuracy, the initial weights, thresholds, and hidden layer node parameters of ELM are optimized by the improved GSO algorithm. The resulting parallel integrated learning algorithm, based on the improved GSO algorithm and the ELM neural network, is applied to the credit risk measurement of small and micro enterprises.
The research in this paper mainly includes 3 parts. Section
Metaheuristic algorithm is an improvement of heuristic algorithm, which is the combination of random algorithm and local search algorithm. Traditional metaheuristic algorithms include tabu search algorithm, simulated annealing algorithm, genetic algorithm, ant colony optimization algorithm, particle swarm optimization algorithm, artificial fish swarm algorithm, artificial bee colony algorithm, artificial neural network algorithm, glowworm swarm optimization algorithm, etc. At present, there are several new metaheuristic algorithms, such as Henry gas solubility (HGS) [
GSO algorithm [
The main idea of the GSO algorithm is that each glowworm in the search space represents a feasible solution of the optimization problem. Each glowworm carries its own fluorescein value and sensing radius. Its brightness is tied to the objective value at its location: a brighter glowworm has a better objective value. During the iterations, a brighter glowworm is more attractive and draws other glowworms toward it. Each glowworm also has its own decision radius, which is affected by the number of neighboring glowworms. When few glowworms are nearby, the decision radius increases so that more neighbors fall within it; when many glowworms are nearby, the decision radius shrinks. Eventually, most glowworms gather at several positions with better objective function values. The GSO algorithm is described by the following mathematical formulas:
The GSO algorithm is described as follows:
Step 1: Initializing the relevant parameters of the algorithm, including population size, iteration times, and other parameters, to be set.
Step 2: The objective function value
Step 3: Within the radius of its dynamic decision domain
Step 4: Selecting the object to move and updating the position
Step 5: Updating the dynamic decision domain radius
Step 6: Judging whether the algorithm reaches the maximum iterations or not. If not, turn to step 2; otherwise, end.
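The six steps can be sketched in code. Because the formulas are not reproduced in this text, the sketch below follows the commonly published form of basic GSO (fluorescein decay `rho`, enhancement `gamma`, radius gain `beta`, desired neighbor count `n_t`, fixed step `step`); these names, their default values, and the `gso` function itself are illustrative assumptions, not the authors' MATLAB implementation. The problem is treated as maximization, and the best position seen so far is tracked for convenience.

```python
import math
import random

def gso(objective, bounds, n=50, iters=100, rho=0.4, gamma=0.6,
        beta=0.08, n_t=5, step=0.03, r_s=3.0, l0=5.0, seed=0):
    """Basic glowworm swarm optimization for maximization (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: random initial positions, equal fluorescein, equal decision radius.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    luc = [l0] * n
    rad = [r_s] * n
    best = max(pos, key=objective)[:]
    for _ in range(iters):
        # Step 2: convert the objective value into a fluorescein value.
        luc = [(1 - rho) * l + gamma * objective(p) for l, p in zip(luc, pos)]
        new_pos = [p[:] for p in pos]
        new_rad = rad[:]
        for i in range(n):
            # Step 3: neighbors are brighter glowworms inside the decision radius.
            nbrs = [j for j in range(n)
                    if luc[j] > luc[i] and math.dist(pos[i], pos[j]) < rad[i]]
            if nbrs:
                # Step 4: pick a neighbor with probability proportional to the
                # fluorescein difference, then move one step toward it.
                weights = [luc[j] - luc[i] for j in nbrs]
                j = rng.choices(nbrs, weights=weights)[0]
                d = math.dist(pos[i], pos[j])
                if d > 0:
                    new_pos[i] = [min(max(a + step * (b - a) / d, lo), hi)
                                  for a, b, (lo, hi) in zip(pos[i], pos[j], bounds)]
            # Step 5: grow the radius when neighbors are scarce, shrink it otherwise.
            new_rad[i] = min(r_s, max(0.0, rad[i] + beta * (n_t - len(nbrs))))
        pos, rad = new_pos, new_rad
        cand = max(pos, key=objective)
        if objective(cand) > objective(best):
            best = cand[:]
    # Step 6: the loop ends when the maximum number of iterations is reached.
    return best, objective(best)
```

To minimize a function with this sketch, pass its negative as `objective`.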
Because the initial solutions are unevenly distributed in the solution space, the basic GSO algorithm is unstable, converges slowly, and has low accuracy. To avoid premature convergence of the GSO algorithm, Good Point Set (GPS) theory is employed to generate a uniformly distributed initial glowworm population. At the same time, a new inertia weight function for glowworm movement dynamically updates the moving step length (an adaptive step size), which further improves the stability, convergence speed, and accuracy of the GSO algorithm.
The theory of good point set [
Let
According to number and dimension of the sample
Set
It can be seen from [
Because the initial glowworm population is constructed with good point set theory, its calculation accuracy is independent of the dimension. Designing the initial population uniformly in this way therefore overcomes the shortcomings of traditional methods and produces an initial population with better diversity.
The number of initial glowworm population is Generating good point set by exponential sequence method: Generating good point set by square root sequence method: Generating good point set by circle Division method:
Using the good point set method to design the initial glowworm population uniformly can produce the initial population with better diversity. Figures
Initial population generated by the random method.
Initial population generated by exponential sequence method.
Initial population generated by square root sequence method.
Initial population generated by circle Division method.
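As an illustration of how such an initial population can be generated, the sketch below uses the circle division construction in its commonly cited form: take the smallest prime p with (p - 3)/2 >= s for dimension s, set r_k to the fractional part of 2cos(2πk/p), and let the i-th point have coordinates equal to the fractional parts of i·r_k. This form is assumed from the general good point set literature because the formulas are not reproduced in this text, and the function names `good_point_set` and `init_population` are hypothetical.

```python
import math

def _is_prime(m):
    """Trial-division primality check, sufficient for small m."""
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))

def good_point_set(n, dim):
    """Return n points in the unit cube [0, 1)^dim (circle division method)."""
    p = 2 * dim + 3                      # smallest candidate with (p - 3) / 2 >= dim
    while not _is_prime(p):
        p += 1
    # r_k = fractional part of 2*cos(2*pi*k/p), k = 1..dim
    r = [(2 * math.cos(2 * math.pi * k / p)) % 1.0 for k in range(1, dim + 1)]
    # i-th point: fractional parts of i * r_k
    return [[(i * rk) % 1.0 for rk in r] for i in range(1, n + 1)]

def init_population(n, bounds):
    """Scale the unit-cube good points to the search bounds [(lo, hi), ...]."""
    unit = good_point_set(n, len(bounds))
    return [[lo + u * (hi - lo) for u, (lo, hi) in zip(row, bounds)]
            for row in unit]

population = init_population(50, [(-5.0, 5.0), (-5.0, 5.0)])
```

Unlike random initialization, the construction is deterministic, so repeated runs start from the same evenly spread population.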
In the GSO algorithm, each glowworm has a different search range determined by its sensing radius. Whether GSO finds a global or a local optimal solution depends on whether the individual can move within the sensing range. As the number of iterations increases, glowworm individuals tend to converge near a peak. At this point, if the distance between a glowworm and the peak is less than the moving step, the individual overshoots to the other side of the peak. In the next iteration it overshoots back again, so it still fails to reach the peak. This repeated movement back and forth around the peak is called the oscillation phenomenon. To solve this problem, the step size must be adjusted dynamically according to the search results of different stages, so as to balance global optimization ability, convergence speed, and optimization accuracy. Based on the idea of inertial weight of particle velocity in particle swarm optimization algorithm [
The inertia weight function of glowworm moving step is shown in Figure
Inertia weight function of glowworm moving step.
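The oscillation effect and the benefit of a shrinking step can be seen in a one-dimensional toy example. The sketch below is only an illustration: the paper's own inertia weight function is not reproduced in this text, so a linearly decreasing weight in the style of particle swarm optimization is assumed, and `inertia_weight`, `walk_to_peak`, and all parameter values are hypothetical.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def inertia_weight(t, t_max, w_max=1.0, w_min=0.05):
    """Assumed weight: decreases linearly from w_max (t = 0) to w_min (t = t_max - 1)."""
    return w_max - (w_max - w_min) * t / (t_max - 1)

def walk_to_peak(x0, s0, t_max, adaptive):
    """Move a glowworm toward a peak at x = 0 and return its final distance.

    With a fixed step the glowworm overshoots once it is closer than s0 and
    then jumps back and forth; with the decreasing weight the overshoot
    shrinks together with the step.
    """
    x = x0
    for t in range(t_max):
        step = s0 * inertia_weight(t, t_max) if adaptive else s0
        x -= step * sign(x)      # always move one full step toward the peak
    return abs(x)

fixed_error = walk_to_peak(1.0, 0.3, 50, adaptive=False)
adaptive_error = walk_to_peak(1.0, 0.3, 50, adaptive=True)
```

With these settings the fixed-step walk keeps alternating between roughly 0.1 and 0.2 from the peak, while the adaptive walk ends no farther from the peak than its final step of `0.3 * 0.05`.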
The traditional single hidden layer feedforward neural network is mainly trained by gradient descent, as in the BP neural network algorithm. Its learning speed is often too slow for practical needs, it is easily trapped in a local optimal solution, and its parameters must be retuned for each application scenario.
In 2004, Huang et al. [
Suppose the number of neurons in the hidden layer of ELM is
The learning goal of single hidden layer neural network is to minimize the output error, which can be expressed as
That is, there are
It can also be expressed as
In order to train single hidden layer neural network, we hope to get
The traditional learning algorithm based gradient needs to adjust all parameters in the process of iteration. In the ELM algorithm, once the input weight Input weight and hidden layer threshold are given randomly According to the input of training data and the activation function of hidden layer, the output matrix According to the formula (
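The ELM training procedure described here (random input weights and thresholds, hidden layer output matrix, least squares output weights) can be sketched with NumPy. This is a minimal sketch under stated assumptions: a sigmoid activation, weights drawn uniformly from [-1, 1], and the Moore-Penrose pseudo-inverse for the least squares solution; the function names are hypothetical and the toy data are synthetic.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H for a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_train(X, T, n_hidden, rng):
    """Train an ELM: random W and b are fixed, only beta is solved for."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden thresholds
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T     # least squares solution of H @ beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for new inputs."""
    return hidden_output(X, W, b) @ beta

# Toy regression problem with synthetic data (not the paper's data).
rng = np.random.default_rng(0)
X = rng.random((30, 3))
T = np.sin(X.sum(axis=1))
W, b, beta = elm_train(X, T, n_hidden=15, rng=rng)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
```

No iteration is involved: the only fitted quantity is `beta`, which is why training is fast but the result depends on the random draw of `W` and `b`.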
Because the construction of credit risk measurement indicators of small and micro enterprises has three characteristics of big data problem [
Referring to the credit indicator systems used by Moody’s Investors Service and Standard & Poor’s Corporation, and combining them with the development characteristics of small and micro enterprises driven by big data, we pay full attention to the impact of enterprise owners’ social platform and e-commerce platform data on the credit status of small and micro enterprises. The credit status of enterprise operators, enterprise innovation ability, enterprise competitiveness, and staff quality are highlighted so as to reasonably reflect the credit level of small and micro enterprises. 7 first-level indicators and 22 second-level indicators are selected. SPSS software is used for factor analysis to quantify the correlation between indicators, and principal component analysis is used to extract factors from the indicators. Finally, 7 primary indicators and 10 secondary indicators are selected as the credit risk measurement indicator system, which is shown as Table
Description of index system.
Index category | Specific index |
---|---|
Enterprise credit | |
Development capacity | |
Risk level | |
Profitability | |
Quality of enterprise owner | |
Operating capacity | |
Innovation ability |
ELM randomly initializes the connection weights between the input layer and the hidden layer, together with the hidden layer thresholds, before training. It needs no iterative learning and directly calculates the least squares solution of the output weight matrix. Although learning is fast and parameter adjustment is simple, the robustness of the model is greatly affected when the training data set contains noise or is unevenly distributed [
The idea of the IGSO-ELM integrated learning algorithm is to determine the structure of ELM according to the input and output parameters, which in turn determines the coding length of each individual glowworm. Each individual in the population contains the initial weights and thresholds of ELM. That is, the initial weights and thresholds of ELM are obtained by decoding the glowworm individuals in the IGSO algorithm, and the IGSO algorithm is then used to optimize them. In this way, a parallel and interactive learning algorithm combining IGSO and ELM is built, and the optimal ELM weights and thresholds are finally obtained. The flow chart of the IGSO-ELM is shown in Figure
The flow chart of the IGSO-ELM.
The number of ELM input nodes is determined by the credit risk measurement index of small and micro enterprises. The number of hidden layer nodes is decided by the number of samples. The output node indicates whether the credit record of small and micro enterprises is in default. The IGSO-ELM algorithm is implemented as follows:
Step 1. Encoding: The parameters of ELM, such as weight
Step 2. Initializing parameters: Set the size of glowworm population
Step 3. Calculating glowworm fitness: Decode the glowworm, generate the weight and threshold of ELM, train and test the network to get the network test error (MAE), employ the error indices as the fitness of each glowworm, and update the fluorescein value of each glowworm by formula (
Step 4. Updating glowworm locations: Calculate the neighborhood set by formula (
Step 5. Updating decision radius: Update the sensing radius
Step 6. Judging: Judge whether the output accuracy of ELM meets the end condition. If it does, the optimal results are given to the ELM network to produce the output result, and the iteration ends. Otherwise, judge whether the maximum number of iterations has been reached; if not, turn to Step 3; otherwise, end.
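Steps 1 and 3 (encoding a glowworm as ELM parameters and scoring it by test error) can be sketched as follows. The sketch assumes a flat encoding in which the first part of the position vector holds the input weights and the remainder holds the hidden thresholds, a sigmoid activation, and a 0.5 decision threshold for the default label; `genome_length`, `decode`, and `fitness_mae` are hypothetical names, and the data are synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def genome_length(n_inputs, n_hidden):
    """Coding length of one glowworm: all input weights plus all thresholds."""
    return n_inputs * n_hidden + n_hidden

def decode(genome, n_inputs, n_hidden):
    """Split a glowworm position into input weights W and thresholds b."""
    genome = np.asarray(genome, dtype=float)
    W = genome[: n_inputs * n_hidden].reshape(n_inputs, n_hidden)
    b = genome[n_inputs * n_hidden:]
    return W, b

def fitness_mae(genome, X_train, y_train, X_test, y_test, n_hidden):
    """Train the ELM output weights for this glowworm and return the test MAE."""
    W, b = decode(genome, X_train.shape[1], n_hidden)
    beta = np.linalg.pinv(sigmoid(X_train @ W + b)) @ y_train
    y_hat = sigmoid(X_test @ W + b) @ beta
    labels = (y_hat >= 0.5).astype(float)   # 1 = default, 0 = no default
    return float(np.mean(np.abs(y_test - labels)))

# Synthetic stand-in data: 10 indicators, 20 hidden nodes.
rng = np.random.default_rng(1)
X = rng.random((60, 10))
y = (X[:, 0] > 0.5).astype(float)
glowworm = rng.uniform(-1.0, 1.0, genome_length(10, 20))
error = fitness_mae(glowworm, X[:45], y[:45], X[45:], y[45:], n_hidden=20)
```

Since a lower error is better while brighter glowworms attract others, an optimizer that maximizes brightness would use a decreasing function of this error, such as its negative.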
The performance evaluation criterion is an indispensable part of the measurement model. A proper evaluation criterion assesses the accuracy of different models, allows them to be compared with each other, and can also be used to define a warning threshold [
All code in this experiment was written on the MATLAB R2013a platform. The experiments were run on a PC with an Intel(R) Core(TM) i7-7200U CPU at 2.71 GHz, 8.00 GB of memory, and a 64-bit Windows 10 operating system.
In order to verify the effectiveness of the IGSO algorithm based on good point set theory and the adaptive step length strategy, the following 10 standard benchmark functions are selected for testing, as listed in Table
Description of the 10 benchmark functions.
Name | Expression |
---|---|
Bohachevsky | |
Rosenbrock | |
Matyas | |
Booth | |
Ackley | |
Rastrigin | |
De Jong | |
Zakharov | |
Griewank | |
Sphere |
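For readers who want to reproduce a test of this kind, several of the listed functions are sketched below in their textbook forms. The expressions used in the paper are not reproduced in this text, so the exact variants, dimensions, and search ranges used by the authors may differ; the definitions here are assumptions based on the standard optimization literature.

```python
import math

def sphere(x):
    return sum(v * v for v in x)

def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

def matyas(x):
    """Two-dimensional; minimum 0 at (0, 0)."""
    return 0.26 * (x[0] ** 2 + x[1] ** 2) - 0.48 * x[0] * x[1]

def booth(x):
    """Two-dimensional; minimum 0 at (1, 3)."""
    return (x[0] + 2 * x[1] - 7) ** 2 + (2 * x[0] + x[1] - 5) ** 2

def ackley(x):
    n = len(x)
    s1 = sum(v * v for v in x) / n
    s2 = sum(math.cos(2 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20.0 + math.e

def rastrigin(x):
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2 * math.pi * v) for v in x)

def griewank(x):
    s = sum(v * v for v in x) / 4000.0
    p = math.prod(math.cos(v / math.sqrt(i + 1)) for i, v in enumerate(x))
    return s - p + 1.0
```

All of these are minimization problems with a global minimum value of 0, so they must be negated before being passed to an optimizer that maximizes brightness.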
The parameters of the two GSO algorithms are setting as follows: the maximum iteration
The initial perception domain and the dynamic decision domain for the test function
In Table In terms of calculation accuracy, in 30 times repeated experiments, Table In terms of convergence speed, it can be seen from Figure In terms of stability, from the variance value in Table
Results of 30 independent experiments.
Function | Algorithm | Best | Worst | Mean | Var |
---|---|---|---|---|---|
GSO | 5.714551 | 1.357698 | 2.663643 | 8.017839 | |
IGSO | 2.502637 | 1.234684 | 2.009736 | 9.760807 | |
GSO | 1.420992 | 3.564882 | 7.118286 | 7.676224 | |
IGSO | 7.677080 | 2.221346 | 2.191989 | 1.864490 | |
GSO | 1.222893 | 5.271113 | 1.113159 | 2.846101 | |
IGSO | 1.198294 | 1.188994 | 1.263033 | 1.971471 | |
GSO | 2.260599 | 1.572854 | 4.585418 | 1.992108 | |
IGSO | 1.717953 | 1.958495 | 2.995911 | 2.124073 | |
GSO | 1.765632 | 5.699369 | 1.609170 | 2.202052 | |
IGSO | 1.772661 | 1.295454 | 1.700449 | 1.248972 | |
GSO | 1.369385 | 8.083180 | 1.081428 | 2.499092 | |
IGSO | 1.291843 | 5.871926 | 9.933543 | 2.571551 | |
GSO | 2.567627 | 3.065898 | 2.598793 | 4.499324 | |
IGSO | 1.019086 | 1.821743 | 1.283270 | 1.182948 | |
GSO | 1.132974 | 8.152062 | 6.449133 | 2.696393 | |
IGSO | 7.456504 | 1.612441 | 2.003430 | 1.531690 | |
GSO | 3.492894 | 1.771344 | 8.484216 | 9.609253 | |
IGSO | 3.345529 | 2.298949 | 1.376466 | 2.255257 | |
GSO | 3.418968 | 1.703526 | 8.689655 | 8.823356 | |
IGSO | 1.000030 | 1.874430 | 2.203945 | 5.477851 |
Convergence curves of the tested benchmark functions.
Take function
In summary, compared with the original GSO algorithm, the IGSO algorithm greatly improves convergence speed and calculation accuracy and has some advantage in stability.
The credit data of small and micro enterprises in China are employed to test the IGSO-ELM model. The experimental data span 2017 to 2018 and include the following types of information: (1) personal information of the legal representatives of small and micro enterprises; (2) economic and financial ratio data of small and micro enterprises; and (3) current credit data of small and micro enterprises. Personal information of enterprise owners and financial and economic data are from the CSMAR Taian Financial Research Database, and enterprise credit data are obtained from the Sesame Credit business service platform. A total of 549 samples of small and micro enterprises have been obtained and processed. According to the definition used by financial institutions, a loan is considered to be in default if it is overdue for more than 15 days. Among the samples, 308 small and micro enterprises did not default, accounting for 56.11%, and the remaining 241 small and micro enterprises defaulted to different degrees, accounting for 43.89%. In order to effectively compare the classification models (BPNN, ELM, ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM), the data set is randomly divided into two disjoint subsets, of which 75% form the training subset and 25% form the testing subset. Ten cross tests are used for each model. The advantage of cross testing is that the credit model can use the available data (75% of the samples) to the maximum extent.
According to the research results in Section
The parameters of model related to IGSO and GSO algorithms are set as the population size
For the BPNN model, the initial weights and thresholds are obtained by the Levenberg-Marquardt algorithm. The transfer functions of the hidden layer and the output layer are the sigmoid function and a linear function, respectively. The number of training iterations of the BPNN model is $10^{3}$, the MSE target is $10^{-2}$, and the learning rate is $10^{-1}$. The number of hidden layer nodes in the BPNN and ELM models is determined by stepwise trial calculation: according to the size of the training sample, the number of hidden layer nodes is increased step by step and fixed at the value where the classification accuracy reaches its maximum. The calculation results show that the number of hidden layer nodes of the ELM model is 20.
For ABC-ELM algorithm [
For PSO-ELM algorithm [
IGSO-ELM starts from 5 hidden nodes and gradually increases the number of hidden nodes to optimize the sample classification accuracy. The number of hidden nodes of IGSO-ELM is determined as 20, which is shown as Figure
Prediction classification accuracy corresponding to increasing number of hidden nodes in IGSO-ELM.
In order to compare the convergence effect of IGSO-ELM model, GSO-ELM model, PSO-ELM model, and ABC-ELM model, Figure
Relationship between MSE and the number of iterations.
In Table
Classification Accuracy of 6 machine learning models.
Name | Best (%) | Worst (%) | Mean (%) | Var (%) |
---|---|---|---|---|
BPNN | 50.9091 | 36.3636 | 40.6153 | 32.4702 |
ELM | 81.8182 | 63.6364 | 74.2518 | 33.3884 |
ABC-ELM | 89.0909 | 80.0000 | 83.9580 | 10.6115 |
PSO-ELM | 90.9091 | 78.1818 | 87.2727 | 11.3264 |
GSO-ELM | 90.9091 | 80.0000 | 85.3286 | 9.4113 |
IGSO-ELM | 92.7273 | 85.4545 | 88.8951 | 6.7586 |
Test set output results of six algorithms.
Firstly, it can be seen from Table
Secondly, compared with the ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models, the BPNN and ELM models, which are two kinds of single hidden layer feedforward neural network, have lower classification accuracy. The best classification accuracy of the BPNN model is only 50.91%, and the best classification accuracy of the ELM model is only 81.82%. This shows that optimizing a single hidden layer feedforward neural network with swarm intelligence optimization algorithms is an effective way to improve classification accuracy.
Finally, it can be seen from Table
In order to better illustrate the advantages of IGSO-ELM model, three commonly used evaluation indexes of machine learning, which are MAE, RMSE, and
It can be seen from Table
Performance of 6 machine learning models.
Name | MAE | RMSE | CA | |
---|---|---|---|---|
BPNN | 0.4909 | 0.7006 | NaN | 50.9091 |
ELM | 0.1818 | 0.4264 | 0.3686 | 81.8182 |
ABC-ELM | 0.1091 | 0.3303 | 0.5952 | 89.0909 |
PSO-ELM | 0.0909 | 0.3015 | 0.6668 | 90.9091 |
GSO-ELM | 0.0909 | 0.3015 | 0.6261 | 90.9091 |
IGSO-ELM | 0.0727 | 0.2697 | 0.7105 | 92.7273 |
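The MAE, RMSE, and percentage accuracy figures in the table can be related to each other with a short worked example. For 0/1 class labels every error has absolute value 1, so the MAE equals the misclassification rate, the RMSE is its square root, and the accuracy is 100 × (1 − MAE). The sketch below assumes a test set of 55 samples; this size is an inference from the fact that every reported accuracy is a multiple of 1/55 and is not stated in the text. The label vectors are synthetic.

```python
import math

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def accuracy_percent(y_true, y_pred):
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def with_errors(y_true, n_errors):
    """Copy of y_true with the first n_errors labels flipped."""
    return [1 - t if i < n_errors else t for i, t in enumerate(y_true)]

y_true = [1] * 25 + [0] * 30             # 55 synthetic test labels

few = with_errors(y_true, 4)             # 4 of 55 misclassified
many = with_errors(y_true, 27)           # 27 of 55 misclassified

print(round(mae(y_true, few), 4), round(rmse(y_true, few), 4),
      round(accuracy_percent(y_true, few), 4))
# 0.0727 0.2697 92.7273
print(round(mae(y_true, many), 4), round(rmse(y_true, many), 4),
      round(accuracy_percent(y_true, many), 4))
# 0.4909 0.7006 50.9091
```

Under this assumption, 4 and 27 misclassified samples reproduce the IGSO-ELM and BPNN rows respectively.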
In Table
Because they lack reliable financial statements and operating records, small and micro enterprises face financing difficulties, which has become an important factor restricting their development. The credit status of small and micro enterprises plays an important role in their financing, so it is of great significance to study their credit risk measurement. Therefore, a credit risk measurement model based on the improved GSO algorithm and the ELM algorithm was proposed in this paper. Firstly, according to the growth and development characteristics of small and micro enterprises in the big data environment, the formation mechanism of their credit risk was analyzed from the perspectives of granularity scaling, cross-border association, and global view driven by big data, and a comprehensive evaluation index system was built by summarizing and analyzing the factors influencing the credit evaluation indicators. Secondly, the traditional GSO algorithm was improved by good point set theory and a variable step size strategy, and 10 standard benchmark functions were selected to test the effectiveness of the IGSO algorithm. The experimental results showed that the IGSO algorithm improved considerably on the GSO algorithm in stability, accuracy, and convergence speed. On this basis, the integrated algorithm based on the improved GSO and ELM was established: the number of hidden layer nodes in ELM was determined by a step-by-step trial method, and the weights and thresholds of ELM were then optimized by the improved GSO algorithm. Finally, simulation experiments verified that ELM is a simple and effective method for establishing the credit risk measurement model of small and micro enterprises. Thus, a credit risk measurement model of small and micro enterprises based on the IGSO-ELM integrated algorithm was proposed.
Sample data of small and micro enterprises in China were collected, and the simulation experiments were carried out in MATLAB. The experimental results showed that the model was effective, feasible, and accurate compared with the BPNN, ELM, ABC-ELM, PSO-ELM, and GSO-ELM models. The research results of this paper can provide a reference for solving the problem of credit risk measurement of small and micro enterprises and lay a foundation for theoretical research on credit risk management.
The [.xlsx] data used to support the findings of this study are currently under embargo, while the research findings are commercialized. Requests for data, 6 months after publication of this article, will be considered by the corresponding author.
The authors declare that they have no conflicts of interest.
This study was supported by the fund of Philosophy and Social Science Planning Project of Anhui Province (No. AHSKY2018D09).