Personal Credit Default Discrimination Model Based on Super Learner Ensemble

Assessing the default of customers is an essential basis for personal credit issuance. This paper develops a personal credit default discrimination model based on a Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six single classifiers, such as logistic regression and SVM, and three homogeneous ensemble classifiers, such as random forest, to build a base classifier candidate library for the Super Learner. Then, we use the ten-fold cross-validation method to train the base classifiers and improve their robustness. We compute each base classifier's total loss from the difference between the predicted and actual values and establish a base-classifier weighting optimization model that solves for the optimal weights minimizing the weighted total loss of all base classifiers. Thus, we obtain the heterogeneous ensembled Super Learner classifier. Finally, we use three real credit datasets from the UCI database (Australian, Japanese, and German) and the large credit dataset GMSC published by the Kaggle platform to test the ensembled Super Learner model's effectiveness, employing four commonly used evaluation indicators: the accuracy rate, type I error rate, type II error rate, and AUC. Compared with the base classifiers' classification results and heterogeneous models such as Stacking and Bstacking, the results show that the ensembled Super Learner model has higher discrimination accuracy and robustness.


Introduction
The analysis of default is a useful credit risk assessment tool. Its core notion is to utilize observable borrower characteristic variables to build a classification model and predict new borrowers' default probability, providing a fundamental theoretical basis for loan approval, quota setting, and interest rate determination at banks and other financial institutions. Relevant research shows that even a 1% increase in a default discrimination model's prediction accuracy significantly reduces possible losses due to bad debts and increases financial institutions' profitability [1].
More and more scholars have attached importance to assessing personal credit defaults, and a series of studies on default discrimination models have emerged. The representative approaches primarily rely on statistical methods and artificial intelligence. The personal credit default discrimination model started with traditional statistical methods. Typical models include the Z-score model [2], the probit analysis method [3], and the logistic analysis model [4]. However, these models may require strict assumptions.
Moreover, it is difficult for them to deal with increasingly large index systems, which limits their practical application. The development of artificial intelligence has produced machine learning techniques that can handle high-dimensional data without strict assumptions, such as artificial neural networks (ANNs) [5][6][7], support vector machines (SVMs) [8], and decision trees (DTs) [9]. These single-classifier learning algorithms can better solve nonlinear problems and improve prediction accuracy. However, a single classifier has its own disadvantages and limitations when dealing with different credit risk assessment problems. To improve single-classifier performance, machine learning research has gradually moved toward ensemble models, which overcome these disadvantages. The ensemble classifier performs better than a single classifier [10], so it has become a research hotspot of current personal credit scoring models. Although ensemble models are more costly than single classification models, the benefits of improved credit scoring accuracy make up for the higher operating costs. Hence, ensemble models are more suitable for today's financial institutions [11].
In formulating credit scoring models, single classifiers such as logistic regression, decision trees, and support vector machines often serve as the base classifiers for ensembled models [12,13]. The main ways of combining base classifiers fall into three types: Bagging [14], Boosting [15], and Stacking [16], where Bagging and Boosting are typical homogeneous ensembles and Stacking is a heterogeneous ensemble. Previous studies found that ensembling multiple differentiated single classifiers can better solve the overfitting problem of single classifiers and obtain better prediction performance and generalization ability on most unbalanced datasets. The heterogeneous ensemble, composed of multiple different base classifiers, is increasingly becoming a research hotspot [17].
Although the Stacking heterogeneous ensemble algorithm is a remarkably effective method for default discrimination, it has received relatively little attention [18]. Besides, existing works rarely consider the selection of base classifiers and usually directly integrate all candidate base classifiers. Therefore, classifiers with poor prediction performance reduce the performance of the final ensembled model. This negative impact led to the development of the selective ensemble, which chooses base classifiers with excellent performance and assigns different weights according to each classifier's performance. Lessmann et al. focused on the selective ensemble model and proved that it performs relatively well [11]. However, the selective ensemble model has not received attention in prior default discrimination research. When constructing a default discrimination model, no single model is always the best, because the studied problem, the data structure, and the evaluation indicators all complicate model formulation. Therefore, an ensembled model that can select its base classifiers independently under different conditions can optimize the ensemble and thus improve classification performance and robustness.
To make up for the shortcomings of the above research, we use Super Learner, an evolution of the Stacking algorithm, to improve the model's classification performance and its ability to adapt to different datasets.
The Super Learner ensemble algorithm is a heterogeneous ensemble algorithm based on loss minimization, which was proposed by Van der Laan et al. and proved to be optimal [19].
Super Learner integrates a variety of commonly used classification models based on cross-validation theory. It solves the optimal weighting model of the base classifiers by minimizing the classification loss, so the classification models are selected automatically and the ensemble model's classification accuracy and robustness are ensured. The Super Learner ensembled algorithm has been studied extensively in medicine and the social sciences [20][21][22]. Although Super Learner has proven to perform well in many settings, its performance in personal credit default discrimination requires further study.
This paper applies the Super Learner to personal credit default discrimination and utilizes its data adaptability to better deal with unbalanced credit datasets. The classifiers commonly used in default discrimination models are combined, namely, nine prediction methods: logistic regression, lasso regression, K-nearest neighbor, SVM, neural network, decision tree, random forest, GBDT, and XGBoost, to build a personal credit default discrimination model with better prediction performance and robustness. The rest of the paper is organized as follows. Section 2 reviews the application of ensembled models in the field of credit default discrimination. Section 3 details the principle of the personal credit default discrimination model based on the Super Learner algorithm. Section 4 elaborates on the Super Learner-based heterogeneous ensemble model. Section 5 provides real cases to test the proposed model and analyzes the experimental results. Section 6 concludes.

Related Work
This section presents the related work on ensemble classification model construction.

Homogeneous Ensemble.
Homogeneous ensemble methods, such as Bagging or Boosting, use a set of classifiers fitted by the same classification algorithm. Although the algorithm is the same, the classifiers differ in the training datasets or input features used, so each fitted classifier has its own characteristics. The single classifiers are usually combined by majority voting or weighted voting to obtain the final classification [23]. Previously, default discrimination research mostly established models in homogeneous ensemble form, and much practical experience and theory have shown that combining models improves prediction accuracy. Paleologo et al. proposed a subagging ensemble for the high class imbalance and missing data in credit datasets and conducted empirical research using samples of IBM Italian customers. The results show that the subagging ensemble with a decision tree as the base classifier improves classification performance while keeping the model simple and reasonably interpretable [24]. Yu et al. proposed an extreme-value ensembled machine learning method based on a multilevel deep belief network (DBN). They conducted empirical research using the Japanese credit dataset from the UCI database, proving that the proposed method effectively improves classification accuracy [25].

Heterogeneous Ensemble.
Compared with the homogeneous ensemble method, the heterogeneous ensemble method combines a variety of different classification algorithms. The idea is that different classifiers handle the same problem in different ways, complementing each other by increasing the diversity of the base classifiers and their predictions. In addition to weighted or simple voting, heterogeneous ensembled learning can also use more sophisticated methods to incorporate the single classifiers. The Stacking algorithm is a typical heterogeneous ensembled algorithm. Tsai et al. pointed out that a heterogeneous ensemble is superior to a homogeneous combination in prediction performance [31]. Nascimento et al. believe that different classification algorithms have different representational biases, giving the base classifiers output diversity and making the ensemble easy to adapt to different datasets [32]. Li et al. designed a multiround ensembled learning model based on a heterogeneous ensemble framework to predict default risk under the uneven distribution of credit data samples in the P2P lending market, i.e., the scarcity of default sample data; the model was tested and validated on real credit data from the P2P loan market [33]. Guo et al. pointed out that the accuracy of credit ratings affects financial institutions' risk control and profitability; to improve prediction performance and adapt to different credit datasets, they introduced a multistage adaptive classifier ensembling statistical and machine learning models and validated its effectiveness on three real datasets from UCI [34]. Papouskova et al. proposed a two-stage consumer credit risk model based on heterogeneous ensembled learning, which models default probability and default risk exposure in turn, thereby modeling the overall credit risk of consumer loans based on expected losses [35]. Plawiak et al.
offered a new deep genetic hierarchical network credit scoring model that integrates four primary learners: support vector machine, K-nearest neighbor, probabilistic neural network, and fuzzy system. They verified the model on the German credit dataset from the UCI database [36]. The previous research shows that, for default discrimination, heterogeneous ensemble models have better prediction performance and data adaptability than single classifiers and homogeneous ensemble models. In heterogeneous ensembled learning, the core means of improving prediction performance is to make the base classifiers both better and different, that is, to improve the accuracy and diversity of the base classifiers as much as possible, thereby improving the performance of the ensembled model [37]. In existing research, however, multiple base classifiers are usually fitted and ensembled directly without considering base classifier selection, so base learners with weak classification performance degrade the ensembled model's performance. Therefore, an ensemble model that can integrate various excellent and different classifiers, independently select the classifiers appropriate for the problem's data structure, and optimize the ensemble can obtain better classification performance and robustness.
In response to these problems, this paper applies the ensembled Super Learner model, which performs well in disease prediction, to the field of personal credit default discrimination. The Super Learner algorithm is an evolution of the Stacking algorithm. Previous research indicates that the Super Learner ensembled model proposed by Van der Laan et al. can independently select base classifiers according to the data structure of the dataset and the performance of each classifier, improving the model's classification performance and robustness [38][39][40][41][42][43][44]. Besides, even if none of the candidate models in the Super Learner's classifier algorithm library achieves a prespecified accuracy, the performance of the Super Learner is at least as good as, or asymptotically approaches, that of the best algorithm in the candidate library. By including more prediction algorithms in the candidate estimation library, Super Learner will asymptotically outperform any of its competitors. This paper introduces Super Learner into the field of default discrimination to predict borrowers' credit status based on minimizing the classification loss and thereby improve the performance of personal credit default discrimination.

The Principle of the Personal Credit Default Discrimination Model Based on the Super Learner Algorithm
Super Learner is a heterogeneous ensemble algorithm based on loss minimization, proposed by Van der Laan et al., who theoretically proved the prediction effectiveness of the algorithm [19]. The algorithm uses ten-fold cross-validation to train a variety of differentiated base classifiers.
With the goal of minimizing the weighted total loss of all base classifiers, an optimal weighted combination of base classifiers that minimizes the cross-validation classification loss is obtained, and a Super Learner heterogeneous ensembled classifier is constructed to improve the prediction accuracy and robustness of the model. The basic steps of Super Learner are as follows.
First, build a candidate algorithm library containing multiple base classifiers. The algorithm library should include multiple algorithms commonly used to solve the problem at hand.
An algorithm in the library can be either a simple single classification model or a complex model with hyperparameter settings. As long as an algorithm can output a fitted prediction function from the observation data, it is considered a classification algorithm.
Then, use the ten-fold cross-validation method to train the base classifiers in the candidate algorithm library to improve the robustness of the classification model, obtain ten sets of prediction results for each base classifier, and calculate each base classifier's ten fold losses from the deviation of the predictions from the true values; the total loss of a base classifier is the sum of its ten fold losses. The smaller a base classifier's total loss, the better its classification performance, and the larger the weight the ensemble process should give it. Conversely, the greater the total loss, the worse the discrimination performance and the smaller the classifier's role in the ensembled classifier; such a classifier may not even be considered in the ensemble process, which ensures the classification performance of the ensemble classifier.
Finally, with the objective of minimizing the total weighted loss of all base classifiers, the optimal weighting model of the base classifiers is established under the constraint that each base classifier's weight is nonnegative and the weights sum to 1. The selected base classifiers are then fitted on the complete dataset and combined with the optimal weights to construct the Super Learner ensembled classifier.
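The three steps above can be sketched end to end. The following is an illustrative Python implementation (not the authors' code): cross-validated predictions are collected for each base classifier in a small candidate library, then a nonnegative weight vector summing to one is solved for by minimizing the total squared-error loss; the library, dataset, and loss here are stand-in assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, random_state=0)  # stand-in credit data
library = [LogisticRegression(max_iter=1000),
           DecisionTreeClassifier(max_depth=4, random_state=0),
           RandomForestClassifier(n_estimators=100, random_state=0)]

# Z[i, k]: ten-fold cross-validated default probability of classifier k for borrower i
Z = np.column_stack([cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]
                     for clf in library])

def total_loss(alpha):
    # squared-error loss of the weighted combination of base classifiers
    return np.mean((y - Z @ alpha) ** 2)

K = len(library)
res = minimize(total_loss, x0=np.full(K, 1.0 / K), bounds=[(0, 1)] * K,
               constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1}])
alpha = res.x  # optimal weights; near-zero entries drop a classifier from the ensemble

# Refit every base classifier on the complete dataset and combine with alpha
for clf in library:
    clf.fit(X, y)

def super_learner_proba(X_new):
    return np.column_stack([c.predict_proba(X_new)[:, 1] for c in library]) @ alpha
```

A classifier whose optimal weight comes out (near) zero simply does not contribute to `super_learner_proba`, which is how the ensemble performs automatic model selection.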
The Super Learner ensembled algorithm has been extensively studied in medicine and the social sciences [20][21][22]. Existing research usually uses Super Learner to integrate the learning algorithms commonly used in a field, for example, to improve patient mortality prediction or predict personal behavior, and has shown empirically that its prediction performance is good. In view of this excellent predictive performance, this study applies the Super Learner ensembled algorithm to the field of personal credit, thereby constructing a personal credit default discrimination model with better classification performance and robustness.
In the personal credit problem, assume that the data structure of the current credit dataset is O_i = (X_i, Y_i) ~ P_0, i = 1, 2, ..., n, where O_i represents the observation data of the i-th borrower and P_0 its probability distribution. X_i = (X_i1, X_i2, ..., X_ip) represents the p observation indices of the i-th borrower, and Y_i represents the default state of the borrower, which is a binary result variable.
In forecasting research, researchers usually apply a variety of forecasting algorithms to estimate the probability distribution and look for the algorithm with the best forecasting performance. Previous research shows that, for a particular data distribution, one algorithm's prediction performance may be superior to the others'. However, in personal credit research, researchers cannot know in advance which classification algorithm is most suitable for a given credit dataset. Researchers usually evaluate a prediction algorithm's performance through a loss function: the algorithm with the best expected performance under the chosen loss function is regarded as the best prediction algorithm. Common loss functions include the absolute error loss, the squared error loss, and the negative log loss suitable for binary dependent variables. Super Learner refers to the expected value of the loss function as the risk; minimizing the risk corresponds, in personal credit problems, to minimizing the classification loss, on which basis the optimal weighted combination of classifiers is constructed. The estimation target for the best default discrimination model is Q_0(X) = E_0(Y|X), and the objective function is expressed as the minimum of the expected loss:

Q_0 = arg min_Q E_0[L(O, Q(X))]. (1)

The specific principles of Super Learner are as follows: (1) define the default discrimination candidate classifier library K, and denote the number of algorithms in the library by k(n) and the base classifiers by Ψ_k(X), k = 1, 2, ..., k(n).
(2) Divide the credit dataset into V mutually exclusive subsets of approximately equal size (V = 10 in this paper). (3) For each fold v = 1, ..., V, train every base classifier Ψ_k on the remaining V − 1 subsets. (4) Predict the held-out subset with each trained classifier; repeat the above steps V times, and put the prediction results obtained by each algorithm into an n × k(n) cross-validated prediction matrix Z, where Z_ik is the cross-validated prediction of Ψ_k for borrower i. (5) Compute each base classifier's cross-validation risk from the deviation between its column of Z and the true values Y_i. (6) Over all admissible weight combinations α = (α_1, ..., α_K) with α_k ≥ 0 and Σ_{k=1}^K α_k = 1, select the vector value that minimizes the cross-validation risk of the weighted candidate estimator Σ_{k=1}^K α_k Ψ_k, obtaining the optimal weight vector:

α̂ = arg min_α Σ_{i=1}^n L(Y_i, Σ_{k=1}^K α_k Z_ik). (2)

(7) Fit each selected base classifier Ψ_k(X) on the complete credit dataset, k = 1, ..., K, and combine it with the optimal weight vector α̂ obtained in the previous step to construct the Super Learner ensemble model:

Ψ_SL(X) = Σ_{k=1}^K α̂_k Ψ_k(X). (3)

The construction flowchart of the Super Learner is shown in Figure 1 [46]. The candidate base classifier library, constructed from a survey of the relevant literature, first uses single classifiers and then adds ensembled classifiers with excellent performance to further enhance the prediction performance and robustness of the model. The classification algorithm library in this article includes the logistic regression (LR), lasso regression (Lasso), K-nearest neighbor (KNN), support vector machine (SVM), neural network (NN), decision tree (DT), random forest (RF), GBDT, and XGBoost algorithms. Among them, logistic regression is the most commonly used single classifier in credit scoring and has good prediction performance, while random forest and GBDT/XGBoost are typical applications of the Bagging and Boosting ensembles, respectively, with excellent prediction performance compared with traditional single classifiers. These nine classifiers are therefore used as the base classifiers to obtain better prediction performance and robustness.

Establishment of the Personal Credit Default Discrimination Model Based on Super Learner
For comparability across all datasets, this paper keeps the base classifier candidate library unchanged. To further improve the performance of each classifier, this paper uses the enumeration method to optimize each classifier's parameters [47]. That is, the possible values of each parameter are arranged and combined, each combination is tried through loop traversal, and the best-performing combination is taken as the final parameter setting. A brief introduction and the parameter settings of each classifier are shown in Table 1.
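The enumeration method described above is an exhaustive grid search. A minimal sketch with scikit-learn's GridSearchCV follows; the parameter grid here is purely illustrative and is not the grid from the paper's Table 1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)  # stand-in credit data

# Every combination of these parameter values is tried via loop traversal
grid = {"n_estimators": [100, 300], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid,
                      cv=5, scoring="roc_auc")
search.fit(X, y)

best_rf = search.best_estimator_  # the best combination becomes the final setting
```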

Calculate the Total Classification Loss of the Candidate Base Classifier.
Cross-validation divides the original dataset into a training set and a test set: the training data train the classifiers, while the test data evaluate the classifiers' performance. Using cross-validation enhances the robustness of the classification model. The procedure divides the complete dataset into ten equal-sized subsets and uses nine of them as the training set to train each base classifier in the candidate classifier library, builds a prediction model, and tests its prediction performance on the held-out subset. After storing the prediction results, the procedure repeats ten times until each subset has served once as the validation set, thus obtaining ten sets of prediction results. The classification loss of each base classifier is then calculated from the true labels and the prediction results of each set of data, giving the base classifier's total loss. This process continues until the total loss of every base classifier in the candidate classifier library has been obtained. The smaller a base classifier's total loss, the better its classification performance, and the larger the weight the ensemble process gives it to ensure the ensembled classifier's classification performance. The greater the total loss, the worse the discrimination performance and the smaller the role the classifier plays in the ensembled classifier; such a classifier may not be considered at all in the ensemble process, so that the ensembled model obtains better prediction performance.
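The ten-fold total-loss computation described above can be sketched as follows. This is an illustrative Python fragment (not the authors' code); the two-classifier library, synthetic data, and squared-error loss on predicted default probabilities are assumptions for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=1)  # stand-in credit data
candidates = {"LR": LogisticRegression(max_iter=1000),
              "DT": DecisionTreeClassifier(max_depth=4, random_state=1)}

total_loss = {}
for name, clf in candidates.items():
    fold_losses = []
    for train_idx, test_idx in KFold(n_splits=10, shuffle=True,
                                     random_state=1).split(X):
        clf.fit(X[train_idx], y[train_idx])            # train on nine subsets
        p = clf.predict_proba(X[test_idx])[:, 1]       # predict the held-out subset
        fold_losses.append(np.sum((y[test_idx] - p) ** 2))  # loss of this fold
    total_loss[name] = sum(fold_losses)  # smaller total loss -> larger ensemble weight
```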

Solve the Optimal Weight of Each Base Classifier and Build the Super Learner Ensembled Model.
The choice of base classifiers is an integral part of an ensembled model. If the base classifiers are fixed, the model cannot adaptively select classifiers more suitable for the data structures of different datasets, and a better ensembled model cannot be obtained. In existing research, most ensembled credit scoring models directly integrate all the constructed base classifiers and rarely consider base classifier selection. Since every model then receives equal weight, base classifiers with poor prediction performance affect the performance of the final ensembled model. The emergence of the selective ensemble overcomes this shortcoming: it selects the best-performing base classifiers for the ensemble or gives different base classifiers different weights, and it has become a research hotspot in the field of ensemble learning.
The Super Learner theoretical framework builds an algorithm library containing weighted combinations of multiple prediction models and expects one of these weighted combinations to perform better than each individual prediction algorithm. On this basis, a candidate algorithm library is given in advance for the problem to be solved, an infinite family of weighted candidate combinations is constructed, and the optimal weighted combination is selected by minimizing the cross-validation loss. The previous step yielded the total loss of each base classifier in the candidate library. With the goal of minimizing the weighted total loss of all base classifiers, and under the constraints that each base classifier's weight is nonnegative and the weights sum to one, the optimal weighting model of the base classifiers is established: among all admissible weight combinations, the weight vector that minimizes the total weighted loss of all base classifiers is selected, giving the optimal weight of each base classifier. Different datasets have different data structures; some classifiers may not be considered at all during the ensemble process (i.e., their weight is 0) so that the ensembled model's prediction performance is better. In this way, the base classifiers most suitable for the credit dataset's data structure are chosen from the candidate library, and the optimal weight of each is obtained. Each selected algorithm is then refitted on the complete dataset to generate the final base classifiers, which are combined with their optimal weights to generate the Super Learner ensembled model.
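The constrained weighting model described above can be sketched in isolation. In this illustrative example (an assumption, not the paper's data), the columns of Z stand in for cross-validated predictions from a good classifier, a mediocre one, and pure noise; minimizing the weighted total loss under nonnegativity and sum-to-one constraints should push the noise column's weight toward zero.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)  # stand-in true default labels

# Synthetic cross-validated predictions: good, mediocre, and noise classifiers
Z = np.column_stack([np.clip(y + rng.normal(0, 0.2, 200), 0, 1),
                     np.clip(y + rng.normal(0, 0.6, 200), 0, 1),
                     rng.random(200)])

K = Z.shape[1]
res = minimize(lambda a: np.sum((y - Z @ a) ** 2),   # weighted total loss
               x0=np.full(K, 1.0 / K),
               bounds=[(0, 1)] * K,                  # nonnegative weights
               constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1}])
alpha = res.x  # the noise column should receive (near-)zero weight
```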

Experimental Dataset.
In the empirical study, four real credit datasets are used to evaluate the performance of the model: the Australian, German, and Japanese credit datasets from the UCI database [48] and the large real credit dataset Give Me Some Credit (GMSC) provided by the Kaggle platform. The details of the datasets are shown in Table 2.

Data Preprocessing.
In practice, missing data and outliers inevitably exist in credit data, which hinders the construction and application of the model. Data preprocessing makes the data more complete and standardized and is an indispensable step in the modeling process. In this study, data preprocessing includes three steps: missing value filling, dummy coding of qualitative indicators, and data standardization. After preprocessing the original data through these steps, new data are obtained.
In the multistep data preprocessing, the first step is to fill in the missing values according to the type of the affected feature: for categorical features, a new category is created to replace the missing values, and for numerical features, the mean value replaces the missing values. Then, the second step, dummy encoding, is performed. Because unordered feature values and multicategory feature values are not directly comparable, dummy variables are used to quantify the categorical variables by feature category. Generally, a feature with k categories is encoded as a set of k − 1 derived dummy variables, which effectively avoids multicollinearity while still representing all categories within the feature (the benchmark category, the base class, corresponds to all k − 1 dummy variables being 0). Finally, the third step, data standardization, is performed to eliminate numerical scale differences between features. Support vector machines and other classification models based on distance metrics are extremely sensitive to differences in the order of magnitude of the data; a vast difference in magnitude will cause significant classification error. To avoid the influence of magnitude differences on the classification results, the dataset should be standardized before building the model.
This paper uses the Z-score standardization method to standardize the data [39,41]. The Z-score standardization is

x′ = (x − x̄)/s,

where x′ represents the processed value, x is the original value, x̄ denotes the mean of the feature, and s denotes the standard deviation of the feature. The four credit datasets used in the empirical study are processed according to the multistep data preprocessing method described above. First, the missing values are filled in. For the datasets with missing values, the Japanese and GMSC datasets, the missing values of categorical indicators are filled with a new category and the missing values of numerical features are filled with the mean of the corresponding indicator, yielding complete datasets; the Australian and German datasets are already complete and need no processing. Then, the categorical variables are processed. The Japanese dataset contains categorical indicators and needs dummy-variable processing; the Australian and German datasets already have their categorical indicators processed, while the GMSC dataset contains only numerical indicators, so these need no separate processing. Finally, based on the complete datasets obtained after the two preceding steps, Z-score standardization of the numerical indicators yields the final datasets.
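The three preprocessing steps can be sketched with pandas on a toy frame. The column names below are illustrative assumptions, not fields of the UCI or GMSC datasets.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"income": [4200.0, np.nan, 3100.0, 5800.0],
                   "housing": ["own", "rent", np.nan, "own"]})

# 1) fill missing values: mean for numerical, a new category for categorical
df["income"] = df["income"].fillna(df["income"].mean())
df["housing"] = df["housing"].fillna("missing")

# 2) k-1 dummy variables per categorical feature (drop_first keeps a base class)
df = pd.get_dummies(df, columns=["housing"], drop_first=True)

# 3) Z-score standardization of numerical features: x' = (x - mean) / std
df["income"] = (df["income"] - df["income"].mean()) / df["income"].std()
```

After these steps, the frame contains no missing values, the three housing categories are represented by two dummies plus an implicit base class, and the numerical feature has zero mean and unit standard deviation.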
After multistep data preprocessing, the dataset is divided into the total training set and the test set according to the ratio of 8 : 2. It means 80% of the data is used to train the model, and 20% of the data is used to verify the effectiveness of the model. To further improve the performance of each model, the total training set containing 80% data is divided into two parts according to the same proportion, 80% is used as the training set and 20% as the verification set, and the enumeration method is used to adjust the model parameters.

Evaluation Indicators.
There are many indicators for evaluating the classification performance of a model; commonly used ones include the accuracy rate, recall rate, F value, and AUC value. Four evaluation indicators are used in this paper to estimate the performance of the model: accuracy rate, AUC, type I error rate, and type II error rate, which are obtained from the confusion matrix shown in Table 3. According to Table 3, a true positive (TP) is a sample whose actual value is zero and whose predicted value is zero; a false negative (FN) is a sample whose actual value is zero but whose predicted value is one; a false positive (FP) is a sample whose actual value is one but whose predicted value is zero; and a true negative (TN) is a sample whose actual and predicted values are both one.
Based on these quantities, the accuracy rate, the type I error rate, and the type II error rate are expressed as follows:

Accuracy = (TP + TN)/(TP + FN + FP + TN),
Type I error rate = FN/(TP + FN),
Type II error rate = FP/(FP + TN).

The ROC curve describes a binary classifier's performance as its recognition threshold changes; it is created by plotting the true positive rate against the false positive rate under various threshold settings. AUC is defined as the probability that the prediction model ranks a randomly selected positive instance higher than a randomly selected negative instance; hence, AUC can be computed as the area under the ROC curve. The larger the AUC value, the better the classifier's performance. In the credit risk literature, AUC is a suitable measure for performance evaluation due to its robustness to unbalanced data. AUC is calculated as

AUC = (Σ_{ins_i ∈ pos} rank_{ins_i} − M(M + 1)/2)/(M · N),

where rank_{ins_i} represents the index number of the i-th sample, that is, its position when the probability scores are ranked in increasing order; M and N are the numbers of positive and negative samples, respectively; and ins_i ∈ pos means that only the index numbers of positive samples are summed. Among the above evaluation indicators, the type II error rate refers to misjudging borrowers with a high probability of default as creditworthy, which causes larger losses for banks and financial institutions and is therefore a focus of this article. Besides, although the accuracy rate and similar indexes are commonly used to evaluate classification performance, AUC better evaluates model performance on unbalanced samples, so AUC also serves as a critical evaluation index.
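The rank-based AUC formula above can be checked numerically against scikit-learn's ROC-curve-based computation; the labels and scores below are made-up illustrative values.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.metrics import roc_auc_score

y = np.array([1, 1, 0, 0, 1, 0, 0, 1])                         # default labels
scores = np.array([0.9, 0.45, 0.35, 0.4, 0.65, 0.2, 0.7, 0.6])  # predicted probabilities

ranks = rankdata(scores)   # index numbers when sorted in increasing order
M = int((y == 1).sum())    # number of positive samples
N = int((y == 0).sum())    # number of negative samples

# Rank-based AUC: sum ranks of positive samples, subtract M(M+1)/2, divide by M*N
auc = (ranks[y == 1].sum() - M * (M + 1) / 2) / (M * N)

assert np.isclose(auc, roc_auc_score(y, scores))  # matches the ROC-curve definition
```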

Experimental Results and Analysis.
Using the three real credit datasets of Australia, German, and Japanese in the UCI database, the GMSC dataset of the Kaggle platform, and the constructed Super Learner ensemble model, the selection of base classifiers on each dataset is obtained, and the optimal weight coefficient of each base classifier is calculated. In solving for the optimal weights, the smaller a base classifier's total loss, the better its classification performance, and the larger the weight the ensemble process should give it to ensure the final classification performance of the ensembled classifier; conversely, the larger the total loss, the worse the discrimination performance and the smaller the classifier's role in the ensembled classifier. If a classifier's weight coefficient is 0, the classifier is not considered in the ensemble process, which means the final ensembled model achieves better discrimination accuracy without it. The experiment uses R 3.5.5 for the empirical analysis and the SuperLearner package to fit the relevant models.
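The weight optimization described above can be sketched as follows. This is a Python approximation of the idea, not the SuperLearner package's internal solver: given the out-of-fold predictions of the base classifiers, it minimizes the cross-validated squared loss of the weighted blend subject to nonnegative weights summing to one, so weak classifiers naturally receive weights at or near zero.

```python
import numpy as np
from scipy.optimize import minimize

def solve_sl_weights(Z, y):
    """Solve for nonnegative weights summing to one that minimize the
    cross-validated squared loss of the weighted prediction.
    Z: (n_samples, n_classifiers) out-of-fold predicted probabilities.
    y: (n_samples,) actual 0/1 labels."""
    k = Z.shape[1]
    loss = lambda w: np.mean((y - Z @ w) ** 2)
    cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
    bounds = [(0.0, 1.0)] * k
    w0 = np.full(k, 1.0 / k)                      # start from equal weights
    res = minimize(loss, w0, bounds=bounds, constraints=cons, method='SLSQP')
    w = np.clip(res.x, 0.0, None)                 # clean tiny negatives
    return w / w.sum()
```

Because the constraint set is a simplex and the loss is a convex quadratic, a base classifier whose predictions add nothing to the blend ends up with a weight of (or indistinguishable from) zero, which mirrors the behavior reported for the excluded classifiers.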
In the Australian dataset, the final ensemble model uses only three base classifiers, LR, KNN, and GBDT; the other classifiers, with weight coefficients of 0, do not participate in the final ensemble. Among the three classifiers used, GBDT has the largest weight coefficient, 0.7457, indicating that GBDT plays the most crucial role in the Super Learner ensembled model built on the Australian dataset. In the German dataset, the four classifiers LR, SVM, RF, and GBDT constitute the final Super Learner ensembled model, of which RF plays the largest role. On the Japanese dataset, the Super Learner ensemble process uses seven base classifiers, LR, KNN, SVM, NN, RF, GBDT, and XGBoost, and XGBoost plays the most critical role. In the GMSC dataset, the final ensemble model uses LR, KNN, DT, RF, GBDT, and XGBoost, among which GBDT plays the most critical role. The above results show that Super Learner integrates different base classifiers for credit datasets with different data structures and, according to each classifier's role in the ensemble process, gives different weights to different classifiers. Moreover, to ensure the classification performance of the ensembled classifier, the proposed Super Learner does not consider classifiers with little effect in the ensemble process. This design also reveals that the Super Learner ensembled model has good data adaptability and can choose classifiers independently according to the dataset's characteristics to build a Super Learner with better predictive performance, rather than integrating all candidate classifiers as traditional ensemble learning does, where base classifiers with inadequate classification performance can impair the ensembled model's accuracy.

The descriptions of the four datasets (sample counts, class sizes, and feature counts; column labels reconstructed from context) are as follows:

Dataset     Samples   Class 0   Class 1   Categorical   Numerical   Total features
Australian  690       307       383       6             8           14
German      1000      700       300       13            7           20
Japanese    690       307       383       11            4           15
GMSC        150000    139975    10025     0             10          10
Table 4 shows the classifiers' adaptive selection results on each dataset, that is, the classifiers used to construct the final model and their weight coefficients. Once the base classifiers and their corresponding weight coefficients are determined, they are combined with the base classifiers trained on the complete training set to construct the final Super Learner model. The performance of the various comparison models and the proposed Super Learner model on the different datasets is shown in Table 5, which reports the running results of the proposed model and 10 comparison models on the same datasets under the four evaluation indicators. The top three classifiers on each evaluation indicator are highlighted in bold. Because different datasets have different data structures and evaluation indicators, no single classifier is optimal everywhere: each classifier performs differently on each dataset, and each dataset has its own best-suited classifier. Super Learner and the neural network show excellent overall discriminative performance on the Australian dataset, where the discrimination accuracy of the classifier is 0.8913. On the German dataset, the accuracy of the heterogeneous ensemble model composed by majority voting is 0.785, and that of Super Learner is 0.78. Super Learner is slightly inferior to GBDT on the Japanese dataset, with an accuracy rate of 0.913, but it still maintains high overall accuracy. Another vital property of a classification model is the ability to maintain excellent performance across multiple datasets. On the Australian dataset, Super Learner and XGBoost perform better overall; on the German dataset, Super Learner and the heterogeneous ensemble model composed by majority voting perform better; on the Japanese dataset, the top two overall performers are GBDT and Super Learner; on the GMSC dataset, the accuracy of Super Learner is the highest, 0.9369, with GBDT and XGBoost second.
Thus, on these four credit datasets, the overall discriminant performance of the Super Learner we constructed remains the best or second best. Besides, Super Learner has better robustness than the other models and adapts better to different datasets. The discrimination accuracy of ensembled algorithms such as random forest and XGBoost is also high, but from the perspective of stability it is slightly inferior to the Super Learner algorithm used in this paper.
On the one hand, the above results prove the accuracy and robustness of the constructed Super Learner model; on the other hand, random forest, GBDT, and XGBoost are the mainstream models among Bagging and Boosting ensemble algorithms, and the heterogeneous ensemble model composed by majority voting is the most common heterogeneous ensemble model, so their good performance further shows that ensemble models are better than single classifiers in most cases. Figures 2-9 intuitively compare each classifier's performance on the four evaluation indicators (accuracy rate, AUC, type I error rate, and type II error rate) on the datasets. Figures 2 and 3 show the performance of the ten comparison models and the Super Learner ensemble classifier on the Australian dataset. The accuracy and AUC of each classifier show the same trend: the larger these two metrics, the smaller the classification loss. The type I error rate and type II error rate behave in the opposite direction: the smaller these two index values, the lower the classifier's error rate. Figure 2 shows the comparison results on accuracy and AUC: Super Learner has the highest accuracy, while its AUC is not the highest but remains in the top three. Figure 3 shows that the type I error rate and the type II error rate exhibit opposite trends; when a classifier's type I error rate is small, its type II error rate is relatively large, while our Super Learner performs well on both types of error rate, remaining the best or second best on each.
From Figures 4 and 5, we can clearly see that the Super Learner ensemble classifier has the highest accuracy and AUC among all classifiers on the German dataset, indicating its excellent discrimination performance. Although Super Learner's performance on the type I error rate and the type II error rate is not the best, it remains in the top three. Figures 6 and 7 show the performance of the ten comparison models and the Super Learner ensemble classifier on the Japanese dataset.
As can be seen from Figure 6, Super Learner is slightly inferior to GBDT in accuracy and AUC. Figure 7 compares the type I error rate and the type II error rate of each classifier; although Super Learner's performance on the two error rates is not the best, it still keeps in the top three. Figures 8 and 9 show the performance of the ten comparison models and the Super Learner ensemble classifier on the GMSC dataset. As Figure 8 shows, Super Learner is the best in both accuracy and AUC. Because the GMSC dataset is extremely unbalanced, every classification model's ability to identify the minority default samples is poor. Figure 9 shows the comparison results on the type I error rate and the type II error rate: the single classifiers perform better on the type I error rate but worse on the type II error rate, whereas the ensemble models perform better on the type II error rate, meaning their ability to identify default samples is stronger. Although the Super Learner model's performance on the type II error rate is not the best, it remains second. In most cases, although a single classifier is relatively simple to implement, its classification performance is inferior to its ensembled counterparts. At the same time, the Super Learner heterogeneous ensemble shows the potential to improve the default discrimination model's performance: compared with the other models, it achieves the best result or remains in the top three on the four evaluation indicators across all datasets. While maintaining high discriminating ability, the model's robustness is guaranteed, making it well suited as a practical tool for banks and other financial institutions to distinguish potential default borrowers.

Comparison of Other Studies.
According to the principle in Section 3 of this paper, the construction principle of the proposed Super Learner heterogeneous ensemble model is to consider the robustness of the base classifiers first and their accuracy second. Therefore, the Super Learner heterogeneous ensemble model may sacrifice some accuracy to ensure the robustness of default discrimination. This section compares the results of other researchers on the same credit datasets, further verifying the characteristics of the Super Learner heterogeneous ensemble model and demonstrating its applicability in the credit field.
The specific results are shown in Table 6. From the fifth and sixth columns of Table 6, it can be clearly seen that on the Japanese and GMSC datasets, the Super Learner model performs better than the other comparison models on every indicator. Columns 3 and 4 of Table 6 show the performance of the Super Learner model and the comparison models on the Australian and German datasets; although the Super Learner model is not the best there, it remains second best. From the above analysis, the Super Learner heterogeneous ensemble model shows good robustness, ranking in the top three on all indicators. In addition, it also has good accuracy, which is lower only than the GSCI model on the Australian dataset and the Bstacking model on the German dataset. This shows that the Super Learner heterogeneous ensemble model constructed in this paper has excellent robustness and good accuracy.

Conclusion
Establishing a borrower default discrimination model is an essential task for banks and other financial institutions when making loan decisions. Therefore, the discriminative performance and robustness of the default discrimination model are crucial to the profitability of financial institutions such as banks.
In this study, we utilized a heterogeneous ensemble default discrimination model. The ensembled Super Learner model, which determines the optimal combination of multiple base classifiers using cross-validation, has performed well in disease prediction in the medical field. This paper introduces the Super Learner algorithm into personal credit default evaluation research to build a heterogeneous ensemble default discrimination model with better default discrimination accuracy and robustness.
First, we construct a base classifier candidate library containing single classifiers with better prediction performance, such as logistic regression and SVM, and homogeneous ensemble classifiers with better performance, such as random forest. Second, we calculate the total classification loss of each base classifier: ten-fold cross-validation is used to train each base classifier in the candidate library separately, producing ten sets of prediction results per base classifier, and ten sets of losses are calculated from the deviation of the predictions from the actual values. The ten sets of losses of a base classifier are summed into its total loss. The smaller a base classifier's total loss, the better its classification performance and the larger the weight the ensemble process should give it to ensure the ensembled classifier's classification performance; the higher the total loss, the worse the discrimination performance and the smaller the classifier's role in the ensembled classifier, to the point that the classifier is not considered in the ensemble process at all. Then, we created the base classifier weight optimization model to solve for the optimal weight of each base classifier. The model aims to minimize the weighted total loss of all base classifiers, with the constraints that each base classifier's weight is nonnegative and the weights sum to one. Thus, we can find the combination of classifiers with the smallest cross-validation classification loss. Finally, in the empirical research, we use four commonly used indicators, accuracy, AUC, type I error rate, and type II error rate, as evaluation indicators to verify the effectiveness and reliability of the Super Learner ensembled model on the UCI (Australian, German, and Japanese) and GMSC datasets. In order to prove the superiority of the Super Learner model, we also compare its results with those of the base classifiers and heterogeneous ensemble models such as Stacking and Bstacking.
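The overall procedure summarized above (cross-validated base-learner predictions, simplex-constrained weight solving, and refitting on the full training set) can be sketched end to end as follows. This is an illustrative Python sketch using sklearn-style estimators, not the authors' R/SuperLearner implementation; the factory-function interface and names are our own assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

def super_learner_fit(X, y, learners, n_splits=10, seed=0):
    """learners: list of zero-argument factories returning fresh estimators.
    1) K-fold CV produces out-of-fold probabilities Z for each learner.
    2) Simplex-constrained weights minimize the CV loss of the blend.
    3) Each learner is refit on the full training set."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    Z = np.zeros((len(y), len(learners)))
    for tr, te in kf.split(X):
        for j, make in enumerate(learners):
            model = make().fit(X[tr], y[tr])
            Z[te, j] = model.predict_proba(X[te])[:, 1]
    loss = lambda w: np.mean((y - Z @ w) ** 2)
    res = minimize(loss, np.full(len(learners), 1.0 / len(learners)),
                   bounds=[(0.0, 1.0)] * len(learners),
                   constraints=({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},),
                   method='SLSQP')
    w = np.clip(res.x, 0.0, None)
    w /= w.sum()
    fitted = [make().fit(X, y) for make in learners]   # refit on all data
    return w, fitted

def super_learner_predict(w, fitted, X):
    """Blend the refit learners' probabilities with the solved weights."""
    P = np.column_stack([m.predict_proba(X)[:, 1] for m in fitted])
    return P @ w
```

Because the blended output is a convex combination of probabilities, it stays in [0, 1] and can be thresholded at 0.5 for default discrimination, mirroring how the final ensembled model is applied in the experiments.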

Data Availability
The data used to support the findings of this study are from a previous study [48].

Conflicts of Interest
The authors declare that they have no conflicts of interest.