An Ensemble Learning Method Based on an Evidential Reasoning Rule considering Combination Weighting

As an extension of Dempster–Shafer (D-S) theory, the evidential reasoning (ER) rule can be used as a combination strategy in ensemble learning to deeply mine classifier information through decision-making reasoning. The weight of evidence is an important parameter in the ER rule and has a significant effect on the result of ensemble learning. However, current methods for setting the weight of evidence are not ideal: leveraging expert knowledge to assign weights leads to excessive subjectivity, while sample-based statistical methods rely too heavily on the samples, so the determined weights sometimes differ greatly from the actual importance of the attributes. Therefore, to mitigate the excessive subjectivity and objectivity of the weights of evidence, and to further improve the accuracy of ensemble learning based on the ER rule, we propose a novel combination weighting method to determine the weight of evidence. The combined weights are calculated by using the proposed method to combine subjective and objective weights of evidence, and the regularization of these weights is studied. Then, the ER rule is used to integrate different classifiers. Five case studies on image classification datasets demonstrate the effectiveness of the combination weighting method.


Introduction
Ensemble learning is a branch of machine learning, and using decision-making reasoning as a combination strategy helps ensemble models integrate more effectively. Decision-making reasoning is the process of choosing among many options and summarizing evidence to draw a conclusion, so it can support the integration step of ensemble learning. The ER rule is a combination strategy [1] that can integrate classifiers through decision-making reasoning to obtain better results, so it is used as the decision-making strategy for ensemble learning in this paper.
In 2013, the ER rule, which considers the weight and reliability of evidence, was established by Yang and Xu [1]. The ER rule, which is an extension of the D-S theory [2,3], clearly distinguishes the importance and reliability of evidence and constitutes a general joint probabilistic reasoning process.
The counterintuitive problem encountered in Dempster's rule is solved by assigning weight and reliability to evidence. Due to its superiority, the ER rule has been applied in many different settings. A fault detection method based on data reliability and the interval ER rule was proposed, which improved the accuracy of fault detection [4]. A multi-index framework based on the ER rule was proposed that combines objective quality indicators, subjective expert judgements, and patient feedback, improving model robustness under uncertain conditions [5]. The above research provides a theoretical basis and technical support for the research and application of the ER rule. The ER rule has become an important element in the field of nonclassical reasoning; it can deeply mine classifier information through decision-making reasoning and improve the integrated model. The weight and reliability of evidence, as key parameters in the ER rule, have been extensively studied in existing research. There are generally three ways to set the weight of evidence in the ER rule. First, the weight of evidence in the traditional ER rule is given by expert knowledge. For example, the order relation analysis method (G1) was used to quantitatively evaluate the comprehensive intelligence of autonomous vehicles [6]. The weights used in the literature [7,8] were assigned by experts who actually assessed indicators from industrial processes. However, this kind of subjective human weighting is overly dominant and prone to biased results. Second, some researchers have recently improved the setting of the weight of evidence. For example, Wu et al. [9] proposed popular calibration weighting methods such as generalized regression and generalized exponential tilting and verified their effectiveness.
Three methods, the maximum deviation method, the coefficient of variation method, and the entropy weight method, were combined to establish an index optimal adaptive weighting model [10]. A novel weighted majority voting ensemble (WMVE) approach was proposed to assign different weights to classifiers [11]. A comprehensive evaluation method of power quality based on the weighted rank sum ratio method was proposed to correctly sort and classify the power quality of each bus [12]. Zhou et al. [13] carried out physical and chemical analyses and used the coefficient of variation, mean, and standard deviation in a model to predict the proximate and ultimate analysis and heating value from physical compositions. These methods assigned weights in an objective way, but relying heavily on the samples sometimes causes the weight setting to be inconsistent with the actual situation. Third, mathematical optimization was adopted to produce a dual-objective optimization model that was trained to obtain the weight of diagnostic evidence [14]. However, the mathematical optimization method needs a large number of samples, and it is difficult to obtain enough training samples in practice. Most of the proposed weighting methods are only applicable to a specific engineering background and are not universal. There is still a lack of a general method to determine the weight of evidence. Therefore, to set a more general and reasonable evidence weight, this paper proposes to use the combination weighting method to set the weight of evidence in the ER rule.
Combination weighting is a method that combines subjective weighting and objective weighting. This method can minimize the loss of information and reflect both subjective and objective information, thereby making the determined weights more reasonable. At present, the combination weighting method has been widely used, e.g., to weight the risk factors of fault modes to determine the risk priority of fault mode recognition [15]. This method was also used to determine the assessment factors of project managers, and a new method of manager performance evaluation was proposed [16]. A game theory method was proposed to combine the subjective and objective weighting methods; an evaluation system of the educational informatization index was established [17] and verified in the Suzhou area. An et al. used the IGAHP-CRITIC combination weighting method to effectively integrate subjective and objective weight information, which effectively reduced the uncertainty and incompleteness of an equipment quality state assessment [18]. The setting of weights directly affects the accuracy of ensemble learning using the ER rule: the complexity of objective things causes objective weighting to rely heavily on samples, and subjective weighting may lead to inaccurate weight settings due to the ambiguity of personal thought processes. Therefore, the literature [19] verified the rationality of the maximum-difference combination weighting by comparing the additive and multiplicative combination weighting methods with the maximum-difference combination weighting method, providing a new idea for combining subjective and objective weights. The weighting method of maximizing the difference was used to evaluate the operation status of the electricity market in the literature [20].
This method achieved excellent results, reflecting its practicality and innovation. The effectiveness of the combination weighting method was proven by these examples. The application of the ER rule to ensemble learning is a new research field in the context of artificial intelligence. In ensemble learning, the generation of classifiers and the choice of combination strategy have a great impact on the quality of the results [21]. At present, a variety of methods have been applied as combination strategies in ensemble learning, such as the voting method [22], the averaging method [23], the learning method [24], and D-S theory [25-27]. However, these methods have difficulty effectively mining the relationship between classifiers, and their ability to integrate classifiers is limited. Based on its advantages in handling uncertainty and reasoning, the ER rule allows integrated models to obtain better results than strategies such as voting. Additionally, the ER rule does not change the traditional probabilistic reasoning model, so it is more effective for classifiers to integrate this kind of uncertainty information. As a powerful inference engine and information fusion mechanism [1], the ER rule can effectively improve the effect of ensemble learning.
In summary, a new ensemble learning model is constructed based on the ER rule. The contributions of this paper are as follows: (1) The combination weighting method is used to determine the weight of evidence in the ER rule. It can further exploit the effective information in the process of combining evidence, take both subjective and objective factors of evidence into consideration, and set a more general and reasonable weight of evidence for the ER rule. (2) The ER rule is an information fusion method that does not involve labels. When setting weights, whether the combined weights need to be regularized has long been a matter of debate in existing research. Accuracy is used as an indicator to study the regularization of weights in the ER rule, and our own suggestions are given in this paper. The rest of this paper is organized as follows: the problem of using the combination weighting method to determine weights for the ensemble learning model based on the ER rule is defined in Section 2. The combination weighting method is constructed in Section 3. The ensemble learning model based on the ER rule is constructed in Section 4. The case studies used to verify the effectiveness of the combination weighting method are presented in Section 5. The conclusion of this paper is provided in Section 6.

Problem Definition
In the application of ensemble learning based on the ER rule, to demonstrate that the ER rule has a better improvement effect on the ensemble learning model, it is necessary to adopt appropriate weighting methods to set the weight of evidence. The main problems to be solved in this paper are as follows:

Problem 1 (construction of the combination weighting model and regularization of weights). At present, when assigning the weight of each classifier in ensemble learning, each classifier is treated as a piece of evidence. Classifiers are usually generated by black-box models and are not interpretable, so it is difficult to provide accurate weight values through expert knowledge. In addition, the number of classifiers used in ensemble learning is small, so it is difficult to determine the weights by mathematical optimization. Among the commonly used weighting methods, the subjective weighting method is strongly influenced by human factors, and the objective weighting method depends heavily on samples. Therefore, subjective weighting and objective weighting are combined in this paper. We establish a combination weighting model for the weights:

ω_k = F(ω⃗_k, ω⃖_k, c),    (1)

where ω_k is the combined weight of classifier k, F(·) is a combination method for subjective and objective weights, ω⃗_k is the subjective weight of classifier k calculated by the subjective weighting method, ω⃖_k is the objective weight of classifier k calculated by the objective weighting method, and c is the parameter set of the weight combination process. It is uncertain whether different combination weighting methods can exploit the advantages of both subjective and objective weights, which needs to be verified in studies. Moreover, the weight determined by the combination weighting method may exceed the admissible range of ω_k.
For combined weights that exceed the range of ω_k, regularization is necessary; for combined weights that do not exceed the range, whether regularization is required has always been a controversial issue when using the ER rule. Assuming there are n classifiers in total, the regularization of the weights can be described as

ω̃_k = ω_k / Σ_{i=1}^{n} ω_i,    (2)

where ω̃_k is the regularized result of the weight ω_k and ω_k is the combined weight of classifier k.
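The normalization above can be sketched in a few lines of Python (a minimal illustration; the function name is our own):

```python
def regularize(weights):
    """Rescale a list of combined weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]
```

For example, combined weights (0.6, 0.9, 1.5) regularize to approximately (0.2, 0.3, 0.5), each now lying in [0, 1].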
Problem 2 (construction of the ensemble learning model based on the ER rule). In current ensemble learning research, the main problem is how to use a suitable combination strategy to combine weak learners into strong learners. As an ensemble strategy, the ER rule can consider the relationship between classifiers, overcome the shortcomings of algorithms that only consider the numerical relationship of the classifier data, and significantly improve the effect of ensemble learning. Suppose a total of K classifiers are integrated; the process of the ensemble learning model based on the ER rule is

c(k) = f_k(d, τ),    (3)

where c(k) is the classification result of classifier k, k = 1, 2, ..., K, f_k(·) is the classification process of classifier k, d is the dataset, and τ is the parameter set of the classification process. After the classification results are obtained, the ensemble result u of the model can be described as

u = g(C(K), ψ),    (4)

where g(·) is the integration process of the model, C(K) is the set of classification results of the K classifiers, and ψ is the parameter set of the ensemble process. To solve the above problems, an ensemble learning model based on the ER rule considering combination weighting is proposed in this paper, as shown in Figure 1. It consists of two parts: combination weighting and the ER rule, which are introduced in detail in Sections 3 and 4, respectively.

Construction of Weighting Methods
The weight of evidence is an important part of the ER rule, and the weighting method is the focus of this paper. Some representative weighting methods [28] are selected and defined in this section. The weights identified in this section are used in Section 4.2.

Subjective Weighting.
The subjective weighting method assigns weights based on the subjective information of the decision-maker. This method can reflect the degree of importance the decision-maker attaches to different attributes and can flexibly capture the importance of each decision-making attribute. However, its flexibility and variability can introduce too much subjective arbitrariness. The subjective weighting method chosen in this paper is the analytic hierarchy process (AHP).

Analytic Hierarchy Process.
The analytic hierarchy process is based on the experience of the decision-maker, combining quantitative analysis with qualitative analysis, judging the relative importance of each measurement target, and providing the weights of the decision plan. The first step is to build a hierarchical structure model. The relevant factors are decomposed into several levels from top to bottom according to their different attributes, and the goals, criteria, and objects of decision-making are divided into the highest level, the middle level, and the lowest level, respectively. The second step is to construct a judgement matrix. Based on the accuracy of each classifier, the classifiers are compared in pairs, a scale value is assigned, and a matrix is formed. The third step is hierarchical single ranking and consistency checking. Hierarchical single ranking calculates, from the judgement matrix, the importance weight of each index at the current level with respect to an index of the upper level. Then the judgement matrix is checked for consistency to determine the weight of the classifier. The fourth step is hierarchical total ranking and consistency checking. After the comprehensive weights of all classifiers are calculated, they are sorted to obtain the importance results.
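The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is ours, weights are taken from the principal eigenvector of the pairwise judgement matrix, and consistency is checked with Saaty's random index table (here limited to n ≤ 9):

```python
import numpy as np

def ahp_weights(judgement):
    """Derive priority weights from a pairwise judgement matrix via the
    principal eigenvector, and report the consistency ratio (CR)."""
    A = np.asarray(judgement, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = int(np.argmax(eigvals.real))
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                                  # normalized weights
    # Consistency index CI = (lambda_max - n) / (n - 1), compared against
    # the random index RI for matrices of size n (Saaty's table).
    ri = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
          6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}
    ci = (eigvals[k].real - n) / (n - 1) if n > 1 else 0.0
    cr = ci / ri[n] if ri[n] > 0 else 0.0
    return w, cr
```

A judgement matrix is usually accepted when CR < 0.1; for a perfectly consistent matrix such as the ratios 4:2:1 between three classifiers, CR is essentially zero.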

Objective Weighting.
The objective weighting method uses the objective information of each attribute to determine the attribute weight, so the decisions or evaluation results have a strong theoretical basis. However, the objective weighting method cannot reflect the subjective wishes of the decision-maker, and the result sometimes goes against the actual importance of the attribute. The objective weighting methods used in this paper are the entropy weight method (EW), the coefficient of variation method (COV), and the CRITIC method.

Entropy Weight Method.
The entropy weight method is based on the amount of information contained in the data of the evaluation index system and determines the index weights by calculating the information entropy. The calculation steps are as follows.
Suppose there are I evaluation samples, K classifiers, and T categories. Then there are K × T evaluation indicators, where x_ij is the j-th index value of the i-th sample (i = 1, 2, ..., I; j = 1, 2, ..., KT). The indicators used in this paper are all positive, and the evaluation index value y_ij is obtained through Min-Max standardization:

y_ij = (x_ij − min_i x_ij) / (max_i x_ij − min_i x_ij).    (5)

Then, according to the evaluation index values y_ij, the normalized evaluation matrix is constructed as

Y_ij = y_ij / Σ_{i=1}^{I} y_ij.    (6)

After the standardized positive indicators are obtained, their information entropy is computed. The information entropy E_j of index j in the matrix is

E_j = −(1 / ln I) Σ_{i=1}^{I} Y_ij ln Y_ij.    (7)

The normalization method is then used to determine the weights of the above positive indicators. Since there are K classifiers and T categories, there are K × T positive indicators in total, with information entropies E_1, E_2, ..., E_KT calculated by Equation (7). The indicator weight is calculated from the information entropy as

ω_kt = (1 − E_j) / Σ_{j=1}^{KT} (1 − E_j),    (8)

where ω_kt is the weight of indicator kt and index j corresponds to the (classifier, category) pair (k, t). Each classifier recognizes data of T categories and thus obtains T weight values, so the weight of classifier k determined by the entropy weight method, denoted ω_k, is calculated by

ω_k = Σ_{t=1}^{T} ω_kt.    (9)

Coefficient of Variation Method.
The coefficient of variation method is based on the degree of variation between the current value of each index in the evaluation index system and its target value; the index weight is determined by calculating the degree of change of each index in the system. The calculation steps are as follows.
As in the entropy weight method in Section 3.2.1, assume there are I evaluation samples, K classifiers, and T categories, giving K × T evaluation indicators.
The evaluation index y_ij is calculated by Equation (5).
After the evaluation index y_ij is calculated, the mean value and standard deviation of each evaluation index are computed, and the coefficient of variation is obtained from them. The coefficient of variation v_j of index j in the matrix is

v_j = σ_j / ȳ_j,    (10)

where σ_j is the standard deviation of evaluation index j and ȳ_j = (1/I) Σ_{i=1}^{I} y_ij is its mean value. Then the coefficients of variation are normalized, and the weight of each evaluation index is

ω_kt = v_j / Σ_{j=1}^{KT} v_j,    (11)

where ω_kt is the weight of indicator kt. The weight ω_k of classifier k determined by the coefficient of variation method is calculated by Equation (9).

CRITIC Method.
The CRITIC method is based on the contrast strength and conflict of each index in the evaluation index system, and it determines the index weight through the objective attributes of the data itself. The calculation steps are as follows. The steps of the entropy weight method in Section 3.2.1 also apply here: the evaluation index y_ij is calculated by Equation (5). Then the amount of information C_j of index j in the matrix is calculated as

C_j = σ_j Σ_{i=1}^{KT} (1 − r_ij),    (12)

where σ_j is the standard deviation of index j and r_ij is the correlation coefficient between index i and index j. The correlation coefficient expresses the correlation between indicators: the stronger the correlation between an indicator and the others, the smaller the conflict, and the smaller the weight assigned to that indicator should be. The weight of each evaluation indicator is

ω_kt = C_j / Σ_{j=1}^{KT} C_j,    (13)

where ω_kt is the weight of indicator kt. The weight ω_k of classifier k can be calculated by Equation (9). In view of the shortcomings of the above methods, the combination weighting method can effectively solve the problem of a single weighting method being too subjective or too objective. Moreover, to our knowledge, the combination weighting method had not previously been used to determine the weight of evidence in the ER rule. Therefore, the combination weighting method is defined in Section 3.3.
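As an illustration, the three objective weighting schemes above can be sketched together in Python. This is a minimal sketch under our own naming; the small epsilon in the entropy term guards against log(0), and columns are assumed to be grouped classifier by classifier:

```python
import numpy as np

def minmax(X):
    """Min-Max standardization of positive indicators (Equation (5))."""
    X = np.asarray(X, dtype=float)
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

def objective_weights(Y, method="entropy"):
    """Indicator weights from a standardized evaluation matrix Y
    (I samples x KT indicators), by the EW, COV, or CRITIC scheme."""
    Y = np.asarray(Y, dtype=float)
    I = Y.shape[0]
    if method == "entropy":
        P = Y / Y.sum(axis=0)                                 # column-normalize
        E = -(P * np.log(P + 1e-12)).sum(axis=0) / np.log(I)  # Equation (7)
        score = 1.0 - E                                       # Equation (8)
    elif method == "cov":
        score = Y.std(axis=0) / Y.mean(axis=0)                # coefficient of variation
    elif method == "critic":
        R = np.corrcoef(Y, rowvar=False)                      # pairwise r_ij
        score = Y.std(axis=0) * (1.0 - R).sum(axis=0)         # contrast x conflict
    return score / score.sum()                                # normalize to weights

def classifier_weights(indicator_w, K, T):
    """Aggregate the KT indicator weights into one weight per classifier."""
    return np.asarray(indicator_w).reshape(K, T).sum(axis=1)
```

Each scheme returns nonnegative indicator weights summing to one; `classifier_weights` then sums each classifier's T category weights into a single ω_k.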

Combination Weighting.
Combination weighting can effectively exploit both the subjective and objective information of attributes, so it is more reasonable for setting the weight of evidence.
This paper uses three combination weighting methods: additive synthesis, multiplicative synthesis, and level difference maximization. After the combined weights are determined, they need to be regularized. The weighting process of the three combination weighting methods is shown in Figure 2.

Research on Additive Synthesis.
The combined weights based on the additive synthesis method mainly depend on the distribution of the subjective weights and the weight coefficients of the objective weights. Commonly used variants include the weighting method that minimizes the deviation between subjective and objective weights; this paper only studies basic weight accumulation. Assuming there are n classifiers, the combination of the subjective weight and objective weight of classifier k can be described as

ω_k = ω⃗_k + ω⃖_k,  k = 1, 2, ..., n.    (14)

The subjective weight ω⃗_k and the objective weight ω⃖_k of classifier k are added to obtain the combined weight ω_k, and it is then judged whether ω_k satisfies 0 ≤ ω_k ≤ 1. If the combined weight ω_k satisfies the condition, it is retained; otherwise, it is regularized.

Research on Multiplicative Synthesis Method.
The multiplicative synthesis method has a multiplier effect: a classifier with higher subjective and objective weights obtains a higher combined weight, and one with smaller weights obtains a lower combined weight. Assuming there are n classifiers, the combination of the subjective weight and objective weight of classifier k can be described as

ω_k = ω⃗_k · ω⃖_k,  k = 1, 2, ..., n.    (15)

The subjective weight ω⃗_k and the objective weight ω⃖_k of classifier k are multiplied to obtain the combined weight ω_k, and it is then judged whether ω_k satisfies 0 ≤ ω_k ≤ 1. If the combined weight ω_k satisfies the condition, it is retained; otherwise, it is regularized.
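The additive and multiplicative syntheses, together with the range check described above, can be sketched as follows (a minimal illustration; the function name is ours):

```python
def combine(subjective, objective, mode="add"):
    """Combine subjective and objective weights by additive or
    multiplicative synthesis; if any combined weight leaves [0, 1],
    the whole set is regularized to sum to 1."""
    if mode == "add":
        w = [s + o for s, o in zip(subjective, objective)]
    else:
        w = [s * o for s, o in zip(subjective, objective)]
    if not all(0.0 <= x <= 1.0 for x in w):
        total = sum(w)
        w = [x / total for x in w]
    return w
```

Note that, following the text, weights inside [0, 1] are kept as they are; whether they should also be regularized is exactly the question studied later in the paper.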

Research on the Level Difference Maximization.
The combination weighting method with level difference maximization takes a single index as the combined unit [29]. First, multiple subjective and objective weights of the classifiers are computed according to different weighting methods. Then, the reasonable value space of the combined weights is determined according to these multiple weights. Finally, the maximum discrimination of the evaluation results is taken as the objective function, and the reasonable value interval of each index weight is used as the constraint condition to establish an optimization model and solve for the combined weights of the evaluation indices. The method is described in detail as follows. Suppose the scheme set is X = (x_1, x_2, ..., x_n) and the attribute set is G = (g_1, g_2, ..., g_m). Then y_ij = g_i(x_j) is the attribute value of scheme x_j under attribute g_i, r_ij is the normalized result of the decision matrix Y = (y_ij)_{m×n}, and the weight vector of the attributes is ω = (ω_1, ω_2, ..., ω_m)^T with ω_i ≥ 0 and Σ_{i=1}^{m} ω_i = 1. Thus, the comprehensive attribute value of scheme x_j can be calculated as

z_j = Σ_{i=1}^{m} ω_i r_ij.    (16)

The total deviation between each decision-making scheme and the other schemes should be as large as possible, so the weight vector ω should be selected to maximize the total deviation of the comprehensive attribute values over all schemes, giving the deviation function

D(ω) = Σ_{j=1}^{n} Σ_{l=1}^{n} (z_j − z_l)².    (17)

Thus, the weight vector of the attributes can be calculated by solving the following single-objective optimization problem:

max D(ω)  s.t.  ω_i^min ≤ ω_i ≤ ω_i^max,  Σ_{i=1}^{m} ω_i = 1,    (18)

where [ω_i^min, ω_i^max] is the reasonable value interval spanned by the subjective and objective weights of index i. After the above steps, the subjective weight and objective weight of classifier k are used to obtain the combined weight by level difference maximization. Finally, it is judged whether the combined weight satisfies its admissible value range.
If the combined weight meets the condition, it will be retained; otherwise, it will be regularized.
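Under our reading of the optimization model above, the method can be sketched with SciPy. This is a sketch, not the authors' implementation: the solver choice (SLSQP via `minimize` with bounds and an equality constraint) and the midpoint starting guess are our assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def level_diff_max_weights(R, lower, upper):
    """Level difference maximization: pick attribute weights inside the
    interval [lower, upper] spanned by the subjective/objective weights so
    that the total squared deviation between the schemes' comprehensive
    attribute values z_j = sum_i w_i * r_ij is maximized."""
    R = np.asarray(R, dtype=float)           # m attributes x n schemes
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)

    def neg_total_deviation(w):
        z = R.T @ w                          # comprehensive value per scheme
        d = z[:, None] - z[None, :]          # pairwise level differences
        return -np.sum(d ** 2)               # maximize -> minimize negative

    w0 = (lower + upper) / 2.0               # midpoint starting guess
    res = minimize(neg_total_deviation, w0,
                   bounds=list(zip(lower, upper)),
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    return res.x
```

Because the objective rewards spread between the schemes' comprehensive scores, the optimizer is typically pushed toward a boundary of the interval spanned by the subjective and objective weights.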

Ensemble Learning Model Based on the ER Rule
The method of setting the weight of evidence is defined in Section 3; the ensemble learning model based on the ER rule is built in this section. When using the ER rule to combine classifiers in ensemble learning, each classifier obtains its classification results on the dataset, the classifiers are treated as pieces of evidence in the ER rule, and the weight and reliability of each piece of evidence are calculated. Decisions are then made through the ER rule, and the classification results are computed. This process is shown in Figure 3.

Setting the Reliability of Evidence.
The reliability of evidence is an inherent characteristic of evidence, reflecting the ability of the evidence to provide correct assessments or solutions to hypotheses [30]. Based on this definition, when using the ER rule for ensemble learning, each classifier in the process is regarded as an independent piece of evidence, and the classification accuracy of each classifier on the dataset measures its ability to correctly evaluate the samples. Therefore, the probability that a given classifier classifies the dataset correctly is calculated by statistics and taken as its reliability, which is recorded as r_k.
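As a small illustration (function name ours), the reliability of a classifier is simply its accuracy on the dataset:

```python
import numpy as np

def reliability(predictions, labels):
    """Reliability r_k of classifier k: the fraction of samples it
    classifies correctly (its accuracy on the dataset)."""
    predictions, labels = np.asarray(predictions), np.asarray(labels)
    return float((predictions == labels).mean())
```

For example, a classifier that labels three of four samples correctly has r_k = 0.75.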

Setting the Weight of Evidence.
The weight of evidence is usually set by a subjective or objective weighting method; the specific steps of these weighting methods are described in Section 3. This paper combines the two and proposes to use the combination weighting method to set the weights of evidence. The process of assigning the weight of evidence is shown in Figure 4. The AHP method from the subjective weighting methods is combined with the EW, COV, and CRITIC methods from the objective weighting methods, and the weighting results are compared.

Evidential Reasoning Process.
Assuming that each classifier in the ensemble process is an independent piece of evidence, there are K pieces of evidence in total. Each category is taken as an evaluation grade, and the probability of a classifier's judgement on the sample category is taken as the belief degree of the corresponding evaluation grade. The belief distribution of each piece of evidence can be expressed as

e_k = {(θ, p_θ,k), ∀θ ⊆ Θ; (Θ, p_Θ,k)},    (19)

where θ is the evaluation grade, p_θ,k is the belief degree that the evaluated scheme is assessed to grade θ under evidence e_k, Θ is the frame of discernment including all evaluation grades, p_Θ,k is the belief degree of indicator k relative to the frame of discernment Θ, namely, global ignorance, and p_θ,k satisfies 0 ≤ p_θ,k ≤ 1 and Σ_{θ∈Θ} p_θ,k ≤ 1. The reliability r_k and weight ω_k of the evidence have been determined in Section 4.1 and Section 3, respectively. The weighted belief distribution of evidence k with reliability is

m_k = {(θ, m_θ,k), ∀θ ⊆ Θ; (P(Θ), m_P(Θ),k)},    (20)

where P(Θ) is the power set and m_θ,k is the hybrid probability mass of indicator k at grade θ, satisfying Σ_{θ⊆Θ} m_θ,k + m_P(Θ),k = 1, with

m_θ,k = 0 for θ = ∅;  m_θ,k = c_rw,k ω_k p_θ,k for θ ⊆ Θ, θ ≠ ∅;  m_P(Θ),k = c_rw,k (1 − r_k),    (21)

where c_rw,k = 1/(1 + ω_k − r_k) is the normalization coefficient, m_θ,k is the basic probability mass of indicator k at grade θ, and ∅ is the empty set. Each classifier serves as an independent piece of evidence, and there are K pieces of evidence, so for b pieces of evidence the combined belief p_θ,e(b) of θ is determined recursively by

p_θ,e(b) = m̂_θ,e(b) / Σ_{D⊆Θ, D≠∅} m̂_D,e(b),
m̂_θ,e(b) = [(1 − r_b) m_θ,e(b−1) + m_P(Θ),e(b−1) m_θ,b] + Σ_{B∩C=θ} m_B,e(b−1) m_C,b,  θ ⊆ Θ, θ ≠ ∅,
m̂_P(Θ),e(b) = (1 − r_b) m_P(Θ),e(b−1),    (22)

where m̂_θ,e(b) is the degree of support for hypothesis θ after fusing b pieces of independent evidence, b = 1, ..., K, m_θ,e(1) = m_θ,1, and m_P(Θ),e(1) = m_P(Θ),1.
Supposing the utility of evaluation grade θ is u(θ), the expected utility u of the final model is

u = Σ_{θ∈Θ} u(θ) p_θ,e(K).    (23)

Comparing the expected utility u with the utilities u(θ) of the evaluation grades, if u = u(θ), the result of the integration using the ER rule is category θ.
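Under our reading of the recursion above, and assuming the classifiers assign belief only to singleton categories (no global ignorance in their outputs), the fusion and the expected-utility decision can be sketched as follows (function names ours, not the authors' code):

```python
import numpy as np

def er_fuse(beliefs, weights, reliabilities):
    """Recursive ER-rule fusion of K classifiers over T categories.
    beliefs: K x T array, row k holding classifier k's class probabilities;
    returns the fused singleton beliefs. Sketch under the assumption of
    singleton focal elements only."""
    beliefs = np.asarray(beliefs, dtype=float)
    w = np.asarray(weights, dtype=float)
    r = np.asarray(reliabilities, dtype=float)
    K, _ = beliefs.shape
    crw = 1.0 / (1.0 + w - r)                   # normalization coefficient
    m = crw[:, None] * w[:, None] * beliefs     # weighted singleton masses
    mP = crw * (1.0 - r)                        # residual mass on P(Theta)
    m_e, mP_e = m[0], mP[0]                     # initialize with evidence 1
    for b in range(1, K):                       # fuse evidence 2..K in turn
        m_hat = (1.0 - r[b]) * m_e + mP_e * m[b] + m_e * m[b]
        mP_hat = (1.0 - r[b]) * mP_e
        s = m_hat.sum() + mP_hat                # renormalize at each step
        m_e, mP_e = m_hat / s, mP_hat / s
    return m_e / m_e.sum()                      # fused combined belief

def expected_utility(beliefs, weights, reliabilities, utilities):
    """Expected utility of the fused result: sum of u(theta) * p_theta."""
    p = er_fuse(beliefs, weights, reliabilities)
    return float(np.dot(utilities, p))
```

For instance, two classifiers that both favor the first of two categories yield a fused belief concentrated on that category, and the expected utility lies between the grade utilities.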

Case Study
There are five datasets used in this study: part of a large fish dataset provided by the Kaggle platform, part of a flower recognition dataset, part of a Fruits 360 dataset, Stanford's dog dataset, and part of the dataset from the public weather data package provided by the Baidu AI Studio platform. The specific parameter information of these five datasets is shown in Table 1.
The study steps are as follows: (1) First, we use five commonly used deep learning algorithms as classifiers, i.e., DenseNet121, MobileNetV2, InceptionV3, EfficientNet, and ResNet152V2, and use 60% of the samples in the dataset as the training set and all the samples as the test set for classification prediction.
Three deep learning algorithms are randomly selected for integration from the five deep learning algorithms, with the number of samples as rows and the combinations of categories and deep learning algorithms as columns, forming a matrix of rowNum × 12, where rowNum is the total number of samples in the corresponding dataset.
(2) Second, we calculate the accuracy of each classifier's sample predictions as the reliability of the classifier. Taking the probability matrix as the original evaluation matrix, the subjective weights and the objective weights are calculated by the AHP, EW, COV, and CRITIC methods. These weights are combined by addition, multiplication, and level difference maximization to obtain the combined weight of each classifier. (3) Third, we take each classifier as a piece of evidence, the categories as the evaluation grades of the evidence, and the probabilities predicted by the different classifiers as the belief distributions. The ER rule is used to integrate the classifiers, the calculated expected utility value is compared with the evaluation grades, and the ER rule's judgement of the image category is obtained. (4) Fourth, Spearman correlation analysis is used for statistical analysis of the results to determine whether there is a significant difference in performance between the combination weighting method and the other weighting methods.
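Step (4) can be sketched with SciPy's rank correlation; the accuracy figures below are illustrative placeholders, not the paper's reported results:

```python
from scipy.stats import spearmanr

# Hypothetical ensemble accuracies (%) of two weighting schemes over
# five classifier combinations; the numbers are illustrative only.
combination_weighting = [96.55, 94.90, 96.05, 96.28, 95.90]
single_weighting = [96.10, 94.75, 95.80, 95.95, 95.40]

rho, p_value = spearmanr(combination_weighting, single_weighting)
# A p_value above the chosen significance level would indicate no
# significant difference in rank order between the two schemes.
```

Spearman's coefficient compares the rank orders of the two result series, which suits accuracy comparisons across heterogeneous classifier combinations.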
In this paper, the abbreviations D, M, I, E, and R refer to five classifiers DenseNet121, MobileNetV2, InceptionV3, EfficientNet, and ResNet152V2, respectively. Five classifier combinations, DIE, DIR, IER, MIE, and MIR, are used as study objects.

Study Based on Large-Scale Fish Dataset.
The dataset used in this study is part of the large fish dataset on the Kaggle platform. For each picture in the dataset, five classifiers are combined with the ER rule for ensemble learning, subjective and objective weighting methods are used to calculate and combine the weights, weight regularization is considered, and the final accuracies are compared. The comparison results are shown in Table 2 and Figure 5. In Table 2, "+" denotes the additive synthesis of subjective and objective weights, "(+)" denotes regularizing the weights after additive synthesis, "*" denotes the multiplicative synthesis method, "(*)" denotes regularizing the weights after multiplicative synthesis, and "(rg)" denotes weights obtained by the level difference maximization method. As shown in Table 2, the different methods used to determine weights have their own advantages and disadvantages in an ensemble. For example, when the three classifiers of DIR are combined, the weights determined by the AHP method are the best, and the accuracy reaches 94.75%.

As shown in
Among the three combinations DIE, IER, and MIE, the weights determined by the multiplicative synthesis method are the best, with the highest accuracies of the three combinations reaching 96.55%, 96.05%, and 96.28%, respectively. For these five combinations, the weights determined by the level difference maximization method are generally better than those determined by subjective weighting or objective weighting alone. In most cases, combination weighting is significantly better. The bold values represent the maximum accuracy in each column of data. As shown in Figure 5, in the MIR combination, the weights determined by level difference maximization are generally better than those determined by subjective or objective weighting alone. After the weights determined by the additive synthesis method and the multiplicative synthesis method are regularized, all results are better than those without regularization, with accuracy approximately 0.1% higher. For the other four combinations, regularizing the weights determined by the additive synthesis method is better than not regularizing them, improving accuracy by approximately 0.1%-0.3%. In addition, the weights determined by the level difference maximization method achieve accuracy close to that of the regularized additive synthesis weights.

Study Based on Weather Dataset.
The dataset used in this study is part of the weather dataset on the Kaggle platform. For the images in the dataset, the five classifiers and the ER rule are combined for ensemble learning; subjective, objective, and combination weighting methods are used to calculate the weights; and the ER rule ensemble accuracies of the different weight determination methods are compared. The comparison results are shown in Figure 6.
As shown in Figure 6, for the four combinations DIE, DIR, IER, and MIR, the multiplicative synthesis method determines the best weights, with accuracies of 97.30%-98.74%. The results after regularization are better than those without regularization, with accuracy approximately 0.03%-0.11% higher. Occasionally the unregularized weights determined by the multiplicative synthesis method perform better than the regularized ones, but in general, regularization still outperforms nonregularization.

Study Based on the Flower Dataset.
The dataset used in this study is part of the flower dataset on the Kaggle platform, and its content is pictures of various types of flowers. The five classifiers and the ER rule are combined for ensemble learning, and the ER rule ensemble accuracies of the different weight determination methods are compared. The comparison results are shown in Figure 7.
As shown in Figure 7, under these five combinations, the weights determined by the level difference maximization method are generally better than those of the other methods. The DIE combination achieves its best weights under the AHP method, with an accuracy of 83.76%. In addition, under this combination, the regularized weights outperform the unregularized ones, increasing accuracy by approximately 0.1%. The MIE combination performs best under the CRITIC method, with an accuracy of 85.20%. The results after weight regularization are better than those without regularization, with accuracy approximately 0.03%-0.3% higher.
Under the MIR combination, the accuracy of the regularized weights determined by additive synthesis and multiplicative synthesis is approximately 0.05%-0.31% higher than that of the unregularized weights.

Study Based on the Fruit Dataset.
The dataset used in this study is part of the fruit 360 dataset on the Kaggle platform, and its content is pictures of different types of apples. For each picture, the five classifiers and the ER rule are combined for ensemble learning, and the ER rule ensemble accuracies of the different weight determination methods are compared. The comparison results are shown in Figure 8.
As shown in Figure 8, for the DIR combination, the weights determined by the level difference maximization method are generally better than those of the other methods; the regularized weights determined by the additive synthesis method are generally better than the unregularized ones, improving accuracy by approximately 0.06%-0.15%; and the accuracy of the regularized weights determined by the multiplicative synthesis method is approximately 0.15%-0.36% higher than that of the unregularized weights. In the IER combination, weight regularization is generally better than nonregularization. However, for the combinations of DIE and MIE, there are three break points in Figure 8: the weights obtained by additively combining the subjective and objective weights are greater than 1, which exceeds the admissible range of the weight of evidence in the ER rule, so the ensemble accuracy cannot be calculated. For the other points, regularizing the weights is generally better than not regularizing them.

Study Based on the Dog Dataset.
The dataset used in this study is part of the Stanford dog dataset on the Kaggle platform, and its content is pictures of different breeds of dogs. For each picture, the five classifiers and the ER rule are combined for ensemble learning, and the ER rule ensemble accuracies of the different weight determination methods are compared. The comparison results are shown in Table 3.
As shown in Table 3, for the DIE combination, the weights determined by the level difference maximization method are the best. Under the IER combination, the EW method, the COV method, and the EW(rg)AHP method determine the best weights. In the MIE combination, the AHP method and the CRITIC(+)AHP method determine the best weights. In the MIR combination, the COV method determines the best weights. However, for the four combinations DIE, IER, MIE, and MIR, the weights determined by the additive synthesis method exceed 1, which does not satisfy the range of the weight of evidence in the ER rule, so the ensemble accuracy cannot be calculated and is replaced with "#". In the DIR combination, when the AHP method is used to determine the subjective weights, the consistency test coefficient CR is greater than 0.1, which fails the consistency test; the subjective weights therefore cannot be calculated, nor can the subsequent combined weights, so the entries are replaced with "&". It can be concluded that the weights determined by the combination weighting method are not necessarily better than those determined by the subjective or objective weighting method alone in all cases. The appropriate weighting method should be chosen according to the actual situation.
The bold values represent the maximum accuracy in each column of data.
According to the above cases, we can draw the following conclusions: (1) First, in most cases, the combined weights determined by level difference maximization perform better than the individual subjective and objective weights. The additive synthesis method and the multiplicative synthesis method also perform well. Compared with the individual subjective and objective weighting methods, the combination weighting method significantly improves ensemble learning based on the ER rule, raising accuracy by approximately 0.1%-0.4% in general. Therefore, using the combination weighting method to set the weights of evidence for the ER rule yields better results. Although the combination weighting method is effective, the additive and multiplicative synthesis methods lack strict theoretical justification, so the more theoretically grounded level difference maximization method is recommended for combining the weights. The level difference maximization method is therefore used as the combination weighting method in the statistical analysis and comparative studies of Subsections 5.2 and 5.3. (2) Second, regularizing the weights generally improves ensemble learning compared with not regularizing them. Analysis of the level difference maximization method, together with the supporting evidence from the additive and multiplicative synthesis methods, shows that regularized weights improve ensemble learning accuracy by approximately 0.01%-0.3% over unregularized weights. Regularizing the weights also prevents the combined weights from exceeding the admissible range of the weight of evidence in the ER rule. In summary, the weights of evidence need to be regularized.
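Conclusion (2) can be made concrete with a small sketch. The paper's exact regularization scheme is not reproduced in this excerpt, so the scheme below (max-scaling into (0, 1]) is an illustrative assumption; it shows how regularization keeps a combined weight from exceeding the admissible range of the weight of evidence in the ER rule.

```python
import numpy as np

def regularize_weights(w):
    """Scale a combined weight vector into (0, 1], the admissible range
    for evidence weights in the ER rule.

    Max-scaling is one common choice; the paper's own regularization
    scheme may differ in detail."""
    w = np.asarray(w, dtype=float)
    return w / w.max()
```

For example, an additively combined vector such as [1.2, 0.6, 0.3], whose largest entry exceeds 1 and would make the ER combination undefined, is rescaled to [1.0, 0.5, 0.25] while preserving the ratios between the weights.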

Statistical Analysis.
To show the differences in performance between the combination weighting method and the other weighting methods and to verify the universality and effectiveness of the combination weighting method, we performed a statistical analysis of the different weighting methods. The EW, COV, and CRITIC methods are tested for significance against the combination weighting methods EW(rg)AHP, COV(rg)AHP, and CRITIC(rg)AHP, denoted as "EW/EW(rg)AHP", "COV/COV(rg)AHP", and "CRITIC/CRITIC(rg)AHP", respectively. The AHP method and the best combination weighting method are also tested for significance, denoted as "max(rg)".
First, at the significance level α = 0.05, statistical analysis is performed for each pair of methods, and the Spearman rank correlation coefficient is calculated. Second, the Spearman rank correlation coefficient (RS) [31] is compared with the Spearman rank correlation critical value (CV); if RS > CV, the difference in performance between the combination weighting method and the other method is statistically significant. In the DIR combination on the dog dataset, the consistency test coefficient CR of the AHP method is greater than 0.1 and the weights cannot be determined, so that combination is not analysed. Therefore, the number of combinations for this dataset is n = 4, giving CV = 1; the number of combinations for the other datasets is n = 5, giving CV = 0.9. The comparison of significance levels is shown in Figure 9. As shown in Figure 9, all RS values are higher than CV on the weather, flower, and fruit datasets; only the RS value of max(rg) is lower than CV on the fish dataset; and the RS values of EW/EW(rg)AHP and max(rg) are lower than CV on the dog dataset. This indicates that the differences in performance between the combination weighting method and the other methods are statistically significant.
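The RS computation can be sketched in a few lines. The accuracy values below are illustrative placeholders, not figures from the paper's tables; the formula is the standard tie-free Spearman rank correlation, compared against the critical value CV = 0.9 for n = 5.

```python
def spearman_rs(x, y):
    """Spearman rank correlation RS = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)),
    assuming no tied values (the accuracies compared here are distinct)."""
    n = len(x)
    rank = lambda v: {val: i + 1 for i, val in enumerate(sorted(v))}
    rx, ry = rank(x), rank(y)
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Illustrative accuracies (%) over the five combinations DIE, DIR, IER, MIE, MIR
acc_ew        = [96.1, 94.5, 95.8, 96.0, 95.2]   # EW method (placeholder values)
acc_ew_rg_ahp = [96.4, 94.8, 96.1, 96.3, 95.5]   # EW(rg)AHP (placeholder values)

rs = spearman_rs(acc_ew, acc_ew_rg_ahp)
cv = 0.9                  # Spearman critical value for n = 5, alpha = 0.05
significant = rs > cv     # here the two rankings agree exactly, so rs = 1.0
```

Because the two placeholder series rank the five combinations identically, RS reaches its maximum of 1.0 and exceeds CV, illustrating the RS > CV decision rule used in the paper.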
To further judge whether there are significant differences in performance between the different methods, we also calculated p values. The p values are compared with the significance level; if p < 0.05, the difference in performance is statistically significant. The comparison of p values is shown in Table 4.
As shown in Table 4, most p values are less than 0.05; only on the dog dataset are the p values greater than 0.05. Therefore, the differences in performance between the combination weighting method and the other weighting methods are statistically significant.

Comparative Studies.
To demonstrate the effectiveness of the combination weighting method proposed in this paper, comparative studies are conducted in this subsection. The combination weighting method not only considers the subjective intention of the decision-makers but also incorporates the mathematical basis of objective weighting, so more reasonable weights of evidence can be determined. To further illustrate the universality and effectiveness of the method, the order relation analysis method (G1) and the entropy weight method based on the rank sum ratio (RSR-EW) are selected as comparison methods [6, 12]. The level difference maximization methods EW(rg)AHP, COV(rg)AHP, and CRITIC(rg)AHP are selected from the combination weighting methods. The accuracy of the ensemble learning model using level difference maximization is compared with that of the model using the comparison methods. The results are shown in Table 5. As shown in Table 5, the accuracy of the ensemble learning model using the level difference maximization method is higher than that of the model using the comparison methods in most cases; only the comparison method G1 is better on the fruit dataset. Therefore, it is reasonable and effective to use the combination weighting method to determine the weight of evidence, which sets reasonable evidence weights for the ensemble learning model based on the ER rule and thereby significantly improves the accuracy of the model.

Conclusion
In the field of ensemble learning based on the ER rule, problems include the small number of classifiers in ensemble learning, the difficulty of judging classifier quality from expert experience, objective weighting methods that depend too heavily on the samples, and the difficulty of setting reasonable weights of evidence to improve the accuracy of ensemble learning. The combination weighting method is proposed to solve these problems. This method considers both the subjective judgement of expert experience and the objective information of the sample data, which overcomes the excessive subjectivity or objectivity of the weight of evidence to a certain extent, making the weight of evidence more reasonable and significantly improving the recognition accuracy of the ensemble learning model. On this basis, the weights of evidence are regularized, which further improves the performance of the ensemble learning model based on the ER rule.
There are three innovations in this paper. First, in view of the difficulty of setting reasonable weights of evidence with traditional weighting methods, the combination weighting method is proposed to set more reasonable weights of evidence for the ER rule. Second, considering the lack of labels for the ER rule in the information fusion process, the weights of evidence are regularized. Last, with the weights of evidence set by the combination weighting method, an ensemble learning model based on the ER rule is constructed to improve the integration performance.
Based on the above research, future work will include the following aspects: (1) In terms of execution time, the subjective weighting method takes approximately 0.0003 s-0.002 s, the objective weighting method approximately 0.03 s-0.08 s, and the combination weighting method approximately 0.18 s-0.32 s, so the execution time of the combination weighting method is longer. Therefore, how to reduce the execution time while preserving the effect is a focus of future work. (2) This paper only studies combinations of several classic subjective and objective weighting methods; more combination weighting schemes, such as those based on game theory, could be explored.

Data Availability
The study data in this paper come from the image classification datasets of the Kaggle platform and the Baidu AI Studio platform. For the source URLs of the datasets, please visit: (1) Large-scale Fish dataset: https://www.kaggle.com/crowww/a-large-scale-fish-dataset. (

Authors' Contributions
YunYi Zhang and Cong Xu contributed equally to this work.