An Effective Soft-Sensor Method Based on Belief-Rule-Base and Differential Evolution for Tipping Paper Permeability Measurement



Introduction
Cigarette smoke contains tar, nicotine, and other harmful substances. Research on technologies for reducing the tar and harmful components of cigarette smoke has been an important research and application field in the tobacco industry since the 1970s. Many countries now impose strict limits on the content of tar, nicotine, and CO in commercial cigarettes. Existing tar reduction techniques include the use of low-tar tobacco, the improvement of the tobacco formula and cigarette structure, the addition of expanded tobacco, and the application of tipping paper. Since the perforations on tipping paper help introduce air into cigarette smoke to dilute tar, wrapping the cigarette filter in perforated tipping paper has become one of the most common methods for reducing tar. Because the air permeability of tipping paper is directly related to the dilution of smoke and greatly influences the cigarette taste, the tobacco industry must strictly control the tipping paper permeability [1,2]. As people's requirements for health and cigarette quality keep increasing, permeability measuring technology is becoming more and more important and valuable.
The tipping paper permeability is the amount of air per unit of time that flows perpendicularly through a given area of tipping paper under a certain pressure condition; its unit is CU (1 CU = 1 cm³/(min ⋅ cm² ⋅ kPa)). A representative of existing detection devices is BASTAN Company's online air permeability tester. It utilizes the conversion of the photoelectric signal to obtain a measurement value of permeability. Nevertheless, the existing online testing equipment is very expensive and complicated to repair, which has become a burden on enterprises. Therefore, the development of a more economical, simple, and effective detection system is necessary.
Recently, with the technological advances in CCD sensors and integrated semiconductors, industrial machine vision cameras can deliver great features and functionality at outstanding price-performance ratios. In fact, the sizes of the perforations on tipping paper directly affect the permeability. In addition, the surface roughness and the thickness of tipping paper also have some influence on the permeability. These factors can be utilized to build a model for measuring it. Figure 1 shows a cigarette with a wrapping of perforated tipping paper on its filter, where the perforations have small diameters (30-90 μm). The gray image of the corresponding tipping paper is given in Figure 2. The light points in this image are the perforations produced by a laser-beam drilling machine. The area of the perforations can be calculated by using image processing techniques, and the surface roughness as well as the thickness of tipping paper is reflected by the gray value of the tipping paper's image. Thus, we are motivated to design a permeability measuring device based on a machine vision system. That is, an industrial camera with a microlens can be utilized to capture the image of tipping paper with microperforations in a certain small region, and a soft-sensor model can be built between the intermediate variables and the tipping paper permeability. Since a high-quality camera and other hardware are easy to buy directly from manufacturers, the key issue for this new device is how to develop an effective soft-sensor method.
A belief rule base (BRB) is in nature a kind of expert system, which can effectively use various types of information to establish a nonlinear model between input and output. The belief-rule-base concept and its inference methodology were proposed by Yang et al. [3] and are based on the evidential reasoning (ER) approach [4]. Compared with traditional rule-based systems, a BRB system can not only analyze decision problems by handling both quantitative data and qualitative information with uncertainties but can also capture more complicated nonlinear causal relationships between antecedent attributes and consequents. By now, it has been successfully used in the safety analysis of offshore systems [5,6], leakage detection of oil pipelines [7,8], product life assessment [4], evaluation of gastric cancer [9][10][11], hidden behavior prediction [12], safety assessment for complex systems [13,14], and information fusion [15]. Given BRB's wide applications in a variety of fields, the current paper presents a novel soft-sensor method based on BRB for measuring the permeability of tipping paper.
In the traditional methods, BRB systems are built by the decision maker according to experience and knowledge or another original model. However, for complex systems, the traditional methods are often unable to establish an accurate analytical model. A BRB represents functional mappings between inputs and outputs. Compared with the traditional IF-THEN rule, it provides a more informative and realistic knowledge expression. In BRB, the input data, the belief structure, the attribute weights, the rule weights, the activation weights, and all the activated belief rules are combined to generate the corresponding outputs. Thus, these BRB parameters are important factors affecting the performance of the BRB system. In the traditional methods, the experts set the parameters of BRB according to their experience and knowledge. However, because of the complexity and uncertainty of many real decision-making problems, parameters set in this way often fail to yield an accurate model, which limits the ability of BRB to simulate the actual system. Therefore, in order to improve the modeling precision of a complex system, it is necessary to train and optimize the parameters of BRB.
The parameter optimization or training for BRB is a nonlinear constrained optimization problem (NCOP), whose objective function is a complex nonlinear formula. From equations (4)-(11), it can be seen that the objective function is a multivariate compound one with strong nonconvexity. Since there is still no general method for obtaining the global optimum of an NCOP, the NCOP of BRB is usually solved by using an effective gradient-based solver (namely, fmincon) in MATLAB. Yang et al. [16] first proposed the generic BRB learning model and a parameter optimization method using fmincon. An essential feature of fmincon is its sensitive dependence on initial conditions, which makes it applicable to small-scale optimization problems but unsuitable for complex ones. After that, Zhou et al. [17] proposed an online update algorithm based on the expectation maximization estimation algorithm, but the effectiveness of the algorithm is related to the completeness of the data and the set of referenced values. Subsequently, Chen et al. [18] added the premise attribute as a new training parameter and used fmincon to optimize the system. Chang and Zhang [19] developed a self-learning parameter training algorithm based on the optimization step and the gradient method. The accuracy of BRB is improved, but the algorithm suffers from high implementation complexity and slow convergence. Chang et al. [20] first used the standard differential evolution algorithm to train the parameter learning model, which can effectively improve the efficiency and precision of BRB. However, all the above papers focus on the traditional case of BRB with predetermined referenced values of the input and output attributes.
From an optimization point of view, the more adjustable parameters or decision variables the BRB model has, the larger its feasible solution region is. That is, a BRB model with more decision variables helps optimization methods find a more reasonable model. Moreover, in many real-life applications, not all the expert knowledge is explicit, which means that overusing it may decrease the accuracy of BRB. Thus, how to model an effective BRB with less expert knowledge or fewer predetermined parameters is an important issue. Therefore, this paper adds the referenced values of the input and output attributes to the optimization parameters of the BRB system. This new BRB model is more flexible than the traditional one, and the latter is just a special case of the former.

Traditional mathematical approaches are usually applied to optimize the parameters of BRB; they are guaranteed to converge only under certain convexity assumptions and are sensitive to the initial solution. Unfortunately, the BRB model is nonconvex and has more than one local optimal solution. That is to say, mathematical approaches are limited to solving its simple and small-scale cases. When the referenced values of both input and output attributes are added as optimization parameters, the NCOP of BRB becomes more complex and its feasible solution region is enlarged. Under such circumstances, it is difficult for the traditional approaches to obtain high-quality solutions within a reasonable time. Recently, some intelligent algorithms have been proposed to deal with different complex optimization problems, such as the differential evolution (DE) algorithm and the estimation of distribution algorithm (EDA) for flow shop scheduling problems [21][22][23][24], the particle swarm optimization (PSO) algorithm for multiobjective optimization problems [25], DE and PSO for the parametric identification problem [26], and the evolutionary algorithm (EC) and genetic algorithm (GA) for interval multiobjective optimization problems [27,28]. These algorithms utilize their own evolutionary mechanisms to perform an effective search in a complex solution space and obtain very satisfactory results. Among them, DE is one of the most successful evolutionary optimization techniques; it uses a simple operator to create new candidate solutions and a one-to-one competition scheme to select the new candidate [29]. Although DE has some advantages in finding the global or near-global optimum, its local search ability is relatively weak, and its search process may sometimes be trapped in local minima [30]. The NM simplex method (NMS) was first introduced by Nelder and Mead [31] and is one of the most widely used local direct search methods for nonlinear unconstrained optimization. NMS is a derivative-free direct search method that was specially designed for solving traditional unconstrained minimization problems, such as nonlinear least-squares problems, nonlinear simultaneous equations, and function minimization. It has been successfully applied to low-dimensional unconstrained optimization problems [32]. Thus, in this paper, an enhanced DE is proposed for the NCOP of the extended BRB. The contributions of this paper can be outlined in the following aspects: (…)

The structure of this paper is as follows. In the second part, BRB is described. The third part introduces the optimization problem and the proposed MNMSDE algorithm. The fourth part is the experimental analysis. Finally, the work in this paper is summarized in the last part.

Problem Description
2.1. Belief Rule Base. A BRB consists of a series of belief rules, and the expression of the kth rule is as follows:

$$R_k:\ \text{IF } x_1 \text{ is } A_1^k \wedge x_2 \text{ is } A_2^k \wedge \cdots \wedge x_M \text{ is } A_M^k,\ \text{THEN } \{(D_1, \beta_{1,k}), (D_2, \beta_{2,k}), \ldots, (D_N, \beta_{N,k})\},$$

with a rule weight $\theta_k$ and attribute weights $\delta_1, \delta_2, \ldots, \delta_M$, where $R_k$ ($k = 1, 2, \ldots, L$) denotes the kth rule of the BRB. Each rule is provided with a rule weight $\theta_k$, which indicates its importance relative to the other rules. $A_i^k$ ($i = 1, 2, \ldots, M$) represents the referenced value of the prerequisite attribute $x_i$ in the kth rule. $M$ represents the number of prerequisite attributes. $L$ represents the number of belief rules in the BRB. $\delta_i$ ($i = 1, 2, \ldots, M$) represents the weight of $x_i$, which reflects the importance of $x_i$ relative to the other premises. $D_j$ ($j = 1, 2, \ldots, N$) is the referenced value of the output in the kth rule. $\beta_{j,k}$ ($j = 1, 2, \ldots, N$; $k = 1, 2, \ldots, L$) is the belief degree assessed to $D_j$ in the kth rule.

2.2. The Reasoning Mechanism of BRB. The belief-rule-base inference methodology is based on ER. ER can be directly applied to combine the activated belief rules and generate final conclusions as follows. First, transform the inputs into the probability masses corresponding to the referenced values and generate the activation weights of the rules. Then, combine all activated rules by ER to generate the combined degree of belief in each possible consequent $D_j$. Finally, the output $S(x)$ is generated by aggregating all $D_j$ and $\beta_{j,k}$, which can be represented as $S(x) = \{(D_j, \beta_j),\ j = 1, 2, \ldots, N\}$. The detailed reasoning process is as follows.
(1) Transform Input. Using the information transformation technique, the matching degrees of the inputs relative to each reference level can be described as $S(x_i) = \{(A_{i,j}, \alpha_{i,j}),\ j = 1, 2, \ldots, J_i\}$, where $\alpha_{i,j}$ is the belief degree assessed to input $x_i$ with respect to its jth referenced value. The matching degrees are calculated by formulas (4)-(6).

(2) Calculate the Activation Weight. $w_k$ is the activation weight of $A^k$, which measures the degree to which the kth rule is weighted and activated. It is calculated by equation (7), in which $\theta_k$ ($k = 1, 2, \ldots, L$) is the relative weight of the kth rule and $\delta_i$ ($i = 1, 2, \ldots, M$) represents the ith antecedent attribute weight of the kth rule. If $w_k > 0$, the kth rule is activated. $\delta_{i,k}$ is generally set to 1.

(3) Aggregate the Belief Rules Using the Evidential Reasoning Approach. ER is used to combine all the rules and obtain the output $S(x)$, where $w_k$ is calculated by equation (7) and $\beta_j$ is the belief degree relative to the evaluation result $D_j$.

(4) Calculate the Expected Utility. Assuming the utility of the evaluation result $D_j$ is $u(D_j)$, the expected utility of $S(x)$ can be calculated by the following formula:

$$u(S(x)) = \sum_{j=1}^{N} u(D_j)\,\beta_j.$$
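The four inference steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: equations (4)-(8) are not reproduced in this text, so the sketch assumes the piecewise-linear input transformation, the normalized-product activation weight, and the analytical ER aggregation commonly used in the BRB literature. It also assumes that the rules enumerate the Cartesian product of the referenced values. All function names are hypothetical.

```python
import numpy as np
from itertools import product

def matching_degrees(x, refs):
    """Assumed rule: split the belief of scalar x between its two neighbouring
    referenced values (refs must be strictly increasing)."""
    refs = np.asarray(refs, dtype=float)
    alpha = np.zeros(len(refs))
    x = min(max(x, refs[0]), refs[-1])  # clip x into the referenced range
    j = min(np.searchsorted(refs, x, side="right") - 1, len(refs) - 2)
    alpha[j] = (refs[j + 1] - x) / (refs[j + 1] - refs[j])
    alpha[j + 1] = 1.0 - alpha[j]
    return alpha

def activation_weights(alphas, theta, delta):
    """Assumed rule: w_k is proportional to theta_k * prod_i alpha_i^(delta_i / max delta)."""
    delta_bar = np.asarray(delta, dtype=float) / np.max(delta)
    combos = product(*[range(len(a)) for a in alphas])  # one rule per combination
    match = np.array([
        np.prod([alphas[i][j] ** delta_bar[i] for i, j in enumerate(c)])
        for c in combos
    ])
    w = np.asarray(theta, dtype=float) * match
    return w / w.sum()

def er_aggregate(w, beta):
    """Analytical ER combination of L rules with belief matrix beta of shape (L, N)."""
    w = np.asarray(w, dtype=float)[:, None]
    beta = np.asarray(beta, dtype=float)
    row_sum = beta.sum(axis=1, keepdims=True)
    a = np.prod(w * beta + 1.0 - w * row_sum, axis=0)  # one product per consequent D_j
    b = np.prod(1.0 - w * row_sum)
    c = np.prod(1.0 - w)
    mu = 1.0 / (a.sum() - (beta.shape[1] - 1) * b)
    return mu * (a - b) / (1.0 - mu * c)

def expected_utility(beta_combined, utilities):
    """Belief-weighted sum of the utilities u(D_j)."""
    return float(np.dot(beta_combined, utilities))
```

With two inputs of five referenced values each, `activation_weights` returns 25 weights, one per rule, which matches the rule base size used later in the experiments.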

Optimization Model and Algorithm
3.1. Optimization Model of BRB. The elements of the parameter vector $P$ are the adjustable parameters in BRB. The objective of the training is to minimize $\xi(P)$ by adjusting the parameters $P$. $\delta_{k,j}$ is 1 in this paper. The processes of BRB system parameter optimization are shown in Figure 3. The optimization model for training the BRB parameters minimizes $\xi(P)$ subject to the constraints on $P$ given below.

In order to obtain a more stable and effective BRB, some related optimization methods have been put forward [5,8,16], mainly for the optimization of parameters and structures. Previous studies have taken the optimization of the belief degrees, the prerequisite attribute weights, and the rule weights into account, but the referenced values are usually given by experts according to their experience. Each belief rule is obtained by adding the belief degrees, the prerequisite attribute weights, and the rule weights to a traditional IF-THEN rule, whose parameters are largely influenced by their corresponding referenced values. This means that reasonable referenced values are conducive to building a more accurate BRB model. Therefore, in this paper, we consider adding all referenced values to the parameters to be optimized.
Since the hole area and the average gray value in an image largely determine the permeability of perforated tipping paper, we choose these two factors as inputs and the air permeability as output to establish a BRB. Denote $A_1 = \{A_{1,p_1}, A_{1,p_2}, \ldots, A_{1,J_1}\}$ as the reference levels of the hole area, $A_2 = \{A_{2,q_1}, A_{2,q_2}, \ldots, A_{2,J_2}\}$ as the reference levels of the gray value, and $D = \{D_1, D_2, \ldots, D_{N-1}, D_N\}$ as the evaluation levels of the output. Taking the evaluation levels of the output as an example, $D_1$ and $D_N$ represent the lower and upper bounds of the referenced values of the output, respectively. When the referenced values are trained, the new referenced values are generated within this referenced value interval and are arranged in ascending order. The parameters to be optimized are collected in a column vector $P$.

In (11), $\xi(P) = \frac{1}{T}\sum_{m=1}^{T}(y_m - \hat{y}_m)^2$, and $\hat{y}_m$ represents the expected output utility of the BRB system. The smaller $\xi(P)$ is, the more accurately the BRB system simulates the actual system. According to [4], the constraints are defined as follows:

(1) The rule weight lies between 0 and 1, i.e., $0 \le \theta_k \le 1$, $k = 1, 2, \ldots, L$.

(2) The belief degree shall not be greater than 1 or less than 0, i.e., $0 \le \beta_{j,k} \le 1$, $j = 1, 2, \ldots, N$, $k = 1, 2, \ldots, L$.

(3) If the rule is complete, the belief degrees of its output part sum to 1; otherwise, the sum is less than 1, i.e., $\sum_{j=1}^{N} \beta_{j,k} \le 1$, $k = 1, 2, \ldots, L$.

(4) $A_{i,j}$ represents the jth referenced value of $x_i$, and it should be in the range of the input data, i.e., $l_i \le A_{i,j} \le u_i$, where $l_i$ represents the lower limit of the input range and $u_i$ represents the upper limit of the input range.

(5) $D_j$ represents the jth referenced value of $y_m$, and it should be in the range of the output data, i.e., $l \le D_j \le u$, where $l$ represents the lower limit of the output range and $u$ represents the upper limit of the output range.
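The objective and the five constraints can be sketched as a small Python check. The function names, the argument layout, and the tolerance `tol` are assumptions made for illustration; only the mean squared error and the bound conditions come from the text.

```python
import numpy as np

def xi(y, y_hat):
    """Mean squared error between measured outputs y_m and BRB outputs y_hat_m."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

def is_feasible(theta, beta, refs_in, bounds_in, D, bounds_out, tol=1e-9):
    """Check constraints (1)-(5); beta has one row of belief degrees per rule."""
    theta, beta, D = (np.asarray(a, dtype=float) for a in (theta, beta, D))
    ok = bool(np.all((theta >= 0) & (theta <= 1)))           # (1) rule weights
    ok = ok and bool(np.all((beta >= 0) & (beta <= 1)))      # (2) belief degrees
    ok = ok and bool(np.all(beta.sum(axis=1) <= 1 + tol))    # (3) row sums at most 1
    for A, (lo, hi) in zip(refs_in, bounds_in):              # (4) input referenced values
        A = np.asarray(A, dtype=float)
        ok = ok and bool(np.all((A >= lo) & (A <= hi)) and np.all(np.diff(A) > 0))
    lo, hi = bounds_out                                      # (5) output referenced values
    ok = ok and bool(np.all((D >= lo) & (D <= hi)) and np.all(np.diff(D) > 0))
    return ok
```

The `np.diff(...) > 0` tests encode the requirement that the referenced values stay in ascending order after training.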
The three core operations of DE are as follows.

Mutation operation: DE creates a mutant vector $t_i^{gen}$ by employing the following mutation operation:

$$t_i^{gen} = x_{r_1}^{gen} + F \cdot (x_{r_2}^{gen} - x_{r_3}^{gen}),$$

where $r_1, r_2, r_3 \in \{1, 2, \ldots, NP\}$ are randomly selected and $i \ne r_1 \ne r_2 \ne r_3$. $F$ is the scaling factor, which is usually less than 1.

Crossover operation: after mutation, the crossover operation is applied to replace some variables of $x_i^{gen}$ with the corresponding variables of $t_i^{gen}$ to generate a trial vector $v_i^{gen}$ by the following formula:

$$v_{i,j}^{gen} = \begin{cases} t_{i,j}^{gen}, & \text{if } rand_j \le cr \text{ or } j = randN_i, \\ x_{i,j}^{gen}, & \text{otherwise,} \end{cases} \qquad (21)$$

where $rand_j$ is the jth evaluation of a random number uniformly distributed in the range [0, 1], $randN_i$ is a randomly chosen index from the set $\{1, 2, \ldots, N\}$, and $cr \in [0, 1]$ is the crossover probability.
Selection operation: the greedy selection method is used to select the better individual between $v_i^{gen}$ and $x_i^{gen}$ for creating the next generation, which is shown as follows:

$$x_i^{gen+1} = \begin{cases} v_i^{gen}, & \text{if } f(v_i^{gen}) \le f(x_i^{gen}), \\ x_i^{gen}, & \text{otherwise.} \end{cases}$$

3.2.2. Adaptive Meta-Lamarckian Learning Strategy. In recent years, methods for the automated tuning of mutation strategies have attracted increasing attention [33][34][35], and it has been verified that applying multiple mutation strategies can enhance the performance of DE. Ong and Keane [36] proposed an adaptive Meta-Lamarckian learning strategy to dynamically decide which neighborhood to choose for local search, which helps improve search performance and reduces the probability of utilizing an inappropriate local search method. This learning strategy is widely used in local search [37,38]. Inspired by the work of Ong and Keane, we dynamically determine the selection probability of each mutation strategy, which can gradually evolve the most suitable strategy at different learning stages. That is, we select three mutation strategies as candidates: "DE/best/1", "DE/rand/1", and "DE/rand to best/1". In the initial stage of evolution, the selection probability $p_{str_t}$ ($t = 1, 2, 3$) of each mutation strategy $str_t$ ($t = 1, 2, 3$) is set to the same value. As evolution continues, $p_{str_t}$ is updated every $K$ generations.
In each generation, the mutation strategy adopted by each trial individual is determined by the roulette wheel rule. Denote $NP$ as the population size, $sum_{str_t}^{gen}$ as the total number of times $str_t$ has been selected at generation $gen$, $suc_{str_t}^{gen}$ as the total number of successes of $str_t$ at generation $gen$, and $p_{str_t}^{suc}$ as the success probability of $str_t$. With these symbols, the procedure for calculating $p_{str_t}$ is as follows:

Step 1. Set $gen = 0$, $sum_{str_t}^{gen} = 0$, $suc_{str_t}^{gen} = 0$, and $p_{str_t} = 1/3$ for $t = 1, 2, 3$;

Step 2. Set $gen = gen + 1$ and $i = 0$;

Step 3. Set $i = i + 1$ and apply a roulette wheel rule based on $p_{str_t}$ to select one strategy $str$ for the individual $x_i^{gen}$;

Step 4. If $str = str_1$, then $sum_{str_1}^{gen} = sum_{str}$ …

Step 5. If $i < NP$, then go to Step 3;

Step 6. If $gen \bmod K = 0$, then $p_{str_t}^{suc} = suc_{str_t}^{gen} / sum_{str_t}^{gen}$ for $t = 1, 2, 3$, $p_{str_t} = p_{str_t}^{suc} / \sum_{l=1}^{3} p_{str_l}^{suc}$ for $t = 1, 2, 3$, and $sum_{str_t}^{gen} = 0$, $suc_{str_t}^{gen} = 0$;

Step 7. If $p_{str_t} < \rho$ ($t = 1, 2, 3$), then $p_{str_t} = \rho$;

Step 8. If $gen$ < the maximum generation, then go to Step 2.
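The procedure above, combined with the three DE operations, can be sketched in Python as follows. This is an illustrative sketch under several assumptions: the mutation formulas are the textbook forms of "DE/best/1", "DE/rand/1", and "DE/rand to best/1"; a success is counted when the trial vector is strictly better than its parent (Step 4 is truncated in this text); bound violations are handled by simple clipping rather than by the repair equations of the paper; and all parameter defaults are placeholders.

```python
import numpy as np

def mutate(pop, i, best, strategy, F, rng):
    """Mutant vector for individual i under one of three classic DE strategies."""
    others = [r for r in range(len(pop)) if r != i]
    r1, r2, r3 = rng.choice(others, size=3, replace=False)
    if strategy == 0:  # DE/best/1
        return best + F * (pop[r1] - pop[r2])
    if strategy == 1:  # DE/rand/1
        return pop[r1] + F * (pop[r2] - pop[r3])
    # DE/rand to best/1
    return pop[i] + F * (best - pop[i]) + F * (pop[r1] - pop[r2])

def update_probs(suc, total, rho):
    """Steps 6-7: probabilities proportional to success rates, floored at rho."""
    rate = np.divide(suc, total, out=np.zeros(len(suc)), where=total > 0)
    if rate.sum() == 0:
        return np.full(len(suc), 1.0 / len(suc))
    return np.maximum(rate / rate.sum(), rho)

def adaptive_de(f, lower, upper, NP=20, gens=60, F=0.5, cr=0.9, K=5, rho=0.05, seed=0):
    rng = np.random.default_rng(seed)
    dim = len(lower)
    pop = lower + rng.random((NP, dim)) * (upper - lower)  # random initial population
    fit = np.array([f(x) for x in pop])
    p = np.full(3, 1.0 / 3.0)                              # Step 1
    suc, total = np.zeros(3), np.zeros(3)
    history = [fit.min()]
    for gen in range(1, gens + 1):
        best = pop[fit.argmin()].copy()
        new_pop, new_fit = pop.copy(), fit.copy()
        for i in range(NP):
            s = rng.choice(3, p=p / p.sum())               # Step 3: roulette wheel
            t = mutate(pop, i, best, s, F, rng)
            mask = rng.random(dim) <= cr                   # binomial crossover
            mask[rng.integers(dim)] = True
            v = np.clip(np.where(mask, t, pop[i]), lower, upper)  # assumed repair
            fv = f(v)
            total[s] += 1
            if fv < fit[i]:                                # assumed success criterion
                suc[s] += 1
            if fv <= fit[i]:                               # greedy selection
                new_pop[i], new_fit[i] = v, fv
        pop, fit = new_pop, new_fit
        if gen % K == 0:                                   # Steps 6-7
            p = update_probs(suc, total, rho)
            suc[:] = 0
            total[:] = 0
        history.append(fit.min())
    return pop[fit.argmin()], float(fit.min()), p, history
```

Because the floor in Step 7 is applied after normalization, the probabilities may sum to slightly more than 1; the sketch renormalizes them inside the roulette wheel draw.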
Step 7 is used to prevent a mutation strategy from being completely eliminated in the process of evolution.

Through a series of geometric operations (i.e., reflection, expansion, contraction, and reduction), a new vertex can be created. After each geometric operation, the current worst vertex is replaced by the new vertex if the latter is better than the former. With the help of these geometric operations, the simplex can improve itself and come closer to the optimum. The steps of the simplex method are summarized as below:

Step 1. Set $k = 1$ and $v$ …

Step 2. Sort and renumber all the individuals by $f(v$ …

Step 6. Shrink: calculate the new vertices $new\_v_i^{gen}$ using the collapse coefficient $\delta$, which lies between 0 and 1;

The NMS-based local search is embedded in the algorithm as follows:

Step 1. Set $NMS_{size} = NMS_{Popsize} = NP/3$;

Step 2. Reindex the individuals in the new population in increasing order of their fitness values;

Step 3. Select the worst $NP/3$ individuals as the vertices to perform the NM simplex method in Subsection 3.2.3.

For the purpose of increasing the diversity of the population and preventing the algorithm from falling into a local optimum, we add a perturbation operation to the algorithm. That is, at the beginning of each generation, the standard deviation of the fitness function values is calculated as a criterion for judging the degree of individual differences. If the standard deviation is less than the set value $\sigma$, half of the individuals of the population are randomly selected and updated by a perturbation formula, in which $randP_i$ is a randomly chosen index from the set $\{1, 2, \ldots, NP\}$, $j$ represents the index of the selected individuals, $sel\_x_j^{gen}$ is a randomly selected individual, and $gbest\_x^{gen}$ is the best individual at generation $gen$.

In addition, when any individual violates the constraints of BRB's corresponding NCOP, its corresponding elements (see Equation (13)) should be repaired.

On the one hand, local search and global search are balanced by combining the geometric search based on NMS with the evolutionary search based on DE. On the other hand, as NMS uses the center of the good individuals in the population to implement the geometric search, the local search ability is enhanced.
The MNMSDE algorithm's visual diagram is shown in Figure 4.

Experimental Study
4.1. BRB Model Construction. The number of referential points of each input and output determines the size of the rule base, and a reasonable size of the BRB is conducive to the accuracy of the results. In this paper, 5 referenced values are selected for each input and output attribute. Suppose we choose the following referenced values:

$$A_1 \in \{0, 1.5, 2.5, 3.5, 4.75\}, \qquad (41)$$

$$D \in \{245, 650, 1800, 2800, 4500\}. \qquad (43)$$

According to the above referenced values, the prerequisite attribute of the input of each rule is calculated by formulas (4)-(6), and the belief degrees of the corresponding input and output are given by the expert. Then the BRB detection model for the tipping paper permeability can be constructed. The rule base is shown in Table 1. As there are 25 belief rules in the BRB and 5 referenced values for each of the two inputs and the corresponding output, the total number of optimization parameters is 159. Precisely speaking, the optimization parameters include the rule weights $\theta_k$, the belief degrees $\beta_{j,k}$, the referenced values of the input attributes $A_i^k$, and the referenced values of the output attribute $D_k$.
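One way to arrive at the figure of 159 is the following count. It assumes, which the text does not state explicitly, that the smallest and largest referenced values of each attribute stay fixed as bounds, so that only the three interior values of each of the two inputs and of the output are trained:

$$\underbrace{25}_{\theta_k} + \underbrace{25 \times 5}_{\beta_{j,k}} + \underbrace{3 \times (5 - 2)}_{\text{interior referenced values}} = 25 + 125 + 9 = 159.$$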

4.2. Experimental Results and Comparisons. In order to verify the effectiveness of the proposed model and method, we used 1000 data samples, of which 800 samples are randomly selected for training and the remaining 200 samples are used for testing. The training and testing samples are collected from a Chinese cigarette factory. Each sample includes the hole area, the average gray value, and the corresponding permeability on a unit area. The former two are the input values, and they are calculated from the image captured with the industrial camera of our designed device. The last one is the output value, which is obtained via traditional instruments. All compared algorithms are coded in MATLAB, and the experiments are executed on an Intel 2.6 GHz PC with 8 GB RAM. To make a fair comparison, each algorithm is independently run 30 times with the same number of evaluations.
For the purpose of demonstrating the effectiveness of MNMSDE, we carry out some comparisons with fmincon [16] and standard DE [20]. The mutation strategy used in … 2 and 3, the optimized model parameters in Table 1 and (41)-(43) are displayed in the attachment. Figures 5 and 6 show the permeability measurement results of the original BRB and MNMSDE-BRB, respectively. Figures 5 and 6 show that MNMSDE-BRB has better fitting and prediction precision than the original BRB. That is to say, the specially designed DE (i.e., MNMSDE) with the new BRB model is able to build a more reasonable model for the considered problem.
The relative errors are listed in Table 2. From Table 2, it can be seen that the new BRBs with optimized referenced values perform better than the traditional BRBs (i.e., TBRBs). This confirms the validity and rationality of adding the referenced values of both input and output attributes to the optimization parameters of BRB. As for DE-BRB, MNMSDE-BRB, MNMSDE_V1-BRB, and MNMSDE_V2-BRB, the sequence (DE-BRB, MNMSDE_V1-BRB, MNMSDE_…

In order to reflect the degree of dispersion between the testing samples and the corresponding estimated permeability values, the RSD, ASD, and RMSE values of the compared methods are given in Table 3. The smaller the RSD/ASD/RMSE is, the more stable the soft-sensor method is. So, it is concluded from Table 3 that MNMSDE-BRB is a robust method.
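For readers who wish to reproduce this kind of comparison, the sketch below computes per-sample relative errors and the RMSE. The definitions used here, $|\hat{y} - y| / |y|$ and the root of the mean squared error, are the conventional ones and are assumed rather than taken from the paper. RSD and ASD are not defined in this text and are therefore omitted.

```python
import numpy as np

def relative_errors(y_true, y_pred):
    """Per-sample relative error |y_pred - y_true| / |y_true| (assumed definition)."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.abs(y_pred - y_true) / np.abs(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between measured and estimated permeability."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```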
To sum up, MNMSDE-BRB is an effective and robust method for tipping paper permeability measurement.

Conclusion
With a view to modelling an effective BRB with less expert knowledge for the tipping paper permeability measurement problem, this paper proposes a novel soft-sensor method based on BRB and DE. Simulations and comparisons based on real data from a Chinese tobacco factory demonstrate the effectiveness and robustness of the proposed method. There are two main features of the proposed method. First, in order to improve the reasoning or fitting ability of the traditional BRB model and reduce the dependence on expert knowledge, the referenced values of both input and output attributes are added to the optimization parameters of the traditional model, which yields a more flexible model for the considered problem. Second, a hybrid DE algorithm, namely, MNMSDE, is designed to find reasonable parameters of the BRB model in a more complex solution space; it incorporates the NM simplex method, the perturbation operation, and the Meta-Lamarckian learning strategy into a standard DE to enhance its search ability. That is, both the accuracy of the BRB model and the effectiveness of the optimization algorithm are improved. The proposed MNMSDE-BRB has been used in our designed permeability measuring device. This new device has so far been applied in a famous tobacco factory in China and has proved to be quite effective and economical. In our future work, we will generalize MNMSDE-BRB to the diagnosis of respiratory disease, which requires dynamically changing the size of the belief rule base in accordance with the actual situation and removing the noise in the data.

Figure 1: A cigarette with perforated tipping paper.

Figure 2: A gray image of perforated tipping paper.

3.2. MNMSDE for BRB Model

3.2.1. Differential Evolution Algorithm. The differential evolution algorithm is an intelligent algorithm based on population evolution, and it solves the optimization problem through the cooperation and competition among the individuals in the population. In this paper, the objective function is $\min f(x)$, $x = P = [x_1, x_2, \ldots, x_{DM}]$. Denote $gen$ as the generation, $x_i^{gen}$ as the individual at generation $gen$, and $t_i^{gen}$ as the mutant vector relative to $x_i^{gen}$. The initial population $x_i = [x_{i,1}^{gen}, x_{i,2}^{gen}, \ldots, x_{i,DM}^{gen}]$, $i = 1, 2, \ldots, NP$, is randomly generated.

Figure 3: Basic ideas and processes of BRB system parameter optimization.

3.2.3. NM Simplex Method. The NM simplex method forms a simplex by setting the individuals $x_1^{gen}, x_2^{gen}, \ldots, x_{NMS_{size}}^{gen}$ as vertices, where $NMS_{size}$ is the number of vertices.

Step 3. Reflection: calculate the reflection point $p_r^{gen}$ as follows:

$$p_r^{gen} = v_{center}^{gen} + \alpha \cdot (v_{center}^{gen} - v_{NMS_{size}}^{gen}), \qquad (26)$$

where the geometric center is $v_{center}^{gen} = \sum_{i=1}^{NMS_{size}-1} v_i^{gen} / (NMS_{size} - 1)$ and the reflection coefficient $\alpha$ is greater than 0; … then go to Step 4; if $fitness(p_r^{gen}) \ge fitness(v_{NMS_{size}-1}^{gen})$, then go to Step 5.

Step 4. Expansion: generate the expansion point $p$ …

Step 5. Outside contraction: if $fitness(p_r^{gen}) < fitness(v_{NMS_{size}}^{gen})$, then produce the external contraction point $p$ … $p_c^{gen}$, $k = k + 1$, (34) and go to Step 7; otherwise, go to Step 6;
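Since several of the simplex steps are truncated in this text, the following Python sketch shows one iteration of the textbook Nelder-Mead method for minimization rather than the exact variant used in MNMSDE. The reflection and the geometric center follow Equation (26). The expansion, contraction, and shrink rules and the coefficients `gamma`, `rho`, and `delta` are standard choices assumed for illustration.

```python
import numpy as np

def nelder_mead_step(vertices, f, alpha=1.0, gamma=2.0, rho=0.5, delta=0.5):
    """One textbook Nelder-Mead iteration on an (n, d) array of vertices."""
    # Sort the vertices from best (lowest f) to worst.
    vertices = np.array(sorted(vertices, key=f), dtype=float)
    best, worst = vertices[0].copy(), vertices[-1].copy()
    f_best, f_second_worst, f_worst = f(best), f(vertices[-2]), f(worst)
    # Geometric center of all vertices except the worst one.
    center = vertices[:-1].mean(axis=0)
    # Reflection: p_r = center + alpha * (center - worst), as in Equation (26).
    p_r = center + alpha * (center - worst)
    f_r = f(p_r)
    if f_r < f_best:
        # Expansion: move further along the reflection direction.
        p_e = center + gamma * (p_r - center)
        vertices[-1] = p_e if f(p_e) < f_r else p_r
    elif f_r < f_second_worst:
        vertices[-1] = p_r
    else:
        # Outside contraction if the reflected point beats the worst, else inside.
        if f_r < f_worst:
            p_c = center + rho * (p_r - center)
        else:
            p_c = center + rho * (worst - center)
        if f(p_c) < min(f_r, f_worst):
            vertices[-1] = p_c
        else:
            # Shrink every vertex towards the best one with collapse coefficient delta.
            vertices[1:] = best + delta * (vertices[1:] - best)
    return vertices
```

The best vertex is never replaced by a worse point in any branch, so the best objective value of the simplex cannot increase from one iteration to the next.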

Figure 5: The testing samples and output of original BRB.