Modelling and Prediction of Particulate Matter, NOx, and Performance of a Diesel Vehicle Engine under Rare Data Using Relevance Vector Machine

Traditionally, the performance maps and emissions of a diesel engine are obtained empirically through many tests on dynamometers because no exact mathematical engine model exists. In the current literature, many artificial-neural-network (ANN)-based approaches have been developed for diesel engine modelling. However, ANN has drawbacks that make it difficult to put into practice, including multiple local minima, the user burden of selecting an optimal network structure, the large training data size, and the risk of overfitting. To overcome these drawbacks, this paper proposes to apply an emerging technique, the relevance vector machine (RVM), to model the diesel engine and to predict its emissions and performance. With RVM, only a few experimental data sets are needed to train the model owing to its globally optimal solution. In this study, the engine speed, load, and coolant temperature are used as the input parameters, while the brake thermal efficiency, brake-specific fuel consumption, and concentrations of nitrogen oxides and particulate matter are used as the output parameters. Experimental results show that the model accuracy is fairly good even when the training data are scarce. Moreover, the model accuracy is compared with that of a typical ANN. Evaluation results also show that RVM is superior to the typical ANN approach.


Introduction
Air pollution is one of the most challenging problems today in many cities. The increased use of motor vehicles causes the amount of exhaust emissions to increase dramatically, which makes the problem more serious. Reducing the exhaust emissions from engines has therefore become an important concern of governments and motor vehicle manufacturers. Moreover, in view of the increasing oil price and the need to reduce emission of the global warming gas CO2, there is a demand to reduce fuel consumption while maintaining engine performance. Therefore, many researchers have focused on the relations between these two issues, namely, engine performance and emissions.
Diesel engines, though having the advantages of high fuel efficiency and high durability when compared to other engines, are the major source of nitrogen oxides (NOx) and particulate matter (PM), which are harmful to human health and the environment. In particular, the fine and ultrafine particles (∼10 micrometers or less) emitted by diesel engines can accumulate in the human respiratory system and cause various health problems [1] and influence global climate by absorbing solar radiation and reacting with other atmospheric constituents [2, 3]. Diesel engines are used extensively in buses and trucks; thus they are the major road-side emitters, posing a significant threat to the health of road users. In order to reduce these emissions, the combustion process of the engines has to be controlled. Additional hardware and instruments must be installed to monitor and control the engine operating parameters. Many experiments and tests must also be conducted to obtain a comprehensive understanding of the performance and emissions of the diesel engine. These are very complicated, time consuming, and expensive [4].
A way to solve these problems is to create a mathematical model of the diesel engine so that all the costly and immeasurable data can be predicted and virtual sensors can be used to replace the costly sensors. However, the combustion process of a diesel engine is so complex that an exact mathematical model still does not exist today. Figure 1 shows one example of a diesel engine performance map with only three variables, in which the relationship among the engine load (torque), engine speed, and brake-specific fuel consumption is already highly nonlinear. If more variables are studied together, the model becomes very complicated and very difficult to obtain. Moreover, the mathematical model varies for different engines.
In general, black-box identification is one of the commonly used modelling techniques suitable for engines because it can manage complex and uncertain information. Many recent studies in black-box identification have described the use of artificial neural networks (ANN) for modelling diesel engine performance [5-9] and emissions [7, 8, 10-13] based on experimental data sets. ANN has, however, three main drawbacks in its learning process [14].
(1) The architecture, including the number of hidden neurons, has to be determined a priori or modified heuristically during training, which results in a suboptimal network structure.
(2) The training process (i.e., the minimization of the residual squared error cost function) in ANN can easily become stuck in local minima. Various ways of preventing local minima, such as early stopping and weight decay, have been employed. However, these methods greatly affect the generalization of the estimated function (i.e., the capacity to handle new input cases).
(3) The amount of training data is usually large. Normally at least 200 to 400 sets of training data are required to build an accurate ANN engine model [15]. However, the collection of diesel engine emission and performance data is usually time consuming and costly, so the number of data sets is usually fewer than 50, and ANN may therefore not be a good solution for diesel engine modelling.
To overcome the disadvantages of ANN, an algorithm called the relevance vector machine (RVM) was proposed by Tipping [16]. This approach is an emerging machine learning technique that is able to utilize more flexible candidate models, which are typically much sparser, offer probabilistic prediction, and avoid the need to set additional hyperparameters. Another advantage is that the training algorithm of RVM can ensure a globally optimal solution, whereas the learning process of ANN may end in a locally optimal solution, so ANN requires more training data to minimize that risk [14]. With this property, RVM is unlikely to require much sample data to build an accurate model. However, one deficiency of this approach is that the training time scales approximately with the cube of the number of samples. Fortunately, a fast training algorithm [17] has been developed for RVM, which initializes with an "empty" model and sequentially "adds" samples to increase the marginal likelihood while also modifying their weights. Within the same principled framework, the objective function can also be increased by deleting samples that subsequently become redundant.
Recently, RVM has been applied to system modelling and predictive control [18-20]. These studies show that RVM is generally superior to ANN. However, very few studies have applied RVM to the modelling of diesel engines under rare data. For these reasons, in the present paper, RVM is employed to model the performance and the NOx and PM emission characteristics of the diesel engine. Experiments are still required to provide sample data for RVM training. To demonstrate the effectiveness of this approach, a neural-network-based diesel engine model is also constructed and compared with the RVM model.

Relevance Vector Machine
The procedure of RVM modelling is introduced here. Consider a training data set D of N input vectors $\{X_n\}_{n=1}^{N}$, along with N corresponding scalar-valued outputs $\{y_n\}_{n=1}^{N}$. The output $y_n$ is assumed to contain zero-mean Gaussian noise with variance $\sigma^2$. Hence, the prediction error $\varepsilon_n$ for $y_n$ follows a Gaussian distribution with zero mean and variance $\sigma^2$: $p(\varepsilon_n \mid \sigma^2) = \mathcal{N}(0, \sigma^2)$, with

$$y_n = f(X_n, w) + \varepsilon_n. \tag{1}$$

That is,

$$p(y_n \mid X_n, w, \sigma^2) = \mathcal{N}\big(f(X_n, w), \sigma^2\big), \tag{2}$$

where $f(X_n, w)$ in (1) is the prediction model for the model output $y_n$ with the input $X_n$, and $w = [w_0, w_1, \ldots, w_N]^T$ is the weight vector of the RVM model ($w_0$ being the bias weight).
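The predicted output y at an input X in the kernel model can be represented by

$$y = f(X, w) = \sum_{i=1}^{N} w_i K(X, X_i) + w_0, \tag{3}$$

where $K(X, X_i)$ is a basis function and $\Phi$ is the $N \times (N+1)$ design matrix whose $n$th row is $[1, K(X_n, X_1), \ldots, K(X_n, X_N)]$, so that the model outputs at the training inputs are $\Phi w$. In this research, the radial basis function (RBF) is chosen as the basis function K because it is commonly used for modelling problems. With $y = [y_1, \ldots, y_N]^T$, the approach for estimating w is to maximize the likelihood in

$$p(y \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\|y - \Phi w\|^2}{2\sigma^2}\right). \tag{4}$$

The likelihood function in (4) is complemented by a prior over the weights $w = \{w_i\}$, $i = 0$ to $N$, to control the complexity of the model function and avoid overfitting. The prior is a zero-mean Gaussian probability distribution defined over every weight $w_i$ as follows:

$$p(w \mid \alpha) = \prod_{i=0}^{N} \mathcal{N}\big(w_i \mid 0, \alpha_i^{-1}\big). \tag{5}$$

The hyperparameter vector, $\alpha = [\alpha_0, \ldots, \alpha_N]^T$, controls how far each weight $w_i$ is allowed to deviate from zero. Consequently, using Bayes' rule, the posterior over w is given as follows:

$$p(w \mid y, \alpha, \sigma^2) = \frac{p(y \mid w, \sigma^2)\, p(w \mid \alpha)}{p(y \mid \alpha, \sigma^2)}, \tag{6}$$

where $p(y \mid \alpha, \sigma^2)$ is the normalizing factor. The likelihood $p(y \mid w, \sigma^2)$ and the prior $p(w \mid \alpha)$ are both Gaussian, so the posterior is Gaussian as well. The posterior mean $\mu$ and covariance $\Sigma$ are as follows [17]:

$$\Sigma = \left(\sigma^{-2}\Phi^T\Phi + A\right)^{-1}, \qquad \mu = \sigma^{-2}\,\Sigma\,\Phi^T y, \tag{7}$$

where A is defined as $\mathrm{diag}(\alpha_0, \ldots, \alpha_N)$. In fact, the w in (3) can be set to the fixed $\mu$ for the purpose of point prediction.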
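As a minimal sketch of the posterior computation, the following NumPy code builds an RBF design matrix and evaluates the posterior covariance and mean of the weights for fixed hyperparameters. It assumes the standard sparse Bayesian forms $\Sigma = (\sigma^{-2}\Phi^T\Phi + A)^{-1}$ and $\mu = \sigma^{-2}\Sigma\Phi^T y$; the function names, the Gaussian kernel width, and the toy data are illustrative assumptions and are not taken from the experiments.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    """Rows are [1, K(x, c_1), ..., K(x, c_N)] with a Gaussian RBF kernel (assumed form)."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-sq_dist / (2.0 * width ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), K])  # leading column = bias term

def weight_posterior(Phi, y, alpha, sigma2):
    """Posterior mean and covariance of the weights for fixed alpha and sigma^2."""
    A = np.diag(alpha)
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + A)
    mu = Sigma @ Phi.T @ y / sigma2
    return mu, Sigma

# Toy one-dimensional data (made up for illustration only).
X = np.array([[-1.0], [-0.5], [0.0], [0.5], [1.0]])
y = np.sin(np.pi * X[:, 0])
Phi = rbf_design_matrix(X, X, width=0.5)
mu, Sigma = weight_posterior(Phi, y, alpha=np.ones(Phi.shape[1]), sigma2=0.1)
y_fit = Phi @ mu  # point prediction at the training inputs using w = mu
```

Here every weight shares the same hyperparameter value, so the result is an ordinary regularized fit; sparsity only appears once the hyperparameters are optimized individually.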
Rather than extending the model to include Bayesian inference over those hyperparameters (which is analytically intractable), a most-probable point estimate, $\alpha_{MP}$, may be found via a type II maximum likelihood procedure. This is called sparse Bayesian learning, which is formulated as the local maximization with respect to $\alpha$ of the marginal likelihood or, equivalently, its logarithm $L(\alpha)$:

$$L(\alpha) = \log p(y \mid \alpha, \sigma^2) = -\frac{1}{2}\left[N \log 2\pi + \log|C| + y^T C^{-1} y\right], \tag{8}$$

where

$$C = \sigma^2 I + \Phi A^{-1} \Phi^T. \tag{9}$$

The covariance, $\Sigma_{MP} = \Sigma$, can be obtained by substituting $\alpha = \alpha_{MP}$ into A in (7), so that the posterior mean weight, $\mu_{MP}$, is obtained by evaluating (7) again with $\Sigma = \Sigma_{MP}$, giving a final (posterior mean) approximator:

$$Y = f(X^*, \mu_{MP}), \tag{10}$$

where Y is the prediction of the model output for the unseen input data $X^*$. One crucial observation is that typically the optimal values of many hyperparameters are infinite [16]. With (7), this leads to a parameter posterior infinitely peaked at zero for many weights $w_i$, with the consequence that $\mu_{MP}$ correspondingly comprises very few nonzero elements.
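The log marginal likelihood can be evaluated directly, as in the sketch below. It assumes the usual form $L(\alpha) = -\frac{1}{2}[N \log 2\pi + \log|C| + y^T C^{-1} y]$ with $C = \sigma^2 I + \Phi A^{-1}\Phi^T$; basis functions with $\alpha_i = \infty$ are simply dropped from C. The function name and the small numeric example are hypothetical.

```python
import numpy as np

def log_marginal_likelihood(Phi, y, alpha, sigma2):
    """Evaluate L(alpha); columns of Phi with alpha_i = inf contribute nothing to C."""
    Phi = np.asarray(Phi, dtype=float)
    y = np.asarray(y, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    n = y.shape[0]
    keep = np.isfinite(alpha)
    # C = sigma^2 I + sum_i alpha_i^{-1} phi_i phi_i^T over the retained columns
    C = sigma2 * np.eye(n) + (Phi[:, keep] / alpha[keep]) @ Phi[:, keep].T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + y @ np.linalg.solve(C, y))

# Made-up numbers: a bias column and one further basis column.
Phi = np.array([[1.0, 0.2], [1.0, 0.9], [1.0, 0.4]])
y = np.array([0.5, 1.5, 0.8])
L_empty = log_marginal_likelihood(Phi, y, np.array([np.inf, np.inf]), sigma2=0.25)
L_bias = log_marginal_likelihood(Phi, y, np.array([1.0, np.inf]), sigma2=0.25)
```

In this toy case `L_bias` exceeds `L_empty`, which is the kind of comparison that decides whether a basis function is worth including.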
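A recent analysis has shown that $L(\alpha)$ has a unique maximum with respect to $\alpha_i$ [16]:

$$\alpha_i = \begin{cases} \dfrac{s_i^2}{q_i^2 - s_i}, & q_i^2 > s_i, \\ \infty, & q_i^2 \le s_i, \end{cases} \tag{11}$$

where $s_i = \phi_i^T C_{-i}^{-1}\phi_i$ and $q_i = \phi_i^T C_{-i}^{-1} y$, with $\phi_i$ the $i$th column of $\Phi$ and $C_{-i}$ the matrix C with the contribution of $\phi_i$ removed. In practice, it is easier to maintain the quantities

$$S_i = \phi_i^T C^{-1}\phi_i, \qquad Q_i = \phi_i^T C^{-1} y,$$

and from these, it simply follows:

$$s_i = \frac{\alpha_i S_i}{\alpha_i - S_i}, \qquad q_i = \frac{\alpha_i Q_i}{\alpha_i - S_i}.$$

Note that when $\alpha_i = \infty$, $s_i = S_i$ and $q_i = Q_i$. It is then convenient to utilize the Woodbury identity to obtain the quantities of interest:

$$S_i = \phi_i^T B \phi_i - \phi_i^T B \Phi \Sigma \Phi^T B \phi_i, \qquad Q_i = \phi_i^T B y - \phi_i^T B \Phi \Sigma \Phi^T B y,$$

where $B = \sigma^{-2} I$, and $\Phi$ and $\Sigma$ here contain only the basis functions currently included in the model. The results of (14) and (15) imply that (1) if $\phi_i$ is included in the model ($\alpha_i < \infty$) and $q_i^2 \le s_i$, $\phi_i$ can be deleted (i.e., set $\alpha_i = \infty$); (2) if $\phi_i$ is excluded from the model ($\alpha_i = \infty$) and $q_i^2 > s_i$, $\phi_i$ can be added (i.e., set $\alpha_i$ to some optimal finite value).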
To train and update the RVM model dynamically, a sequential learning algorithm is required. The algorithm starts with an empty model and sequentially adds basis functions to increase the marginal likelihood, modifying their weights as it proceeds. Within the same principled framework, the likelihood can also be increased by deleting those basis functions that subsequently become redundant. Since the algorithm sequentially adds basis functions to, or deletes them from, the model, the likelihood can be continually increased, and this mechanism makes online model update feasible. The steps of the sequential learning algorithm are shown below.
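(1) Initialize $\sigma^2$ to some sensible value.
(2) Initialize $S_i$ and $Q_i$ with a single basis vector $\phi_i$ from (13) and (14) and compute the new $\alpha_i$ from (11), which can be simplified as $\alpha_i = \|\phi_i\|^2 / \left(\|\phi_i^T y\|^2 / \|\phi_i\|^2 - \sigma^2\right)$; all other hyperparameters are set to infinity.
(3) Explicitly compute $\Sigma$ and $\mu$ (which are scalars initially), along with initial values of $s_i$ and $q_i$ for all N basis functions $\phi_i$.
(4) Select a candidate basis vector $\phi_i$ from the set of all N basis functions.
(5) Compute $\theta_i = q_i^2 - s_i$.
(6) If $\theta_i > 0$ and $\alpha_i < \infty$, then reestimate $\alpha_i$.
(7) If $\theta_i > 0$ and $\alpha_i = \infty$, then add $\phi_i$ to the model with updated $\alpha_i$.
(8) If $\theta_i \le 0$ and $\alpha_i < \infty$, then delete $\phi_i$ from the model and set $\alpha_i = \infty$.
(9) Update the noise variance $\sigma^2$.
(10) Recompute $\Sigma$, $\mu$, and all $s_i$ and $q_i$.
(11) If converged then terminate, otherwise go to Step 4.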
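The add, reestimate, and delete logic of the steps above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions rather than the efficient procedure of [17]: the noise variance is held fixed (no noise update), candidates are visited in cyclic order, all quantities are recomputed from scratch instead of being updated incrementally, and $\theta_i = q_i^2 - s_i$ is used as the decision quantity. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def sequential_sparse_bayes(Phi, y, sigma2, n_sweeps=50, tol=1e-6):
    """Greedy add / reestimate / delete of basis functions with fixed noise variance."""
    Phi = np.asarray(Phi, dtype=float)
    y = np.asarray(y, dtype=float)
    n_basis = Phi.shape[1]
    alpha = np.full(n_basis, np.inf)        # alpha_i = inf means "not in the model"

    # Start from the single basis vector best aligned with y.
    norm2 = (Phi ** 2).sum(axis=0)
    proj = (Phi.T @ y) ** 2 / norm2
    i0 = int(np.argmax(proj))
    alpha[i0] = norm2[i0] / (proj[i0] - sigma2) if proj[i0] > sigma2 else 1.0

    for _ in range(n_sweeps):
        max_change = 0.0
        for i in range(n_basis):            # candidate selection: simple cyclic order
            active = np.isfinite(alpha)
            Phi_a = Phi[:, active]
            # Posterior over the active weights, then S_i and Q_i in Woodbury form.
            Sigma = np.linalg.inv(Phi_a.T @ Phi_a / sigma2 + np.diag(alpha[active]))
            mu = Sigma @ Phi_a.T @ y / sigma2
            cross = Phi_a.T @ Phi[:, i] / sigma2
            S = norm2[i] / sigma2 - cross @ Sigma @ cross
            Q = Phi[:, i] @ y / sigma2 - cross @ mu
            if active[i]:
                s = alpha[i] * S / (alpha[i] - S)
                q = alpha[i] * Q / (alpha[i] - S)
            else:
                s, q = S, Q
            theta = q ** 2 - s
            if theta > 0:                   # reestimate (if active) or add (if not)
                new_alpha = s ** 2 / theta
                change = abs(np.log(new_alpha) - np.log(alpha[i])) if active[i] else np.inf
                alpha[i] = new_alpha
            elif active[i] and active.sum() > 1:   # delete a redundant basis
                alpha[i] = np.inf
                change = np.inf
            else:
                change = 0.0
            max_change = max(max_change, change)
        if max_change < tol:                # converged: a full sweep changed nothing
            break

    active = np.isfinite(alpha)
    Phi_a = Phi[:, active]
    Sigma = np.linalg.inv(Phi_a.T @ Phi_a / sigma2 + np.diag(alpha[active]))
    weights = np.zeros(n_basis)
    weights[active] = Sigma @ Phi_a.T @ y / sigma2
    return alpha, weights

# Synthetic example: only columns 3 and 7 actually generate y.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(30, 10))
y = 2.0 * Phi[:, 3] - 3.0 * Phi[:, 7] + 0.1 * rng.normal(size=30)
alpha, weights = sequential_sparse_bayes(Phi, y, sigma2=0.01)
```

Basis functions whose `alpha` stays infinite receive a weight of exactly zero, which is where the sparsity of the final model comes from.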
It should be noted that the RVM modelling algorithm is a multi-input but single-output modelling method. Therefore, an individual model has to be constructed for each output. A multi-input/multi-output model can then easily be obtained by combining all the individual models.
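One way to picture the combination of single-output models is a thin wrapper that evaluates each model on the same input vector. The sketch below is only an illustration: `combine_models` is a hypothetical helper, and the four lambda functions are placeholders standing in for trained RVM models, not fitted relationships.

```python
from typing import Callable, Dict, Sequence

SingleOutputModel = Callable[[Sequence[float]], float]

def combine_models(models: Dict[str, SingleOutputModel]) -> Callable[[Sequence[float]], Dict[str, float]]:
    """Wrap several single-output models into one multi-output predictor."""
    def predict(x: Sequence[float]) -> Dict[str, float]:
        return {name: model(x) for name, model in models.items()}
    return predict

# Placeholder models (NOT fitted): x = [speed, load, coolant temperature], normalized.
engine_model = combine_models({
    "BSFC": lambda x: 250.0 - 20.0 * x[1],
    "BTE": lambda x: 30.0 + 5.0 * x[1],
    "NOx": lambda x: 400.0 + 100.0 * x[1],
    "PM": lambda x: 1000.0 * (1.0 + x[0]),
})

outputs = engine_model([0.0, 0.5, -1.0])
```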

Experimental Setup
Sample data sets are required for RVM training and are generally collected through experiments. In this study, the experiments were conducted on a naturally aspirated, water-cooled, 4-cylinder, direct-injection diesel engine. The specifications of the engine are shown in Table 1.
The engine was connected to an eddy-current dynamometer, and a control system was used for adjusting its speed and torque. Ultralow sulfur diesel fuel containing less than 10 ppm-wt sulfur was adopted in the test. The experimental setup is illustrated in Figure 2.
The experiments were carried out at engine speeds of 1200, 1400, 1600, 1800, and 2000 rpm, each at engine loads of 28, 70, 140, 210, and 252 Nm. For each test, the volumetric flow rate of fuel was measured using a measuring cylinder and then converted into a mass consumption rate, which was used to calculate the brake-specific fuel consumption (BSFC) and the brake thermal efficiency (BTE). The gaseous species in the engine exhaust, including CO, CO2, and NOx, were measured on a continuous basis using the Anapol EU5000 exhaust gas analyzer, which is suitable for measuring diesel engine emissions. The Anapol EU5000 used infrared sensors for measuring CO and CO2 concentrations and chemical cells for measuring NO and NO2 to obtain the NOx concentration. The gas analyzer was calibrated with standard and zero gases before each experiment. Particulate mass concentration was measured with a tapered element oscillating microbalance (TEOM, Series 1105, Rupprecht & Patashnick Co., Inc.). The exhaust gas from the engine was diluted with a Dekati minidiluter before passing through the TEOM. The dilution ratio (DR) was evaluated based on the following equation:

$$\mathrm{DR} = \frac{[\mathrm{CO_2}]_{\mathrm{exhaust}} - [\mathrm{CO_2}]_{\mathrm{background}}}{[\mathrm{CO_2}]_{\mathrm{diluted}} - [\mathrm{CO_2}]_{\mathrm{background}}},$$

where $[\mathrm{CO_2}]_{\mathrm{exhaust}}$, $[\mathrm{CO_2}]_{\mathrm{diluted}}$, and $[\mathrm{CO_2}]_{\mathrm{background}}$ represent the undiluted, the diluted, and the background CO2 concentrations, respectively. The dilution ratio was around 8 in the tests.
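As a small worked example of the dilution ratio, the sketch below assumes the usual CO2-based definition, DR = (exhaust − background) / (diluted − background). The function name and the concentration values are illustrative assumptions, chosen only so that the result is close to the ratio of about 8 reported for the tests.

```python
def dilution_ratio(co2_exhaust, co2_diluted, co2_background):
    """CO2-based dilution ratio; all three concentrations must share the same unit."""
    denominator = co2_diluted - co2_background
    if denominator <= 0:
        raise ValueError("diluted CO2 must exceed the background CO2 concentration")
    return (co2_exhaust - co2_background) / denominator

# Illustrative concentrations in vol% (not measured values from this study).
dr = dilution_ratio(co2_exhaust=8.0, co2_diluted=1.04, co2_background=0.04)
```

A particulate concentration measured in the diluted stream would then be multiplied by this ratio to refer it back to the raw exhaust.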
At each speed and load, data were recorded after the engine had reached the steady state, which was indicated by the lubricating oil temperature and the coolant temperature. To reduce experimental uncertainties and ensure the repeatability of the test data, the data were recorded continuously for 5 minutes, and each test was carried out three times. The average values were used in this research.
Based on the measured data, the following parameters are derived.

Brake thermal efficiency:

$$\mathrm{BTE} = \frac{P_b}{\dot{m}_f \cdot \mathrm{LHV}} \times 100\%$$

Brake-specific fuel consumption:

$$\mathrm{BSFC} = \frac{\dot{m}_f}{P_b}$$

where $P_b$ is the brake power calculated from the measured torque and engine speed, $\dot{m}_f$ is the mass flow rate of the diesel fuel, and LHV is the lower heating value of the diesel fuel.
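The derived quantities can be computed as in the following sketch, which assumes the standard definitions BTE = P_b / (ṁ_f · LHV) and BSFC = ṁ_f / P_b, with brake power obtained from torque and speed as P_b = 2π · (speed / 60) · torque. The fuel flow rate of 6 kg/h and the LHV of 42.5 MJ/kg are assumed values for illustration; only the 1800 rpm and 140 Nm operating point is taken from the test matrix.

```python
import math

def brake_power_w(torque_nm, speed_rpm):
    """Brake power in watts from torque (Nm) and engine speed (rpm)."""
    return 2.0 * math.pi * speed_rpm / 60.0 * torque_nm

def bsfc_g_per_kwh(fuel_kg_per_h, power_w):
    """Brake-specific fuel consumption in g/kWh."""
    return (fuel_kg_per_h * 1000.0) / (power_w / 1000.0)

def bte_percent(fuel_kg_per_h, power_w, lhv_mj_per_kg):
    """Brake thermal efficiency in percent."""
    fuel_power_w = fuel_kg_per_h / 3600.0 * lhv_mj_per_kg * 1e6
    return 100.0 * power_w / fuel_power_w

# Operating point from the test matrix; fuel flow and LHV are assumed numbers.
p_b = brake_power_w(torque_nm=140.0, speed_rpm=1800.0)
bsfc = bsfc_g_per_kwh(fuel_kg_per_h=6.0, power_w=p_b)
bte = bte_percent(fuel_kg_per_h=6.0, power_w=p_b, lhv_mj_per_kg=42.5)
```

With these units the two quantities are tied together by BTE [%] = 360000 / (BSFC [g/kWh] × LHV [MJ/kg]), so a lower BSFC always corresponds to a higher BTE for a given fuel.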

Application of RVM and Modelling Results
To evaluate the effectiveness of RVM, the prediction models were built based on the experimental data. As the collection of the experimental data is time consuming and costly, only 22 data sets corresponding to different load and speed settings were collected from the experiments, which are shown in Table 2. Eighteen of the sets were used as the training data for model construction, and the remaining 4 sets were used for model validation and testing. In fact, several weeks were required to collect the 22 data sets to a professional standard. Table 3 illustrates the use of each of the data sets.
The measured parameters in each of the data sets can be separated into two categories, namely the input parameters and the output parameters. Engine speed and engine load are the two most important independent parameters that affect the engine performance and emissions, so they are included in the input parameters. The coolant controls the engine temperature, so the coolant temperature is regarded as an important factor and is also treated as an input parameter. The brake-specific fuel consumption and the brake thermal efficiency represent the engine performance; thus, they are used as output parameters. Moreover, NOx and particulate matter are the two most serious exhaust emissions from diesel engines. Therefore, the output parameters also include the NOx concentration and the particulate mass concentration.
The RVM modelling was implemented using MATLAB. There are three input parameters and four output parameters, so four individual RVM models have to be built. Moreover, in order to obtain a more accurate modelling result and to prevent any input parameter from dominating the output value, the input data are conventionally normalized before training [21]. In this study, all the input values were normalized within the range [−1, 1].
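A minimal sketch of such a normalization is given below. It assumes linear min-max scaling, x' = 2(x − min)/(max − min) − 1, which is one common way to map values into [−1, 1]; the text does not state which scaling formula was actually used, and the function names are hypothetical.

```python
def fit_minmax(values):
    """Return the (min, max) pair used for scaling; taken from the training data only."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant input")
    return lo, hi

def to_unit_range(x, lo, hi):
    """Map x linearly so that lo -> -1 and hi -> +1."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

# Engine speeds from the test matrix (rpm).
speeds = [1200, 1400, 1600, 1800, 2000]
lo, hi = fit_minmax(speeds)
scaled = [to_unit_range(v, lo, hi) for v in speeds]
```

The same `lo` and `hi` obtained from the training sets would be reused for the test sets, so that both are expressed on one common scale.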
To verify the accuracy of the RVM model, the predicted output values are compared with the actual values from the test data sets and shown in Figures 3, 4, 5, and 6. The corresponding prediction errors are expressed as the mean absolute percentage error (MAPE), evaluated against the experimental data sets using (21). Moreover, the fraction of variance (R-squared value) is also calculated using (22) and (23). The smaller the MAPE, the better the modelling accuracy. In addition, the best possible value of $R^2$ is 1.

$$\mathrm{MAPE} = \frac{1}{N_t}\sum_{k=1}^{N_t}\left|\frac{y_k - f(X_k)}{y_k}\right| \times 100\%, \tag{21}$$

$$R^2 = 1 - \frac{\sum_{k=1}^{N_t}\big(y_k - f(X_k)\big)^2}{\sum_{k=1}^{N_t}\big(y_k - \bar{y}\big)^2}, \tag{22}$$

$$\bar{y} = \frac{1}{N_t}\sum_{k=1}^{N_t} y_k, \tag{23}$$

where $X_k$ is the kth input vector for the prediction, $f(X_k)$ is the prediction value corresponding to $X_k$, $y_k$ is the actual value corresponding to $X_k$, $\bar{y}$ is the mean of the actual values, and $N_t$ is the number of test data points.
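The two accuracy measures can be computed as in the sketch below, which assumes the common definitions: MAPE as the mean of |actual − predicted| / |actual| expressed in percent, and R-squared as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the actual values. The sample numbers are made up and are not results from this study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Fraction of variance explained by the predictions."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape_value = mape(actual, predicted)
r2_value = r_squared(actual, predicted)
```

The two measures emphasize different things: MAPE weights every point by its relative error, whereas R-squared is dominated by the points with the largest absolute deviations, so an output spanning several orders of magnitude can score quite differently on the two.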

Comparison of RVM and ANN Modelling Results
To illustrate the advantages of the proposed RVM model, its prediction results were compared with those of a multilayer feed-forward neural network trained with backpropagation. Since the multilayer feed-forward neural network is a well-known universal estimator [22] and many studies on diesel engine performance modelling [5-11, 13] were based on this configuration, its results can be considered a rather standard benchmark. A neural network with one hidden layer was built based on the same training data sets used for RVM modelling. The neural network consists of 3 input neurons, 20 hidden neurons, and 4 output neurons. The number of hidden neurons was determined by trial and error, varying the number of hidden neurons between 3 and 30. This burden demonstrates the ineffectiveness of the ANN approach.
The activation function used inside the hidden layer was the Tan-Sigmoid transfer function, while a pure linear filter was employed for the output layer. The Levenberg-Marquardt algorithm was used as the training algorithm. The learning rate of the weight update was set to 0.05. Figure 7 depicts the architecture of the neural network.
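The structure of such a network can be sketched as a single forward pass in NumPy. The sketch below is only an illustration of the 3-20-4 layout with a hyperbolic tangent hidden layer (the Tan-Sigmoid function is mathematically equal to tanh) and a linear output layer; the weights are random placeholders rather than trained values, and the Levenberg-Marquardt training itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 20, 4

# Random placeholder weights; a trained network would replace these.
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(x):
    """One forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2

# Total number of adjustable parameters in this layout.
n_params = W1.size + b1.size + W2.size + b2.size

# Normalized inputs: [speed, load, coolant temperature].
out = forward(np.array([0.0, -0.5, 1.0]))
```

This layout has 164 adjustable weights and biases, which is many more than the 18 training sets available and helps explain why data scarcity is a concern for the ANN.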
The same test sets were also chosen so that the RVM and ANN models can be compared fairly. The prediction accuracy of each output in the ANN model is illustrated in Figures 8, 9, 10, and 11 and Table 5.
Tables 4 and 5 show that the RVM outperforms the ANN by about 36.45% in terms of average MAPE under the same test sets. The relatively high training MAPE of the ANN shows that the data sets are not sufficient for building such a highly nonlinear model. Furthermore, only one initial value σ of the basis width is required by RVM, while the learning rate, number of hidden layers, and number of hidden neurons are required in ANN, which means that a grid of guessed values for these parameters has to be prepared. The MAPEs of both RVM and ANN for predicting the mass concentration of particulates are relatively large compared with those of the other output parameters. This is because the mass concentration ranges from 0 to 12 × 10^4 μg/m^3. The RVM model tries to fit a function over the whole range rather than focusing on the low end of the range, which is reflected in its R-squared value of 0.97. In contrast, the ANN model tends to concentrate on the low end of the range. As a result, the R-squared value for the ANN model is only 0.02, which is unacceptable. Overall, the prediction accuracy of RVM with a small amount of training data is satisfactory.

Conclusions
This research is the first attempt at applying RVM to model the diesel engine performance and the emission characteristics of NOx and particulate matter under the condition of rare data. Although the combustion process of the diesel engine is unknown, the RVM model has successfully captured the relation between the controllable factors, which are the engine speed, engine load, and coolant temperature, and the output variables, including the brake-specific fuel consumption, brake thermal efficiency, NOx emission, and particulate mass concentration. Experimental results show that the RVM model is still acceptable even if the data sets are few. It is believed that more data sets can improve the accuracy of the model. Furthermore, the RVM model is also compared with an ANN model. The results indicate that the average accuracy of the RVM model is higher than that of the ANN model by about 36.45%, implying that RVM is superior to ANN.
With the proposed RVM model, experimental efforts can be reduced significantly as the performance and emissions of the diesel engine can be predicted easily. By applying this RVM model as a virtual sensor on diesel vehicles, the exhaust emissions can be controlled more effectively by combining it with advanced control algorithms, such as model predictive control. The study of model predictive diesel emission control based on the RVM model will be considered as future work. Since RVM can also perform online model update, the applications of RVM to online system modelling and online control will also be explored in the future.

Figure 3: Comparison between RVM predicted values and the corresponding actual values for BSFC.

Figure 4: Comparison between RVM predicted values and the corresponding actual values for BTE.

Figure 5: Comparison between RVM predicted values and the corresponding actual values for NOx concentration.

Figure 6: Comparison between RVM predicted values and the corresponding actual values for PM mass concentration.

Figure 7: Architecture of the neural network.

Figure 8: Comparison between ANN predicted values and the corresponding actual values for BSFC.
Figure 9: Comparison between ANN predicted values and the corresponding actual values for BTE.

Figure 10: Comparison between ANN predicted values and the corresponding actual values for NOx concentration.
Figure 11: Comparison between ANN predicted values and the corresponding actual values for PM mass concentration.

Table 2: Experimental dataset for model training and validation.

Table 3: Data set assignment (T and X refer to training sets and test sets, respectively).

Table 4 summarizes the training MAPE, the MAPE over the test data sets, and the fraction of variance for each output parameter of the RVM model.

Table 4: Results of the RVM models.

Table 5: Results of the ANN model.