Metal oxide sensors are the most widely used sensors in electronic nose devices because of their high sensitivity, long lifetime, and low cost. However, these sensors suffer from a lack of response stability, which makes electronic nose systems unsuitable for industrial applications. The instabilities are caused in particular by an incomplete recovery process, which produces gradual drifts in the sensor responses. This paper focuses on a signal processing method that combines baseline manipulation with the orthogonal signal correction technique in order to reduce the impact of drift on the sensor outputs. The proposed signal processing is explored using experimental data obtained from a gas sensor array responding to various concentrations of pine essential oil vapors. The Partial Least Squares method is then applied to the corrected dataset to establish a regression model for estimating gas concentration. In this work, we show how our drift correction approach significantly improves the stability of the regression model while maintaining good accuracy.
Gardner defines the electronic nose (E-nose) as “an instrument, which comprises an array of electronic chemical sensors with partial specificity and an appropriate pattern-recognition system, capable of recognizing simple or complex odors” [
Many correction methods have been investigated to improve the stability of sensor responses; they follow either univariate or multivariate approaches. In univariate techniques, the correction is applied to each sensor individually. Among these methods the baseline manipulation, largely used in industry [
In the case of E-nose measurements, since the drift effects are correlated, multivariate methods capture more information from all the sensors and thus allow more complex or nonlinear drift effects to be modeled [
When gas sensors are exposed to the same gas under the same sampling and environmental conditions, any changes in the sensor response are related essentially to drift. In order to reduce the variance of this drift, which follows one direction, multivariate linear correction methods based on partial least squares (PLS) or principal component analysis (PCA) constitute the best approach [
Orthogonal signal correction (OSC) was first proposed by Wold for NIR spectra correction [
In this study, we combine baseline manipulation with the OSC technique in order to remove the drift effects on a MOX sensor array. PLS is then used to model the behavior of our gas sensors responding to different concentrations of pine essential oil (EO) vapors diluted in pure air. This combined correction approach gives a good quantification of essential oil vapors while using only a few PLS components in the model.
The data used in this work were obtained from home-made experimental equipment composed mainly of a gas sensor cell and an EO vapor diffuser (Figure
(a) Schematic representation of the experimental set-up and (b) typical temporal response of a MOX gas sensor during the exposure and regeneration phases.
According to our previous study [
Sensor outputs are digitized, filtered, and then recorded every second as sensor conductance values. We chose to express the sensor response in conductance rather than resistance because this parameter is more efficient when gas concentrations must be identified with n-type semiconductor metal oxide sensors [
Forty measurements were carried out for each of the nine EO concentrations, which were selected randomly throughout the experiments. Each sensor output is characterized by 425 recorded points (75 points during gas exposure, 350 points during the recovery process). Data are arranged in a dataset formed from 2975 columns
To illustrate the instability of the temporal responses, we plotted on the same axes the signals at 1, 2, 3, and 4% EO concentration for each sensor (Figure
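The column count follows from the figures given above: 7 sensors with 425 points each give 7 × 425 = 2975 columns per measurement. The sketch below shows one way such a row could be assembled. The sensor-by-sensor concatenation order and the names `flatten_measurement` and `sensor_slice` are assumptions made for illustration; the text does not state how the columns are ordered.

```python
N_SENSORS = 7
POINTS_EXPOSURE = 75
POINTS_RECOVERY = 350
POINTS_PER_SENSOR = POINTS_EXPOSURE + POINTS_RECOVERY  # 425 points per sensor


def flatten_measurement(traces):
    """Concatenate the per-sensor conductance traces of one measurement into one row.

    Assumption: columns are ordered sensor by sensor (hypothetical layout).
    """
    if len(traces) != N_SENSORS:
        raise ValueError("expected one trace per sensor")
    row = []
    for trace in traces:
        if len(trace) != POINTS_PER_SENSOR:
            raise ValueError("each trace must contain 425 points")
        row.extend(trace)
    return row


def sensor_slice(sensor_index):
    """Column range occupied by one sensor inside a flattened row."""
    start = sensor_index * POINTS_PER_SENSOR
    return slice(start, start + POINTS_PER_SENSOR)


# Placeholder measurement: seven constant traces, only to check the layout
example_row = flatten_measurement([[float(i)] * POINTS_PER_SENSOR for i in range(N_SENSORS)])
```

With this layout, `example_row[sensor_slice(2)]` returns the 425 points of the third sensor.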
Temporal responses at 1 to 4% EO concentrations for each of the seven gas sensors (TGS880, TGS822, TGS2620, MQ3, MQ138, SP31, and SPAQ1).
Illustration of the TGS2620 gas sensor instability in terms of its initial conductance during successive measurements by varying the EO concentrations.
This elementary comparison confirms the inefficiency of the sensor recovery process, which causes noticeable drifts. These drifts will be even greater in the case of real-time, continuous measurement with an E-nose.
We have grouped all the sensor responses
The reliability of E-nose results depends strongly on how the sensor outputs are processed, particularly to minimize noise and drift effects (shown in the previous section). Signal processing therefore plays a key role in E-nose performance, and many studies have already addressed this subject.
Baseline manipulations are very often cited in the literature to remove the drift effects on sensor responses [
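As an illustration of a fractional baseline manipulation, the sketch below divides the deviation from a reference conductance by that reference, using as reference the sensor conductance at the end of the cleaning phase. The function name `fractional_baseline` and the sample traces are hypothetical, and taking the first sample of each trace as the reference is an assumption made only for this example.

```python
def fractional_baseline(conductance, reference):
    """Return (G - G_ref) / G_ref for every sample of one sensor trace."""
    if reference == 0:
        raise ValueError("reference conductance must be non-zero")
    return [(g - reference) / reference for g in conductance]


# Hypothetical traces: same gas response, but the second one starts
# from a baseline that has drifted 10% higher.
trace_a = [1.00, 1.50, 2.00, 1.20]
trace_b = [1.10, 1.65, 2.20, 1.32]

# Assumption: the first sample stands in for the conductance measured
# at the end of the preceding cleaning phase.
corrected_a = fractional_baseline(trace_a, reference=trace_a[0])
corrected_b = fractional_baseline(trace_b, reference=trace_b[0])
```

In this example the two corrected traces coincide, because the drift was purely multiplicative. Drift that is not proportional to the baseline would not be fully removed by this step alone.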
For further development of the drift correction, the dataset named
Prior to applying the regression modeling, we followed the baseline manipulation with the OSC technique to reduce the drift effects on the sensor signals more efficiently. We show that this correction technique improves the calibration process, making it reliable and stable.
The main objective of the OSC technique is to remove the variance which is not correlated to the variation of concentration
The algorithm for OSC [
(1) Use Principal Component Analysis (PCA) to decompose
(2) Orthogonalize the first score where
(3) Calculate weight vectors
(4) Calculate the new score
(5) Compute the loading vector
(6) The new corrected matrix
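The steps listed above can be sketched in NumPy for a single OSC component. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `osc_one_component` is hypothetical, the weight vector is obtained with an ordinary least-squares solve (Wold's original procedure uses a PLS regression at this step), and the data are synthetic.

```python
import numpy as np


def osc_one_component(X, y, n_iter=20, tol=1e-10):
    """Remove from X one component that is orthogonal to y (rows = measurements)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Step 1: first PCA score of the centered data.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    t = U[:, 0] * s[0]

    for _ in range(n_iter):
        # Step 2: orthogonalize the score against y.
        t_new = t - yc * (yc @ t) / (yc @ yc)
        # Step 3: weights w such that Xc @ w approximates t_new
        # (assumption: least squares instead of an inner PLS model).
        w, *_ = np.linalg.lstsq(Xc, t_new, rcond=1e-10)
        # Step 4: new score computed from the data.
        t_next = Xc @ w
        converged = np.linalg.norm(t_next - t) <= tol * np.linalg.norm(t_next)
        t = t_next
        if converged:
            break

    # Step 5: loading vector of the OSC component.
    p = Xc.T @ t / (t @ t)
    # Step 6: corrected (centered) matrix.
    X_corr = Xc - np.outer(t, p)
    return X_corr, t, w, p


# Synthetic example: a concentration signal plus a slow drift along measurement order.
rng = np.random.default_rng(0)
n_obs, n_vars = 30, 80
y = rng.permutation(np.repeat(np.arange(1.0, 7.0), 5))
drift = np.linspace(0.0, 1.0, n_obs)
X = (np.outer(y, rng.normal(size=n_vars))
     + 5.0 * np.outer(drift, rng.normal(size=n_vars))
     + 0.05 * rng.normal(size=(n_obs, n_vars)))

X_corr, t, w, p = osc_one_component(X, y)
```

Because the removed score `t` is orthogonal to `y`, the correction subtracts variance from `X` without touching the part that varies with concentration. With many more columns than rows, as in this dataset, an unrestricted least-squares fit can overfit, which is one reason a limited number of PLS components is normally used at step 3.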
In order to test the benefits of the OSC technique, the new dataset (
For each gas sensor, applying the OSC technique makes the responses at the same EO concentration more similar. Figure
Temporal responses at 1 to 4% EO concentrations after using OSC technique for each of the seven gas sensors (TGS880, TGS822, TGS2620, MQ3, MQ138, SP31, and SPAQ1).
For better perception of the OSC impact on gas quantification, we have plotted in Figure
PCA score plots: (a) before OSC correction and (b) after OSC correction.
As the predictors in our dataset are highly correlated and their number (the number of columns) is very large compared with the number of observations (the number of rows), a multiple linear regression (MLR) model is not suitable because of the existing multicollinearity [
We calibrated our E-nose using PLS regression as the recognition method. To this end, the dataset is divided into 12 segments of 33 observations each: one segment is used as test data to evaluate the performance of the calibration model, while the other 11 segments are used as learning data. To obtain a reliable cross-validation indicator, this operation is performed 12 times, with a different segment serving as test data each time, so that each of the 396 observations is predicted once. The RMSE is calculated by averaging the twelve RMSEs.
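The segmented cross validation described above can be sketched as follows. The PLS1 routine is a compact NIPALS-style implementation written for this example, and the names `pls1_fit`, `pls1_predict`, and `segmented_cv_rmse` are hypothetical. Shuffling the observations before forming the 12 segments is an assumption, since the text does not say how the segments were composed, and the data are synthetic.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model and return (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q_a = yr @ t / tt
        Xr = Xr - np.outer(t, p)   # deflate X
        yr = yr - q_a * t          # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


def segmented_cv_rmse(X, y, n_components, n_segments=12, seed=0):
    """Average the per-segment RMSE; every observation is predicted exactly once."""
    order = np.random.default_rng(seed).permutation(len(y))  # assumption: random segments
    folds = np.array_split(order, n_segments)
    rmses = []
    for test_idx in folds:
        train_idx = np.setdiff1d(order, test_idx)
        model = pls1_fit(X[train_idx], y[train_idx], n_components)
        err = pls1_predict(model, X[test_idx]) - y[test_idx]
        rmses.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(rmses)), folds


# Synthetic stand-in for the dataset: 396 observations, 9 concentration levels.
rng = np.random.default_rng(1)
y_demo = np.repeat(np.arange(1.0, 10.0), 44)
X_demo = np.outer(y_demo, rng.normal(size=20)) + 0.1 * rng.normal(size=(396, 20))
total_rmse, folds = segmented_cv_rmse(X_demo, y_demo, n_components=1)
```

Calling `segmented_cv_rmse` for 1 to 7 components would give one averaged RMSE per model size, which is the quantity compared between PLS and OSC + PLS.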
In Figure
Plot of RMSE versus number of PLS or OSC + PLS components.
To investigate the stability of the pattern, we compare the variability of the regression coefficients obtained by PLS and by OSC + PLS. Each sensor output is characterized by 425 points; hence the number of regression coefficients is (425
The coefficient of variation (CV) is calculated for all the regression coefficients in both cases (PLS analysis and OSC + PLS analysis) over seven cycles. In the first cycle, we built a model using one component, and we added one more component in each cycle until all 7 components were used in the final cycle. The dataset was divided into 12 subsets, so in each cycle the regression coefficients and the RMSE were calculated 12 times. At the end of each cycle, the CV of each coefficient is calculated as the ratio of its standard deviation to its mean value, and TOTAL_RMSE as the average of the twelve RMSEs. These results are presented in Tables
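The CV and TOTAL_RMSE computations for one cycle can be sketched as below. The names `coefficient_of_variation` and `summarize_cycle` are hypothetical, the numbers are placeholders rather than measured values, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used. The CV keeps the sign of the mean, so a coefficient with a negative mean yields a negative CV.

```python
import statistics


def coefficient_of_variation(values):
    """Signed CV: standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(values) / mean


def summarize_cycle(coefs_per_fold, rmse_per_fold):
    """Return the CV of each coefficient across folds and the averaged RMSE."""
    cvs = [coefficient_of_variation(column) for column in zip(*coefs_per_fold)]
    total_rmse = statistics.mean(rmse_per_fold)
    return cvs, total_rmse


# Placeholder values for three folds and two coefficients (not measured data).
coefs_per_fold = [[1.0, -1.0], [2.0, -2.0], [3.0, -3.0]]
rmse_per_fold = [0.2, 0.3, 0.4]
cvs, total_rmse = summarize_cycle(coefs_per_fold, rmse_per_fold)
```

In the study, each cycle would pass 12 coefficient vectors and 12 RMSE values, one per subset.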
CV values of regression coefficient and TOTAL_RMSE along with number of OSC + PLS components.
| CV / Number of variables | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
|  |  | 0.0476 | 0.0470 | 0.0474 | 0.0474 | 0.0470 | 2.5036 |
|  |  | 0.0135 | 0.0208 | 0.0231 | 0.0299 | 0.0352 | 2.5372 |
|  |  | 0.0051 | 0.0141 | 0.0124 | 0.0060 | 0.0072 | −2.5057 |
|  |  | 0.0113 | 0.0105 | 0.0110 | 0.0099 | 0.0093 | 2.6227 |
|  |  | 0.0096 | 0.0316 | 0.2056 |  |  | 2.4622 |
|  |  | 0.0198 | 0.0248 | 0.0242 | 0.0179 | 0.0139 | −2.5424 |
|  |  | 0.0096 | 0.0127 | 0.0144 | 0.0379 | 0.0394 | −2.4941 |
|  |  | 0.0051 | 0.0122 | 0.0228 | 0.0572 | 0.0348 | −2.5388 |
| TOTAL_RMSE |  | 0.2449 | 0.2447 | 0.2446 | 0.2446 | 0.2446 | 0.494 |
CV values of regression coefficient and TOTAL_RMSE along with number of PLS components.
| CV / Number of variables | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
|  | −0.0176 | 0.0674 | 0.1342 | 0.0479 | 0.1913 | 0.0839 |  |
|  | 0.0030 | 0.2424 | 0.3032 | 0.0754 | 0.0282 | 0.01317 |  |
|  | 0.0023 | 0.0614 | 0.0403 | 0.02821 | 0.1543 | −0.4366 | − |
|  | 0.0028 | 0.0456 | 0.4243 | −0.2263 | 0.0503 | 0.04381 |  |
|  | 0.0026 | −0.8605 | 2.1887 | 0.1037 | 9.1080 | 0.0993 |  |
|  | 0.0019 | −0.0188 | −0.0463 | −0.02986 | −0.0257 | −0.01194 | − |
|  | 0.0033 | −0.1652 | −0.3465 | 0.06188 | 0.3475 | −0.1358 | − |
|  | 0.0024 | −0.0341 | −0.2107 | −0.0621 | −0.0203 | −0.0167 | − |
| TOTAL_RMSE | 0.4442 | 0.3725 | 0.3569 | 0.30796 | 0.2522 | 0.24428 |  |
As we can see, in the case of OSC + PLS the CVs of the regression coefficient are approximately 10 times lower than those obtained in the case of applying only PLS. Moreover, to obtain the best result for TOTAL_RMSE, we need 1 component if applying OSC and 7 components without OSC.
Consequently, we chose the model with one component for OSC + PLS and the model with 7 components for PLS because they give the best results, and also because they have approximately the same RMSE, which allows us to compare the stability of the regression coefficients independently. This statement is confirmed in Figure
Boxplots of the distributions of the regression coefficient values for the OSC + PLS and PLS models.
For better comparison of regression coefficients stability we have plotted on the same figure the absolute value of CV; Figure
Magnitude of CV for the regression coefficients of OSC + PLS and PLS model.
The main challenge in the E-nose field lies in sensor signal processing, particularly in correcting gas sensor drift effects. As a first step, a fractional baseline correction is proposed, using a reference value corresponding to the sensor conductance taken at the end of the cleaning phase (
The authors declare that they have no conflicts of interest.