Improved Extended Multiplicative Scatter Correction Algorithm Applied in Blood Glucose Noninvasive Measurement with FT-IR Spectroscopy

In order to improve the predictive accuracy of the quantitative analysis model for human blood glucose based on Fourier transform infrared (FT-IR) spectroscopy, this paper uses a method named improved extended multiplicative scatter correction (Im-EMSC), which can effectively eliminate the scattering effects caused by the strong scattering of the human body. In this algorithm, the principal components of the differential spectra are used instead of the pure spectra of the analytes. The unwanted physical characteristics are calibrated through the shape of the principal-component curves, and the original glucose concentration information is extracted. Im-EMSC can efficiently remove most of the pathlength difference and baseline shift influences. Im-EMSC is first used as a preprocessing method, and partial least squares (PLS) regression is then adopted to establish a quantitative analysis model. The result of Im-EMSC is compared with those of two popular scattering correction algorithms, multiplicative scatter correction (MSC) and extended multiplicative scatter correction (EMSC). Experimental results show that the Im-EMSC method improves the prediction accuracy, which is helpful for noninvasive detection of human glucose concentration.


Introduction
Diabetes and its complications have been a heavy burden on society. According to the latest statistics of the International Diabetes Federation (IDF), there were 371 million individuals with diabetes worldwide in 2012 [1]. The control of blood glucose levels relies on blood glucose measurement. The traditional finger-prick way of measuring blood glucose level is painful, potentially dangerous, and expensive to operate. In the last decades, many noninvasive methods [2][3][4][5][6][7] have been studied to measure blood glucose level.
Over the last twenty years, optical methods have developed into the most powerful techniques of biomedical research and clinical application among noninvasive approaches for glucose monitoring [8]. Noninvasive glucose measurement eliminates the painful pricking experience, the risk of infection, and damage to finger tissue. The optical measurement of blood glucose is based on the amount of light absorbed by glucose in blood at the glucose absorption peaks, but the measurement accuracy is still a barrier due to the weak signal from blood and the interference of other blood components [9]. Mid-infrared (MIR) spectroscopy is one of the most promising optical approaches. In the mid-infrared region, the absorption of glucose is less influenced by other substances and the absorption peaks are narrow [10], which makes it easier to extract the glucose concentration information from the blood spectra. However, because of the strong scattering effect of the human body, a nonlinear relationship exists between glucose concentration and absorbance spectra [11,12]. In order to use the FT-IR spectroscopy technique successfully in noninvasive blood glucose measurement, the calibration model must provide a stable predictive capacity. Therefore, it is important to eliminate the impact of the strong scattering of the human body and to improve the robustness of the model. In this paper, an improved extended multiplicative scatter correction (Im-EMSC) method is used to address these problems.

Experimental
Instrument. An attenuated total reflectance (ATR) accessory linked to a Thermo Nicolet 6700 FT-IR spectrometer, produced in the United States, was used. The ATR accessory was made of ZnSe crystal. The FT-IR spectrometer is equipped with a liquid nitrogen-cooled mercury cadmium telluride detector. The scanning range is 400-4000 cm⁻¹, with 16 scans, a resolution of 4 cm⁻¹, and a gain of 1.
The reagent used is oral glucose powder produced by Peking University Third Hospital in Beijing, China.

Acquisition of FT-IR Spectra.
The experimental procedure is described here. A healthy volunteer had been fasting for 8 h before the experiment began. The measurement site was the middle finger of the right hand, which was in close contact with the cleaned ATR crystal during the experiment. The volunteer then drank 100 mL of water with 75 g of glucose within 5 minutes; subsequently, the FT-IR spectra were collected from the finger every 12 minutes after cleaning the finger. At the time of sampling, the measurement position, the measurement pressure, and the psychological state of the volunteer were kept as constant as possible. In the meantime, the corresponding blood glucose reference values were measured by the OneTouch Ultra 2 blood glucose meter produced by Johnson and Johnson Company, USA.
The experiment took 3 hours and a total of 17 FT-IR spectra were collected, including 11 spectra acquired on the first day with a glucose concentration range of 91.8-140.4 mg/dL and 6 spectra acquired on the second day with a glucose concentration range of 97.2-142.2 mg/dL.

Im-EMSC Algorithm.
Stark and Martens developed multiplicative scatter correction (MSC) into the extended multiplicative scatter correction (EMSC) in 1989 [13]. The EMSC method employs the pure spectra of the analytes and of the interference effects to improve the estimation of the optical pathlength. It is then possible to reduce or eliminate, in the preprocessing stage, the pathlength difference due to the strong scattering of the human body [14]. However, EMSC cannot be widely used because the pure spectra of the chemical constituents are often unavailable. In this paper, an improved EMSC (Im-EMSC) has been adopted. In this method, the principal components of the differential spectra are used instead of the pure spectra of the analytes of interest. Consequently, the scattering effects are corrected without any pure spectrum information.
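For orientation, classical MSC can be sketched in a few lines. This is a minimal illustration and not the implementation used in this work: each spectrum is regressed on the mean spectrum, and the fitted offset and slope are then removed. The function name `msc` and the synthetic five-point spectra are hypothetical.

```python
def msc(spectra):
    """Classical MSC: regress each spectrum on the mean spectrum,
    then subtract the fitted offset and divide by the fitted slope."""
    n_spectra = len(spectra)
    n_points = len(spectra[0])
    # Reference spectrum: point-wise mean of all spectra.
    ref = [sum(s[k] for s in spectra) / n_spectra for k in range(n_points)]
    ref_mean = sum(ref) / n_points
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for z in spectra:
        z_mean = sum(z) / n_points
        # Simple linear regression z = a + b * ref.
        b = sum((r - ref_mean) * (v - z_mean) for r, v in zip(ref, z)) / ref_ss
        a = z_mean - b * ref_mean
        corrected.append([(v - a) / b for v in z])
    return corrected, ref


# Synthetic example: one band shape distorted by an offset and a gain only.
shape = [0.1, 0.4, 1.0, 0.5, 0.2]
distortions = [(0.0, 1.0), (0.3, 1.2), (-0.1, 0.8)]  # (offset, gain), hypothetical
raw = [[a + b * v for v in shape] for a, b in distortions]
corrected, ref = msc(raw)
```

Because the synthetic spectra differ only by an offset and a gain, every corrected spectrum coincides with the mean spectrum. Real spectra also carry chemical variation, which MSC cannot separate from the gain term.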
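The infrared spectral analysis is based on the Lambert-Beer law [17]. Under ideal conditions, the absorbance spectrum $z_{i,\mathrm{chem}}$ of sample $i$ can be seen as a sum of the contributions from the $J$ chemical constituents with spectra $K = \{k_j,\ j = 1, 2, \ldots, J\}$ and concentrations $c_i = \{c_{i,j},\ j = 1, 2, \ldots, J\}$:

$$z_{i,\mathrm{chem}} = \sum_{j=1}^{J} c_{i,j}\, k_j. \tag{1}$$

In practice, there is a certain translation and rotation relationship between the measured spectra and the ideal spectra. Taking into account that the scattering coefficients are not the same at all wavelengths, the EMSC method expresses the measured spectra as follows:

$$z_i = a_i \mathbf{1} + b_i z_{i,\mathrm{chem}} + d_i \lambda + e_i \lambda^2, \tag{2}$$

where $z_i$ is a measured spectrum, $z_{i,\mathrm{chem}}$ is the ideal spectrum, $\mathbf{1}$ is a vector of ones, $\lambda$ is the wavelength, and $a_i$, $b_i$, $d_i$, and $e_i$ are scalar parameters obtained by calibration.
Substituting (1) into (2), the equation can be rewritten as follows:

$$z_i = a_i \mathbf{1} + b_i \sum_{j=1}^{J} c_{i,j}\, k_j + d_i \lambda + e_i \lambda^2. \tag{3}$$

The spectrum $z_{i,\mathrm{chem}}$ in (1) may be rewritten as a deviation from a reference spectrum $m$, which could be, for example, the average of a set of empirical spectra:

$$z_{i,\mathrm{chem}} = m + \sum_{j=1}^{J} \Delta c_{i,j}\, k_j. \tag{4}$$

In (4), $\Delta c_{i,j}$ represents the deviation of the analyte and interference concentrations from those of the reference sample.
Define the differential spectrum matrix $Z_{\mathrm{dif}}$, whose $i$th row is the difference between the $i$th spectrum and the reference spectrum:

$$Z_{\mathrm{dif}} = \begin{bmatrix} z_1 - m \\ z_2 - m \\ \vdots \\ z_I - m \end{bmatrix}. \tag{5}$$

$Z_{\mathrm{dif}}$ can be processed by principal component analysis (PCA); each loading can represent a specific factor. $Z_{\mathrm{dif}}$ is then rewritten as follows:

$$Z_{\mathrm{dif}} = \sum_{n=1}^{N} t_n\, p_n, \tag{6}$$

where $p_n$ is the $n$th principal component (a loading, written as a row vector) and $t_n$ is the column vector of its coefficients (scores), with entries $t_{i,n}$.
The number of components $N$ is selected as needed. These principal components contain not only the concentration differential information but also the different physical aspects (optical pathlength, surface state, etc.). The principal components that represent concentration information and those that represent physical characteristics are identified through the shape of the principal-component curves. The unwanted physical characteristics are calibrated according to the actual needs, and the chemical concentration information is highlighted. Replacing the deviation term of $z_{i,\mathrm{chem}}$ in (4) by the principal-component expansion of $Z_{\mathrm{dif}}$ and combining (3) and (6) gives

$$z_i = a_i \mathbf{1} + b_i m + \sum_{n=1}^{N} h_{i,n}\, p_n + d_i \lambda + e_i \lambda^2, \tag{7}$$

where $h_{i,n} = b_i t_{i,n}$. Let $M = [\mathbf{1}; m; P; \lambda; \lambda^2]$, where $P$ stacks the selected loadings $p_n$, and let $\theta_i = [a_i, b_i, h_i, d_i, e_i]$, so that $z_i = \theta_i M$. The parameters $\theta_i$ can be estimated by regressing $z_i$ on $M$, and calibration models are constructed using a multivariate linear calibration method such as PLS. The unnecessary principal components are defined as the interference factors and the remaining ones are defined as the effective factors. The corrected spectrum is obtained by subtracting the interference factors.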
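The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Matlab implementation used in this work. The leading loading is extracted by power iteration, which is a stand-in for a full PCA. The parameters are estimated by ordinary least squares through the normal equations. The correction also divides by the fitted $b_i$, as is usual in EMSC-type corrections, although the text above only mentions the subtraction. All function names, the Gaussian bands, and the distortion values are hypothetical. To keep the check exact, the toy calibration set is scatter-free and the interference shape is supplied directly; with real spectra the analyst would assign each loading by inspecting its shape.

```python
import math


def differential_spectra(spectra):
    """Return the mean spectrum m and the differential spectra z_i - m."""
    n, width = len(spectra), len(spectra[0])
    m = [sum(z[k] for z in spectra) / n for k in range(width)]
    return m, [[z[k] - m[k] for k in range(width)] for z in spectra]


def leading_loading(diff, iters=100):
    """First principal-component loading of the rows of diff (power iteration)."""
    p = max(diff, key=lambda row: sum(v * v for v in row))[:]
    for _ in range(iters):
        scores = [sum(v * w for v, w in zip(row, p)) for row in diff]
        p = [sum(t * row[k] for t, row in zip(scores, diff)) for k in range(len(p))]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
    return p


def solve(A, y):
    """Solve the square system A x = y by Gaussian elimination with pivoting."""
    n = len(A)
    aug = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def emsc_correct(z, m, effective, interference, x):
    """Fit z = a*1 + b*m + sum(h_n p_n) + d*x + e*x^2 by least squares,
    then remove offset, interference and polynomial terms and divide by b."""
    cols = [[1.0] * len(z), m] + effective + interference + [x, [v * v for v in x]]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, z)) for ci in cols]
    coef = solve(gram, rhs)
    a, b, d, e = coef[0], coef[1], coef[-2], coef[-1]
    g = coef[2 + len(effective):-2]  # coefficients of the interference loadings
    corrected = []
    for k in range(len(z)):
        unwanted = a + d * x[k] + e * x[k] ** 2
        unwanted += sum(gn * q[k] for gn, q in zip(g, interference))
        corrected.append((z[k] - unwanted) / b)
    return corrected


def band(center, width, axis):
    """Gaussian band used as a synthetic spectral shape."""
    return [math.exp(-((v - center) / width) ** 2) for v in axis]


x = [-1.0 + 0.05 * k for k in range(41)]  # wavelength axis scaled to [-1, 1]
base = band(-0.4, 0.15, x)
glucose_like = band(0.5, 0.10, x)
interferent = band(0.0, 0.08, x)
calibration = [[s + c * g for s, g in zip(base, glucose_like)] for c in (0.1, 0.3, 0.6)]
m, diff = differential_spectra(calibration)
p1 = leading_loading(diff)

ideal = [s + 0.45 * g for s, g in zip(base, glucose_like)]
measured = [0.2 + 1.5 * s + 0.4 * q - 0.1 * v + 0.05 * v * v
            for s, q, v in zip(ideal, interferent, x)]
corrected = emsc_correct(measured, m, [p1], [interferent], x)
max_error = max(abs(u - v) for u, v in zip(corrected, ideal))
```

In this idealized case the corrected spectrum reproduces the undistorted one up to rounding error, because the measured spectrum lies exactly in the span of the model terms. Scaling the wavelength axis to [-1, 1] is a design choice that keeps the normal equations well conditioned.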

Model Selection and Comparison.
Two data sets were prepared. One is the training set, consisting of the 11 spectra acquired on the first day. The other is the test set, consisting of the remaining 6 spectra acquired on the second day. The wavenumber region of 800-2000 cm⁻¹ was selected for calibration and prediction. The original spectra are shown in Figure 1.
In this paper, the raw spectra were first corrected by Im-EMSC. The corrected spectra are shown in Figure 2. All the spectra were normalized to an average estimated baseline level and an average estimated pathlength level, and the variability in the spectra became much smaller.
A PLS regression model was then constructed. Prediction results of the test set are shown in Table 1. Figure 3 gives the detailed comparison of the three preprocessing methods: Im-EMSC, EMSC, and MSC.
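The text does not state which PLS algorithm or how many latent variables were used. As a hedged illustration only, a single-response PLS regression (PLS1, NIPALS form) can be sketched as below. The function names `pls1_fit` and `pls1_predict` and the two-variable data set are hypothetical and unrelated to the measured spectra.

```python
def pls1_fit(X, y, n_components):
    """PLS1 (NIPALS): extract latent variables from mean-centered X and y."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one sample by repeating the deflation on the new sample."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


# Hypothetical data with an exact linear relation y = 3 + 2*x1 - x2.
X_train = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 7.0]]
y_train = [3.0 + 2.0 * a - b for a, b in X_train]
model = pls1_fit(X_train, y_train, n_components=2)
prediction = pls1_predict(model, [6.0, 4.0])
```

With two components on two variables, PLS1 coincides with ordinary least squares, so the toy prediction recovers 3 + 2*6 - 4 = 11. For real spectra the number of components is much smaller than the number of wavenumbers and has to be chosen, for example by cross-validation.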

Analysis of Experimental Results
The prediction results of PLS regression after preprocessing by the three methods are displayed in Table 1. RMSEP (root mean square error of prediction) denotes the predictive accuracy of the calibration model, and R denotes the correlation coefficient.
Table 1 shows that the best preprocessing method for scattering correction is Im-EMSC. Compared with the results from the original data, RMSEP decreases from 9.3 mg/dL to 8.8 mg/dL, that is, the prediction error is reduced by 5.4%. In addition, R increases from 0.86 to 0.95. The success of Im-EMSC is attributed to the fact that it does not require the pure spectra of the constituents, a requirement that limits the use of EMSC.
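The two figures of merit and the 5.4% value can be reproduced with a short sketch. It assumes that the correlation coefficient is Pearson's correlation between reference and predicted values, which the text does not state explicitly. The reference and predicted values below are hypothetical and are not the data of Table 1.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical glucose values in mg/dL.
reference = [100.0, 110.0, 120.0, 130.0]
predicted = [103.0, 107.0, 123.0, 127.0]
error = rmsep(reference, predicted)
r_value = pearson_r(reference, predicted)

# Relative reduction of RMSEP reported in the text: 9.3 mg/dL -> 8.8 mg/dL.
relative_reduction = (9.3 - 8.8) / 9.3 * 100.0
```

Every hypothetical prediction is off by 3 mg/dL, so the toy RMSEP is 3 mg/dL. The relative reduction evaluates to about 5.4%, in agreement with the value quoted above.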
Because MSC and EMSC produced overcorrection, their prediction accuracy was reduced in the experiment. In EMSC and MSC, scattering effects are assumed to be a shift in the baseline, and the average spectrum is used as a reference spectrum to eliminate this shift [18]. When the concentration range of the target analyte is large, part of the actual chemical absorption information is sometimes corrected as if it were a baseline shift. The spectra carrying smaller concentration information are overcorrected, which results in low prediction accuracy.
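A limiting case makes the overcorrection argument concrete. It is an idealization introduced here for illustration and is not taken from the experiment: assume $I$ spectra that contain no scattering at all and differ only in the concentration $c_i \neq 0$ of a single analyte with a non-constant spectrum $k$.

```latex
\begin{aligned}
z_i &= c_i\,k, \qquad m = \frac{1}{I}\sum_{i=1}^{I} z_i = \bar{c}\,k,\\[2pt]
z_i &= a_i\mathbf{1} + b_i\,m \;\;\Longrightarrow\;\; a_i = 0,\quad b_i = \frac{c_i}{\bar{c}},\\[2pt]
z_{i,\mathrm{corr}} &= \frac{z_i - a_i\mathbf{1}}{b_i} = \bar{c}\,k = m .
\end{aligned}
```

Every corrected spectrum collapses onto the mean spectrum, so the concentration information is removed together with the presumed scatter. Real data lie between this extreme and the pure-scatter case, which is why the loss grows with the concentration range.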

Conclusions
Due to the strong scattering of the human body, optical pathlength differences and baseline shifts exist in noninvasive blood glucose measurement. In order to solve these problems and increase the prediction accuracy, an appropriate method must be adopted in the preprocessing stage. Im-EMSC takes account of the wavelength effects, and the principal components are used instead of prior knowledge about the analytes. The scattering effects are corrected without any pure spectrum information. Im-EMSC is promising because it can raise the prediction capability and robustness of the model in noninvasive human blood glucose measurement using FT-IR spectroscopy.

Software. The scattering correction algorithms and all the calculations were implemented in Matlab 2011b.

Figure 1: (a) The original spectra of the training set and (b) the original spectra of the test set.

Figure 3: Distribution of prediction results with different preprocessing methods. (a) The distribution of prediction results of the original spectra, (b) the distribution of prediction results of the spectra preprocessed by Im-EMSC, (c) the distribution of prediction results of the spectra preprocessed by EMSC, and (d) the distribution of prediction results of the spectra preprocessed by MSC.

Table 1: Prediction results of the test set with different preprocessing methods.