Nonlinear Regression with High-Dimensional Space Mapping for Blood Component Spectral Quantitative Analysis

Accurate and fast determination of blood component concentration is essential for the efficient diagnosis of patients. This paper proposes a nonlinear regression method with high-dimensional space mapping for blood component spectral quantitative analysis. Kernels are introduced to map the input data into a high-dimensional space for nonlinear regression. The Gaussian kernel, the best-known kernel, is the one usually adopted by researchers. More kernels need to be studied, because each kernel defines its own high-dimensional feature space mapping, which affects regression performance. In this paper, eight kernels are used to examine the influence of different space mappings on blood component spectral quantitative analysis. Each kernel and its corresponding parameters are assessed to build the optimal regression model. The proposed method is applied to real blood spectral data obtained from uric acid determination. The results show that the prediction errors of the proposed nonlinear models are lower than those obtained by linear models. Support vector regression (SVR) provides better performance than partial least squares (PLS) when combined with kernels. Local kernels are recommended according to the features of the blood spectral data. SVR with the inverse multiquadric kernel has the best predictive performance and can be used for blood component spectral quantitative analysis.


Introduction
The concentration of a component in human blood may be an indicator of certain diseases. Fast and accurate determination is essential to the early diagnosis of these diseases. For instance, the serum uric acid (UA) level can be used as an indicator for the detection of diseases related to purine metabolism [1][2][3] and leukemia pneumonia [4][5][6]. Various analytical methods have been developed for the determination of UA. These include electrochemical and chemiluminescence methods [7,8], high-performance liquid chromatography [9,10], and spectroscopic quantitative analysis [11,12]. Because spectroscopic quantitative analysis requires only a small sample with easy preparation, it has attracted more attention as a method for blood analysis [13,14].
In a spectroscopic quantitative analysis, when radiation hits a sample, the incident radiation may be absorbed, and the relative contribution of the absorption spectrum depends on the chemical composition and physical parameters of the sample. A spectrometer is used to collect a continuous absorption spectrum. The concentration of the component can then be predicted by a regression algorithm [15][16][17]. Partial least squares (PLS) regression and support vector regression (SVR) models have been applied [18][19][20]. PLS focuses on finding the wavelengths that have the closest relationship with the concentration. SVR operates on the structural risk minimization (SRM) principle and the Vapnik-Chervonenkis (VC) theory [21,22]. SVR uses the SRM principle instead of the traditional empirical risk minimization (ERM), which gives the model good generalization ability. The widely used linear regression models may not hold in practice because of restrictions on the spectral data [23,24]. The spectral data collected by a low-precision system typically exhibit nonlinearity. Moreover, a high concentration of a blood component may be beyond the linear range of optical determination [25], so samples with higher concentrations have to be diluted to meet the linearity requirement. The kernel method can be introduced to overcome this restriction [26]. Using the kernel method, the input vectors are mapped into a higher-dimensional feature space, which turns nonlinear problems into linear or approximately linear problems. A kernel can be any function that meets Mercer's condition. Kernel-based regression methods have been reported for spectral quantitative determination. A comparison of SVR with the Gaussian kernel and four other nonlinear models is presented by Balabin and Lomakina [27]. The performance of SVR with the Gaussian kernel and of the PLS model for fruit quality evaluation is compared by Malegori et al. [28]. The evaluation of SVR with Gaussian, polynomial, sigmoid, and linear kernels, and of PLS, for biodiesel content determination is discussed by Alves and Poppi [29]. The Gaussian kernel, the best-known kernel, is the one adopted by most researchers (three more traditional kernels are discussed by Alves and Poppi). Different kernels define different high-dimensional feature space mappings, which affects regression performance [30,31]. More kernels need to be examined to evaluate the high-dimensional space mapping. Moreover, PLS can be extended to nonlinear regression and compared with SVR using the same kernel.
Nonlinear regression with high-dimensional space mapping for blood component spectral quantitative analysis is discussed in this paper. Kernels are incorporated into PLS and SVR to realize nonlinear regression in the original input space. The kernel extension of PLS and SVR is completed by replacing the dot product calculation of elements with the kernel. Eight kernels are used in this paper to examine the influence of different space mappings on blood component spectral quantitative analysis. Each kernel and its corresponding parameters are assessed to build the optimal nonlinear regression model. The dataset obtained from spectral measurement of uric acid concentration is used to evaluate the effectiveness of the proposed method. The experimental results are analyzed, and the mean squared error of prediction (MSEP) is used to compare the predictive capability of the various models.
This article is organized as follows. The methods are introduced in Section 2. The experimental process is explained in Section 3. The results are analyzed in Section 4. Finally, Section 5 concludes the paper.

The Methods
where the latent vectors T and U are linear combinations of input and output variables.

2.2. SVR.
For SVR, a linear regression can be performed between the matrix of wavelength signals X and corresponding blood component concentration Y: Y = ωX + B, where ω is the matrix of weight coefficients and B is a bias vector.
According to the Lagrange multiplier method and the Karush-Kuhn-Tucker (KKT) conditions, $\omega = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, x_i$, where $x_i$ is a variable of matrix X, $\alpha_i$ and $\alpha_i^*$ are the corresponding Lagrange coefficients, and l is the number of samples. For an input variable x, the linear regression equation can be written as $Y = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)\, \langle x_i, x \rangle + B$.
2.3. High-Dimensional Mapping. The regression ability of a linear model can be enhanced by mapping the input data into a high-dimensional space. By using the kernel method, the algorithm realizes a prediction in the high-dimensional feature space without an explicit mapping of the original space. A kernel is a function of two elements in the original space that equals their dot product in the feature space. A kernel extension of a linear algorithm can be completed by replacing the dot product calculation of elements with the kernel.
The kernel extensions of PLS and SVR are introduced next.
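The statement that a kernel evaluated in the original space equals a dot product in the feature space can be checked numerically on a small case. The sketch below uses a homogeneous degree-2 polynomial kernel on two-dimensional inputs, chosen only because its feature map is easy to write out; it is an illustrative assumption and not one of the kernel forms given in the paper. The function names are hypothetical.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, y):
    """Degree-2 homogeneous polynomial kernel, evaluated in the original space."""
    return dot(x, y) ** 2

def poly2_feature_map(x):
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, y)                                  # no mapping needed
k_explicit = dot(poly2_feature_map(x), poly2_feature_map(y))     # mapped, then dot product
print(k_implicit, k_explicit)
```

Both values agree up to floating-point rounding, which is why a linear algorithm written in terms of dot products can be moved to the feature space by swapping in the kernel.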
Kernel PLS is a nonlinear extension of PLS. A nonlinear mapping $\Phi: x \in R^N \rightarrow \Phi(x) \in F$ is used to transform the original data into a feature space. When a linear PLS regression is constructed in this feature space, a nonlinear PLS regression is obtained for the original input data. The kernel Gram matrix K can be calculated in the following form: $K = \Phi\Phi^T$. The component concentration regression model comes out as $\hat{Y} = \Phi_v \Phi^T U (T^T K U)^{-1} T^T Y = K_v U (T^T K U)^{-1} T^T Y$, where $\hat{Y}$ and Y are the output variables of the validation set and the calibration set, $\Phi_v$ is the matrix of validation variable feature space mapping, the latent vectors T and U are linear combinations of input and output variables, and $K_v$ is the matrix composed of $K_{ij} = K(x_i, x_j)$, where $x_i$ and $x_j$ are input variables of the validation set and the calibration set. The nonlinear regression is determined once the kernel function is selected. For the kernel-extended SVR, the concentration of the component is calculated by the regression function $Y = \Phi_v \Phi^T (\alpha - \alpha^*) + B = K_v (\alpha - \alpha^*) + B$, where $(\alpha - \alpha^*)$ is the vector of Lagrange coefficient differences $\alpha_i - \alpha_i^*$, Y is the output variable of the validation set, $\Phi_v$ is the matrix of validation variable feature space mapping, and $K_v$ is the matrix composed of $K_{ij} = K(x_i, x_j)$, where $x_i$ and $x_j$ are input variables of the validation set and the calibration set.
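As a minimal sketch of how the kernel-extended SVR regression function is evaluated once training is done, the code below sums kernel values between a new input and the calibration inputs, weighted by the differences of the Lagrange coefficients, and adds the bias. The coefficient values, the bias, and the Gaussian form with parameter `gamma` are made-up illustrative assumptions; solving for the coefficients is not shown.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Assumed Gaussian form exp(-gamma * ||x - y||^2), with gamma standing for 1/(2*sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_svr_predict(x_new, calib_x, dual_coef, bias, kernel):
    """Evaluate sum_i (alpha_i - alpha_i^*) * K(x_i, x_new) + bias.

    dual_coef[i] holds the difference alpha_i - alpha_i^* for calibration input calib_x[i].
    """
    return sum(c * kernel(x_i, x_new) for x_i, c in zip(calib_x, dual_coef)) + bias

# Hypothetical values, for illustration only.
calib_x = [(0.0,), (1.0,)]
dual_coef = [0.5, -0.25]
bias = 0.1
kernel = lambda a, b: gaussian_kernel(a, b, gamma=1.0)

prediction = kernel_svr_predict((0.0,), calib_x, dual_coef, bias, kernel)
print(prediction)
```

Only kernel evaluations appear in the prediction, so the feature space mapping itself never has to be computed.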
This completes the kernel extension of SVR. The kernel determines the features of the high-dimensional space mapping and affects the regression performance. To build the optimal nonlinear regression model, different kernels should be evaluated in combination with PLS and SVR. The kernels [32] used in the experiments are the following:
(1) Linear kernel: The linear kernel has no parameter. KPLS reduces to PLS, and SVR reduces to LinearSVR, when the linear kernel is adopted.
(3) Polynomial kernel: The kernel parameter d is the degree.
The prediction performance of high-dimensional mapping by the kernels introduced and the related parameter optimization will be discussed in the next section.
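A few of the kernels named above can be sketched as plain functions, together with the Gram matrix built from them. The functional forms below are commonly used definitions and are assumptions here, since the exact expressions are taken from [32] and are not reproduced in this text; in particular, the offset of 1 in the polynomial kernel and the use of `gamma` for 1/(2σ²) are illustrative choices. The sample spectra are made-up three-point vectors.

```python
import math

def _dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def _sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def linear_kernel(x, y):
    """Dot product in the original space; no kernel parameter."""
    return _dot(x, y)

def gaussian_kernel(x, y, gamma):
    """Assumed form exp(-gamma * ||x - y||^2); gamma stands for 1/(2*sigma^2)."""
    return math.exp(-gamma * _sq_dist(x, y))

def polynomial_kernel(x, y, d):
    """Assumed form (x.y + 1)^d; the offset of 1 is an assumption."""
    return (_dot(x, y) + 1.0) ** d

def inverse_multiquadric_kernel(x, y, c2):
    """Assumed form 1 / sqrt(||x - y||^2 + c^2); c2 is the parameter c^2."""
    return 1.0 / math.sqrt(_sq_dist(x, y) + c2)

def gram_matrix(rows_a, rows_b, kernel, **params):
    """Matrix of kernel values K[i][j] = kernel(rows_a[i], rows_b[j])."""
    return [[kernel(a, b, **params) for b in rows_b] for a in rows_a]

# Hypothetical toy "spectra" with three signals each.
spectra = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (0.9, 0.8, 0.7)]
K = gram_matrix(spectra, spectra, gaussian_kernel, gamma=0.5)
for row in K:
    print([round(v, 4) for v in row])
```

With the Gaussian form, the kernel value shrinks as two inputs move apart, which is the behavior the paper later refers to as a local kernel.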

Experimental
3.1. Dataset. To evaluate the effectiveness of nonlinear regression with high-dimensional space mapping for blood component spectral quantitative analysis, the UA dataset is used in the experiment.
The 200 samples were obtained from a uric acid concentration spectral determination experiment. Each spectrum has 601 signals from 400 nm to 700 nm at a 0.5 nm interval. The UA concentrations range from 105 to 1100 μmol/L. A spectrum of the UA data is shown in Figure 1.
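The number of signals per spectrum follows directly from the wavelength range and interval. A short check, with variable names chosen here for illustration:

```python
start_nm, stop_nm, step_nm = 400.0, 700.0, 0.5

# Both endpoints are included, hence the +1.
n_points = int(round((stop_nm - start_nm) / step_nm)) + 1
wavelengths = [start_nm + i * step_nm for i in range(n_points)]

print(n_points, wavelengths[0], wavelengths[-1])
```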

3.2. Experimental Procedure.
To assess the prediction effect of nonlinear regression with high-dimensional space mapping for blood component spectral quantitative analysis, the linear, Gaussian, polynomial, inverse multiquadric, semilocal, exponential, rational, and Kmod kernels are combined with PLS (abbreviated as PLS, GKPLS, PKPLS, IMKPLS, SLKPLS, EKPLS, RKPLS, and KKPLS) and with SVR (abbreviated as LinearSVR, GSVR, PSVR, IMSVR, SLSVR, ESVR, RSVR, and KSVR) to build prediction models for the uric acid dataset, and the effectiveness of these models is evaluated.
For the experiments, the dataset is split into a calibration set and a validation set based on the shutter grouping strategy. One sample out of every five is selected for the validation set, and the remaining samples go into the calibration set. Of the 200 samples in total, 40 are used as the validation set and the remaining 160 as the calibration set. The calibration set is used for building the prediction model, and the validation set is used for evaluating the effectiveness of the model. Both the spectral signals and the reference UA concentrations of the two sets are normalized according to the values of the calibration set.
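The split and normalization described above can be sketched as follows. Two details are assumptions, since the text does not state them: which position within each group of five goes to the validation set (the last one is used here), and the type of normalization (zero mean and unit standard deviation is used here). The function names are hypothetical.

```python
from statistics import mean, pstdev

def interleaved_split(n_samples, every=5):
    """Send one sample out of every `every` to validation, the rest to calibration.

    Assumption: the last sample of each group of `every` is the validation sample.
    """
    validation = [i for i in range(n_samples) if i % every == every - 1]
    calibration = [i for i in range(n_samples) if i % every != every - 1]
    return calibration, validation

def fit_scaler(calibration_values):
    """Estimate mean and standard deviation from the calibration set only."""
    mu = mean(calibration_values)
    sd = pstdev(calibration_values)
    return mu, (sd if sd > 0 else 1.0)

def apply_scaler(values, mu, sd):
    """Apply calibration-set statistics to any set (calibration or validation)."""
    return [(v - mu) / sd for v in values]

cal_idx, val_idx = interleaved_split(200)
print(len(cal_idx), len(val_idx))

# Hypothetical concentrations, used only to show that the validation set
# is scaled with statistics taken from the calibration set.
mu, sd = fit_scaler([1.0, 2.0, 3.0])
print(apply_scaler([2.0, 4.0], mu, sd))
```

Taking the statistics from the calibration set alone keeps information about the validation samples out of model building.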
To compare the prediction effect of the different kernels, the kernel parameter and the related model parameters are optimized. The search range for the kernel parameters $1/(2\sigma^2)$ and $c^2$ is $[2^{-8}, 2^{8}]$ in steps of $2^{0.5}$ for the Gaussian, semi-local, exponential, inverse multiquadric, rational, and Kmod kernels. The search range for the kernel parameter d is [1,5] in steps of 1 for the polynomial kernel. For kernel PLS, the search range is [1,30] in steps of 1 for the number of latent variables $N_{lv}$. For SVR, the search ranges are $[2^{-4}, 2^{10}]$ in steps of $2^{0.5}$ for the penalty parameter C and $[2^{-8}, 2^{-1}]$ in steps of 2 for the nonsensitive loss ε.
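The search grids above can be generated as powers of two. The sketch below reads the steps as multiplicative factors (a step of $2^{0.5}$ in value is a step of 0.5 in the exponent), which is an interpretation of the text; the helper name `log2_grid` is hypothetical.

```python
def log2_grid(lo_exp, hi_exp, step_exp):
    """Values 2**e for e from lo_exp to hi_exp inclusive, in exponent steps of step_exp."""
    n = int(round((hi_exp - lo_exp) / step_exp)) + 1
    return [2.0 ** (lo_exp + i * step_exp) for i in range(n)]

kernel_param_grid = log2_grid(-8, 8, 0.5)   # 1/(2*sigma^2) or c^2
degree_grid = list(range(1, 6))             # polynomial degree d
n_lv_grid = list(range(1, 31))              # number of latent variables for kernel PLS
c_grid = log2_grid(-4, 10, 0.5)             # SVR penalty parameter C
epsilon_grid = log2_grid(-8, -1, 1)         # SVR nonsensitive loss

print(len(kernel_param_grid), len(c_grid), len(epsilon_grid))
```

Under this reading, each kernel parameter grid has 33 values, the C grid has 29, and the ε grid has 8.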
Grid search based on cross-validation is used for parameter optimization. Different combinations of the parameters are tested for each kernel on the calibration set using the 10-fold cross-validation method. In 10-fold cross-validation, the data are divided into 10 groups; 9 groups are used as the training data, and the remaining group is used as the test data. The test group is changed in each round until all the groups have been tested. The cross-validation is then repeated 10 times, and the 10 results are averaged as the final prediction. For each kernel, the combination of parameters with the minimum MSECV is adopted to build the regression model.
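The procedure above can be sketched with a generic k-fold loop and an exhaustive search over a parameter grid. To keep the example self-contained, a one-parameter ridge regression through the origin stands in for KPLS or SVR; that stand-in model, the synthetic data, and all function names are assumptions for illustration. The sketch runs a single pass of k-fold cross-validation; repeating it with different seeds and averaging would be a thin wrapper around `msecv`.

```python
import itertools
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def msecv(xs, ys, fit_predict, params, k=10, seed=0):
    """Mean squared error of cross-validation, averaged over the k folds."""
    fold_errors = []
    for test_idx in kfold_indices(len(ys), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(ys)) if i not in held_out]
        preds = fit_predict([xs[i] for i in train_idx], [ys[i] for i in train_idx],
                            [xs[i] for i in test_idx], params)
        sq = sum((p - ys[i]) ** 2 for p, i in zip(preds, test_idx))
        fold_errors.append(sq / len(test_idx))
    return sum(fold_errors) / len(fold_errors)

def grid_search(xs, ys, fit_predict, grid, k=10):
    """Try every parameter combination and keep the one with the lowest MSECV."""
    names = sorted(grid)
    best_score, best_params = None, None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = msecv(xs, ys, fit_predict, params, k)
        if best_score is None or score < best_score:
            best_score, best_params = score, params
    return best_score, best_params

def ridge_through_origin(train_x, train_y, test_x, params):
    """Stand-in model (not from the paper): y = w * x with ridge penalty params['lam']."""
    w = sum(x * y for x, y in zip(train_x, train_y)) / (
        sum(x * x for x in train_x) + params["lam"])
    return [w * x for x in test_x]

# Synthetic, noise-free data so that the unpenalized fit is exact.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x for x in xs]
best_score, best_params = grid_search(xs, ys, ridge_through_origin,
                                      {"lam": [0.0, 1.0, 10.0]})
print(best_score, best_params)
```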
The MSEP for the validation set, the squared correlation coefficient for the validation set ($R^2_p$), the MSECV for the calibration set, and the cross-validation correlation coefficient calculated by 10-fold cross-validation for the calibration set ($R^2_{cv}$) are used to assess and compare the predictive ability of the various models.
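The two kinds of figures of merit can be written down in a few lines. The sketch assumes that the squared correlation coefficient is the square of the Pearson correlation between reference and predicted concentrations, which the text does not spell out; the numbers are made up.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of squared differences; MSEP on the validation set, MSECV under cross-validation."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_correlation(y_true, y_pred):
    """Square of the Pearson correlation coefficient (assumed definition of R^2 here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)

# Hypothetical reference and predicted values.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mean_squared_error(reference, predicted))
print(squared_correlation(reference, predicted))
```

The two measures are complementary: a prediction that is perfectly correlated with the reference but offset or scaled still has a squared correlation of 1 while its mean squared error is nonzero.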
In the next section, the influence of the parameters on the MSECV for each kernel introduced above is discussed. The prediction capability of each kernel is then evaluated on the validation data.

Results and Discussion
For each kernel, the curves of parameter optimization for KPLS and SVR are shown in Figures 2 and 3. The analytical results for the UA dataset are summarized and arranged in order of MSEP in Table 1.
The influence of $N_{lv}$ and the kernel parameter on the MSECV for KPLS is shown in Figures 2(b)-2(h). Figure 2(a) is the plot for the linear kernel (PLS); because the linear kernel has no kernel parameter, it shows only the relationship between $N_{lv}$ and MSECV. Figure 2(a) shows that the MSECV of PLS decreases rapidly at first, reaches its minimum when $N_{lv}$ equals 10, and then increases. Figures 2(b), 2(e), and 2(f) show the curves of the Gaussian, semi-local, and exponential kernels. The three kernels share the parameter σ, and their curves have similar features. $N_{lv}$ has the dominant influence on MSECV when σ is small, and the best MSECV is achieved in this region. When $1/(2\sigma^2)$ is close to 1, MSECV grows quickly as σ increases and then becomes stable once it exceeds a certain value. Figure 2(c) shows that the MSECV increases steadily as the degree parameter d of the polynomial kernel varies from 1 to 5. As with PLS, the minimum MSECV is achieved when $N_{lv}$ is set to 10. Figures 2(d), 2(g), and 2(h) show the curves of the inverse multiquadric, rational, and Kmod kernels, which share the kernel parameter $c^2$. The MSECV decreases rapidly as $c^2$ increases while $c^2$ is smaller than 1. When $c^2$ is large, $N_{lv}$ has the major impact on MSECV.
The influence of the penalty parameter C and the kernel parameter on the MSECV, at the optimal ε, for the different kernels combined with SVR is shown in Figures 3(b)-3(h). For the linear kernel (LinearSVR), Figure 3(a) shows the relationship among MSECV, C, and ε. The MSECV decreases as C increases, reaches its minimum when C reaches 32, and then increases. The curves of the Gaussian, semi-local, and exponential kernels (Figures 3(b), 3(e), and 3(f)) are broadly similar. The MSECV at first decreases quickly as C decreases. When $1/(2\sigma^2)$ is close to 1, the MSECV begins to level off, and C has only a minor impact on it. In Figure 3(c), the MSECV grows rapidly as the polynomial kernel parameter d rises, while the penalty constant C changes the MSECV little. Figures 3(d), 3(g), and 3(h) show that the MSECV of the inverse multiquadric, rational, and Kmod kernels generally decreases as the kernel parameter $c^2$ increases. The MSECV of the Kmod kernel rises slightly at first and then drops. As C increases, the MSECV generally decreases. Unlike the other two kernels, the rational kernel shows no significant effect of C on the MSECV. The analytical results for the UA dataset are summarized in Table 1. MSEP and $R^2_p$ on the validation set and MSECV and $R^2_{cv}$ on the calibration set are presented, together with the optimized parameters of the models.
For KPLS, SLKPLS achieves the most accurate prediction, with the lowest MSEP (1880.18) and the highest $R^2_p$. It is followed by GKPLS (MSEP is 2347.11) and then by IMKPLS, RKPLS, KKPLS, EKPLS, and PKPLS.
The linear kernel (PLS) produces the worst prediction performance, with an MSEP of 8554.57. For SVR, IMSVR has the best predictive capability, with an MSEP of 1523.42, followed by RSVR (MSEP is 1528.66), KSVR (MSEP is 1530.01), GSVR (MSEP is 2021.93), SLSVR (MSEP is 2359.86), ESVR (MSEP is 2971.75), PSVR (MSEP is 5518.49), and LinearSVR (MSEP is 5519.22). The traditional linear regression algorithms clearly do not perform well on blood component spectral quantitative analysis: PLS has the highest MSEP, followed by LinearSVR. IMSVR exhibits the best performance on the validation set. The MSEP values of the IMSVR are 0.34%, 0.44%, 18.97%, 24.65%, 35.09%, 35.44%, 35.59%, For both PLS and SVR, the optimized kernel parameter d is 1, which makes the polynomial kernel act the same as the linear kernel. This explains why the polynomial kernel has almost the same MSEP, $R^2_p$, MSECV, and $R^2_{cv}$ as the linear kernel in this experiment. The ranking of kernels based on $R^2_p$ is broadly similar to that based on MSEP. As global kernels, the polynomial and linear kernels allow data points that are far away from the test point to have an influence on the kernel values.
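The percentages listed for IMSVR appear to be relative reductions in MSEP with respect to the other models, taken in order of increasing MSEP. This reading is an interpretation rather than something the text states. A short check using only the MSEP values quoted in this section:

```python
# MSEP values as quoted in the text; models whose MSEP is not quoted are left out.
msep = {
    "IMSVR": 1523.42,
    "RSVR": 1528.66,
    "KSVR": 1530.01,
    "SLKPLS": 1880.18,
    "GSVR": 2021.93,
    "GKPLS": 2347.11,
    "SLSVR": 2359.86,
}

def relative_reduction_percent(best, other):
    """Percentage by which `best` is lower than `other`."""
    return (other - best) / other * 100.0

reduction = {
    name: relative_reduction_percent(msep["IMSVR"], value)
    for name, value in msep.items()
    if name != "IMSVR"
}
for name, value in sorted(reduction.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}%")
```

Under this reading the computed values agree with the listed percentages to within about 0.01 percentage points, with small differences in the last digit that could come from rounding of the quoted MSEP values.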
The other kernels used in the paper are local kernels, for which only data points that are close to the test point have an influence on the kernel values. The good extrapolation abilities shown by the local kernels indicate that only some specific spectral data are essential to the blood component concentration prediction. The contribution of these critical data is enhanced during high-dimensional mapping by local kernels. Based on the above studies, IMSVR is recommended for nonlinear regression in blood component spectral quantitative analysis. The optimal kernel parameter $c^2$ is 64, the penalty parameter is set to 256, and the nonsensitive loss is set to 0.003906.

Conclusions
In this paper, high-dimensional space mapping methods that combine kernels with PLS and SVR are proposed for blood component spectral quantitative analysis. For each model, the general trend of MSECV with respect to the model parameters is discussed. The following conclusions can be drawn. First, the blood component spectral quantitative results show that the prediction errors of the nonlinear regression models are lower than those obtained by linear models. Second, SVR provides better performance than PLS when combined with kernels. Third, local kernels are recommended for high-dimensional mapping according to the features of the blood spectral data. Finally, the experimental results verify that IMSVR (a local kernel combined with SVR) has the highest predictive ability and can be used effectively for blood component spectral quantitative analysis.
mapping that is introduced to complete the nonlinear regression. x is an input variable of wavelength signals, and ω and B play the same role as in SVR. Define the kernel function: $K = \Phi\Phi^T$. The component concentration regression model can be expressed as the following expression:

Figure 1: Spectra of the UA dataset.

Figure 2: The influence of $N_{lv}$ and kernel parameter on the MSECV of different kernels for KPLS on the UA dataset. (a) The influence of $N_{lv}$ on the MSECV of PLS. (b-h) The $N_{lv}$ and kernel parameter curves of GKPLS, PKPLS, IMKPLS, SLKPLS, EKPLS, RKPLS, and KKPLS.

Figure 3: The influence of penalty constant and kernel parameter on the MSECV of different kernels for SVR on the UA dataset. (a) The influence of penalty constant and nonsensitive loss on the MSECV of LinearSVR. (b-h) The penalty constant and kernel parameter curves of GSVR, PSVR, IMSVR, SLSVR, ESVR, RSVR, and KSVR.

Table 1: Analytical results for UA dataset. MSEP: mean squared error of prediction; MSECV: mean squared error of cross-validation; $R^2_p$: prediction correlation coefficient; $R^2_{cv}$: cross-validation correlation coefficient. a: penalty parameter (C) for SVR, number of latent variables ($N_{lv}$) for KPLS; b: nonsensitive loss (ε) for SVR.