Detection of Adulteration of Panax Notoginseng Powder by Terahertz Technology

Terahertz time-domain spectroscopy (THz-TDS) combined with chemometrics is used to detect the adulteration of Panax notoginseng powder with similar-looking substances. Four kinds of samples are prepared: pure Panax notoginseng powder and three kinds of adulterated samples, namely Panax notoginseng powder adulterated with zedoary turmeric powder, with wheat flour, and with rice flour. The adulterant concentration ranges from 5% to 60% in steps of 5%. The modeling and prediction sets are divided 3:1 by class. Feature information is extracted by uninformative variable elimination (UVE) and the successive projections algorithm (SPA); combined with a back propagation neural network (BPNN), the UVE-BPNN and SPA-BPNN qualitative models are established, respectively. The results show that the UVE-BPNN model is better; its classification accuracy on the prediction set is 95%. Then, the least squares support vector machine (LS-SVM) and partial least squares (PLS) algorithms are used to establish quantitative analysis models. For zedoary turmeric powder and wheat flour, the LS-SVM model is better: the correlation coefficient of prediction (Rp) is 0.90 and 0.93, respectively, and the root mean square error of prediction (RMSEP) is 0.072 and 0.068, respectively. For rice flour, the PLS model is better, with an Rp of 0.94 and an RMSEP of 0.06. The results show that THz-TDS combined with chemometrics can determine the adulteration of Panax notoginseng powder with similar substances quickly, accurately, and nondestructively.


Introduction
Panax notoginseng, mainly produced in southwest China, is an essential medicinal material. Modern pharmacological research has shown that Panax notoginseng has hemostatic, antihypertensive, antithrombotic, and neuroprotective effects, for which it is world-famous [1,2]. Some merchants mix cheaper powders of similar color into Panax notoginseng powder to pass off inferior goods as genuine; wheat flour, rice flour, and zedoary turmeric powder, which resemble Panax notoginseng in appearance and physical properties, are the most common adulterants. It is almost impossible for consumers to judge the purity of the powder by eye.
At present, the primary methods for quality detection of Panax notoginseng are high-performance liquid chromatography, spectrometric methods, character identification, and so on. Yang et al. [3] used high-performance liquid chromatography to analyze the chemical composition and active component content of 215 samples of Panax notoginseng with different specifications, different plant parts, and different geographical origins. Li et al. [4] used near-infrared spectroscopy to detect the polysaccharide content in Panax notoginseng. Li et al. [5] used fluorescence spectroscopy to determine qualitatively whether counterfeit products had been added to Panax notoginseng powder. Shen et al. [2] used laser-induced breakdown spectroscopy (LIBS) to detect six nutrient elements in Panax notoginseng samples from 8 producing areas with high precision, and established PLS and LS-SVM quantitative analysis models. Meng et al. [6] photographed Panax notoginseng and its counterfeits under an electron microscope and studied the identification of their microscopic characteristics. Although high-performance liquid chromatography has high accuracy and can analyze the chemical components in Panax notoginseng powder, it is complicated and difficult to operate. The equipment for microscopic identification is expensive, and testing is slow. In addition, near-infrared spectroscopy has poor anti-interference ability and low sensitivity. The fluorescence spectroscopy method addresses adulteration only qualitatively and does not quantify it, and the LIBS method has only been used to explore the production area of Panax notoginseng. Therefore, THz-TDS is proposed to determine whether Panax notoginseng powder is adulterated and at what concentration.
Compared with other detection technologies, THz-TDS is widely used in food safety detection because it is nondestructive, easy to operate, precise, and fast [7-9].
At present there is no accurate and rapid technology for detecting Panax notoginseng adulterated with the various kinds of starch found on the market. THz-TDS is therefore proposed to detect, qualitatively and quantitatively, the adulteration of Panax notoginseng powder with rice flour, wheat flour, and zedoary turmeric powder at different concentrations. Combined with chemometric methods, the optimal qualitative and quantitative models for adulteration of Panax notoginseng powder are established, which provide a theoretical basis and experimental reference for detecting adulterated Panax notoginseng on the market.

Sample Preparation.
In this experiment, Panax notoginseng is bought from WENSHAN KANG MILLION AGRICULTURAL DEVELOPMENT CO., LTD., and four types of samples are prepared: Class I is pure Panax notoginseng powder, Class II is Panax notoginseng powder adulterated with zedoary turmeric powder, Class III is Panax notoginseng powder adulterated with wheat flour, and Class IV is Panax notoginseng powder adulterated with rice flour. A total of 360 adulterated samples are prepared, with adulterant concentrations ranging from 5% to 60% in steps of 5%. The concentrations and the number of samples at each concentration are shown in Table 1.
All adulterated samples are mixed evenly and dried in a dryer to remove moisture. Then, a hydraulic press is used to press the powder under a pressure of 10 MPa for 1 minute to prepare tablets. The tablets are round, with a diameter of about 13 mm and a thickness of about 0.8-1.1 mm. The spectral data of the samples are collected with a TAS7400TS THz-TDS instrument developed by Advantest Corporation of Japan.

Variable Selection Method.
The SPA is a forward variable selection method whose goal is to reduce collinearity between variables effectively [10]. Its principle is to obtain the subset of variables with minimum collinearity by simple projection operations in vector space. First, the maximum number of variables to select is set, and a starting vector is chosen in the M-dimensional space (M is the number of original variables). Then, the vector with the largest projection onto the subspace orthogonal to the current vector is chosen as the new starting vector, and the step is repeated until the required number of variables is reached.

The UVE is a method based on the stability analysis of the regression coefficients of the PLS model [11]. It is designed to eliminate variables that carry no useful information in the original spectral data. During the UVE procedure, a group of random variables with the same dimension as the spectral matrix is generated artificially as a reference. The stability value and a threshold are used to evaluate the reliability of each variable; variables whose absolute stability is below the threshold are deleted [8]. The stability value S is defined as follows:

$$S_i = \frac{\mathrm{mean}(b_i)}{\mathrm{std}(b_i)}, \quad i = 1, 2, \ldots, m$$

where S_i is the stability value of the i-th variable of the model, b_i is the regression coefficient of the i-th variable in the model, mean(b_i) and std(b_i) are the mean and standard deviation of b_i, respectively, and m is the number of input variables.
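The two selection rules described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`uve_stability`, `uve_select`, `spa_chain`) are hypothetical, the coefficient matrix is assumed to come from repeatedly refitted (for example leave-one-out) PLS models, and the UVE threshold is assumed to be the largest absolute stability among the artificial noise variables, which is one common choice.

```python
import numpy as np

def uve_stability(coefs):
    """Stability S_i = mean(b_i) / std(b_i) for every variable.

    coefs has shape (n_models, n_variables): one row of regression
    coefficients per refitted model (assumed, e.g. leave-one-out PLS).
    """
    coefs = np.asarray(coefs, dtype=float)
    return coefs.mean(axis=0) / coefs.std(axis=0, ddof=1)

def uve_select(coefs, n_real):
    """Keep real variables whose |S_i| exceeds the noise-derived threshold.

    The first n_real columns are spectral variables; the remaining
    columns are the artificial random variables used as a reference.
    """
    s = uve_stability(coefs)
    threshold = np.abs(s[n_real:]).max()  # assumed threshold rule
    kept = np.flatnonzero(np.abs(s[:n_real]) > threshold)
    return kept, threshold

def spa_chain(X, start, n_select):
    """Successive projections: greedily pick columns with low collinearity."""
    Xp = np.array(X, dtype=float)  # working copy; columns are projected in place
    selected = [start]
    for _ in range(n_select - 1):
        ref = Xp[:, selected[-1]]
        # Project all columns onto the subspace orthogonal to the last pick.
        Xp = Xp - np.outer(ref, ref @ Xp) / (ref @ ref)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never pick the same column twice
        selected.append(int(np.argmax(norms)))
    return selected
```

In practice `spa_chain` would be run from several starting columns and for several subset sizes, and the subset giving the lowest validation RMSE would be retained.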

Modeling Method.
The BPNN is a nonlinear multilayer feed-forward neural network consisting of an input layer, a hidden layer, and an output layer [12]. Figure 1 shows the topology of the BPNN; each circle in Figure 1 represents a neuron. Spectral data enter at the input layer, where they are standardized, weighted, and passed to the hidden layer. The predicted value of the BPNN is obtained at the output layer and compared with the expected value. If the error is larger than the expected value, the error is propagated backward, and the thresholds and weights are adjusted until the error is less than or equal to the expected value.
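The forward pass and backward error propagation described above can be sketched as a small single-hidden-layer network in NumPy. This is an illustrative sketch only: the sigmoid activations, learning rate, stopping tolerance, and the names `train_bpnn` and `predict_bpnn` are assumptions, not settings reported in this paper (only the hidden-layer size of 5 appears later in the text).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, Y, n_hidden=5, lr=0.5, max_epochs=5000, target_mse=1e-3, seed=0):
    """Train a one-hidden-layer network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    history = []
    for _ in range(max_epochs):
        # Forward pass: input -> hidden -> output.
        H = sigmoid(X @ W1 + b1)
        out = sigmoid(H @ W2 + b2)
        err = out - Y
        mse = float(np.mean(err ** 2))
        history.append(mse)
        if mse <= target_mse:  # stop once the error is small enough
            break
        # Backward pass: propagate the error and adjust weights and thresholds.
        # Constant factors of the MSE gradient are folded into the learning rate.
        d_out = err * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out) / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / len(X)
        b1 -= lr * d_hid.mean(axis=0)
    return (W1, b1, W2, b2), history

def predict_bpnn(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
```

For classification, `Y` would hold one-hot class labels, and the predicted class is the output node with the largest value.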
The LS-SVM algorithm, an improved version of the SVM algorithm, can deal with both linear and nonlinear multivariate calibration problems.
There are three main types of kernel function for LS-SVM: the linear kernel, the polynomial kernel, and the radial basis kernel. Compared with the linear and polynomial kernels, the radial basis kernel better reduces the computational complexity of the training process. It also handles the nonlinear relationship between the spectral data and the true values.

The PLS is a multivariate calibration algorithm built on the characteristics of principal component analysis and multiple regression [13]. The performance of PLS is evaluated in terms of accuracy and linearity. Its accuracy can be assessed by the root mean square error of prediction (RMSEP) [14]. The PLS decomposition is as follows:

$$X = T P^{T} + E_X, \qquad Y = U Q^{T} + E_Y$$

where T and U are the score (characteristic) matrices of the spectral matrix X and the concentration matrix Y, respectively, P and Q are the loading matrices of X and Y, and E_X and E_Y are the fitting residuals of X and Y. Then, the linear regression between U and T is carried out:

$$U = T B$$

where B is the regression coefficient matrix. Finally, the predicted concentration is obtained by the following formula:

$$\hat{Y} = T_{\mathrm{new}} B Q^{T}$$

where T_new is the score matrix of the samples to be predicted.

Partial least squares discriminant analysis (PLS-DA) is a linear discriminant analysis algorithm based on PLS regression. It is a supervised method for classification and explains the maximum difference between defined sample groups.

... absorption strength. In this paper, the absorption coefficient in 0.5-2 THz is selected for modeling.

Feature Information Extraction.
When the UVE method is used to select the spectral variables, a group of random noise variables is introduced, equal in number to the spectral variables. The result of UVE is shown in Figure 3, with the spectral variables on the left and the computer-generated random noise on the right. The ordinate is the stability index; the larger its absolute value, the greater the variable's impact on the model. The abscissa is the index of the spectral and random noise variables. The two dashed lines are the thresholds selected by UVE: variables outside the two threshold lines are retained, and variables between them are eliminated. After selection, 80 variables are obtained and input into the BPNN to establish a model. The SPA algorithm selects the spectral variables in 0.5-2.0 THz, and the number of selected variables is set from 10 to 100. As shown in Figure 4(a), when the number of variables exceeds 53, the RMSE remains small and almost constant, so the 53 variables selected by SPA are appropriate. The distribution of the variables selected by SPA is shown in Figure 4(b).

BPNN Model.
The feature information selected by UVE is imported into the BPNN. After several adjustments, the optimal number of hidden layer nodes in the BPNN model is 5. The classification threshold is 0.5, and the four categories are Panax notoginseng powder, Panax notoginseng powder adulterated with zedoary turmeric powder, Panax notoginseng powder adulterated with wheat flour, and Panax notoginseng powder adulterated with rice flour. The adulterated samples are all powders with concentrations ranging from 5% to 60%. As shown in Figure 5(a), at the seventh generation of BPNN training the error reaches a local extremum with an RMSE of 0.2323, at which point the model achieves its best performance. The prediction set results of the BPNN model are shown in Figure 5(b). All prediction samples of pure Panax notoginseng powder are classified correctly, and a small number of samples in the other classes are misclassified. Among the prediction samples adulterated with zedoary turmeric powder, 1 sample is misjudged as adulterated with wheat flour and 1 as adulterated with rice flour. Among the prediction samples adulterated with wheat flour, 3 samples are misjudged as adulterated with zedoary turmeric powder. Among the prediction samples adulterated with rice flour, 1 sample is misjudged as adulterated with wheat flour. The overall accuracy of the model on the prediction set is 95%.
The results show that THz-TDS with UVE-BPNN can identify similar substances adulterating Panax notoginseng powder.

The variables selected by SPA are also input into the BPNN. As shown in Figure 6(a), at the tenth generation of BPNN training the error reaches a local extremum with an RMSE of 0.1218. The prediction set results of this model are shown in Figure 6(b), where a small number of samples in every class are misclassified. Among the prediction samples adulterated with zedoary turmeric powder, 2 samples are misjudged as adulterated with rice flour. Among the prediction samples adulterated with wheat flour, 1 sample is misjudged as adulterated with rice flour and 1 as adulterated with zedoary turmeric powder. Among the prediction samples adulterated with rice flour, 1 sample is misjudged as adulterated with wheat flour and 2 as adulterated with zedoary turmeric powder. Among the prediction samples of pure Panax notoginseng powder, 2 samples are misjudged as adulterated with wheat flour. The overall accuracy of the model on the prediction set is 92.5%.

Figure 7 shows how the average absorption coefficients of samples with different adulterant mass fractions change with frequency in 0.5-2.0 THz.
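As a worked check on the two reported accuracies, the misclassification counts given for the UVE-BPNN and SPA-BPNN models can be arranged into confusion matrices. The sketch below assumes 30 prediction samples per class (120 in total); this figure is not stated in the text and is inferred from the 360 adulterated samples, the 3:1 split, and the reported percentages, so it should be read as an assumption.

```python
# Rows are true classes I-IV (pure, zedoary, wheat, rice); columns are predictions.
# ASSUMPTION: 30 prediction samples per class, inferred rather than reported.
uve_bpnn = [
    [30, 0, 0, 0],   # pure: all correct
    [0, 28, 1, 1],   # zedoary: 1 -> wheat, 1 -> rice
    [0, 3, 27, 0],   # wheat: 3 -> zedoary
    [0, 0, 1, 29],   # rice: 1 -> wheat
]
spa_bpnn = [
    [28, 0, 2, 0],   # pure: 2 -> wheat
    [0, 28, 0, 2],   # zedoary: 2 -> rice
    [0, 1, 28, 1],   # wheat: 1 -> zedoary, 1 -> rice
    [0, 2, 1, 27],   # rice: 2 -> zedoary, 1 -> wheat
]

def accuracy(confusion):
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

print(f"UVE-BPNN accuracy: {accuracy(uve_bpnn):.3f}")
print(f"SPA-BPNN accuracy: {accuracy(spa_bpnn):.3f}")
```

Under this assumption, 114 of 120 and 111 of 120 samples are classified correctly, which matches the reported 95% and 92.5%.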

LS-SVM Model.
In this paper, the LS-SVM algorithm is used to analyze quantitatively the concentrations of the three types of adulterant. The adulterant concentrations of the samples range from 5% to 60%. For each adulteration type, the samples are randomly divided into a modeling set and a prediction set at a ratio of 3:1, and quantitative analysis models of the adulteration of Panax notoginseng powder are established with the radial basis function (RBF) kernel and the linear (LIN) kernel. Figure 8(a) shows the scatter plot of the LS-SVM prediction model for Panax notoginseng powder adulterated with zedoary turmeric powder. The accuracy of the two kernel functions is very close: with the RBF kernel, Rp is 0.9015 and RMSEP is 0.0723; with the linear kernel, Rp is 0.9012 and RMSEP is 0.0724. Figures 8(b) and 8(c) show the scatter plots of the LS-SVM prediction models for Panax notoginseng powder adulterated with wheat flour and with rice flour, respectively. From Figure 8(b), the RBF kernel gives the higher prediction accuracy, with an Rp of 0.9306 and an RMSEP of 0.0677. From Figure 8(c), the RBF kernel gives a slightly better prediction result, with an Rp of 0.9363 and an RMSEP of 0.0619. The prediction results of the LS-SVM quantitative models are shown in Table 2. It can be seen from the table that for Panax notoginseng powder adulterated with zedoary turmeric powder, the result is better with the LIN kernel. For Panax notoginseng powder adulterated with wheat flour or rice flour, the best results are obtained with the RBF kernel.
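For readers unfamiliar with LS-SVM regression, the following NumPy sketch shows the standard formulation, in which training reduces to solving one linear system, with either an RBF or a linear kernel. The function names, the regularization parameter `gamma`, and the kernel width `sigma2` are illustrative assumptions; the paper does not report its parameter values, and RBF width conventions differ between toolboxes.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel exp(-||a - b||^2 / (2 * sigma2)); other conventions exist."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def linear_kernel(A, B):
    return A @ B.T

def lssvm_fit(K, y, gamma):
    """Solve the LS-SVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(K_new, b, alpha):
    """K_new holds kernel values between new samples (rows) and training samples."""
    return K_new @ alpha + b
```

A typical use is `K = rbf_kernel(X_train, X_train, sigma2)`, then `b, alpha = lssvm_fit(K, y_train, gamma)`, and finally `lssvm_predict(rbf_kernel(X_test, X_train, sigma2), b, alpha)`; replacing `rbf_kernel` with `linear_kernel` gives the LIN variant.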

PLS Model.
In establishing the PLS quantitative model, it is crucial to select the appropriate number of principal factors. If the number of principal factors is too large, more useless information is retained, and the accuracy of the model suffers; if it is too small, some critical spectral information is ignored, and the accuracy of the model also suffers [15,16]. The root mean square error of calibration (RMSEC), RMSEP, the correlation coefficient of calibration (Rc), and Rp are used to assess the model. The absorption coefficient spectra of the samples are input into PLS to establish the quantitative analysis model. As shown in Figure 9(a), the optimal number of principal factors is 3; as shown in Figure 9(b), the results of the PLS model for Panax notoginseng powder adulterated with zedoary turmeric powder are poor, with an Rp of 0.6328 and an RMSEP of 0.13241. As shown in Figure 9(c), the optimal number of principal factors is 8; as shown in Figure 9(d), the results of the PLS model for Panax notoginseng powder adulterated with rice flour are good, with an Rp of 0.9424 and an RMSEP of 0.0601. As shown in Figure 9(e), the optimal number of principal factors is 9; as shown in Figure 9(f), the results of the PLS model for Panax notoginseng powder adulterated with wheat flour are good, with an Rp of 0.9047 and an RMSEP of 0.0771. Table 3 shows the results of the PLS quantitative analysis models for the three types of adulteration.
The results show that PLS is feasible for the quantitative identification of Panax notoginseng powder adulterated with rice flour and wheat flour: the correlation coefficients of the modeling set and prediction set of these PLS models are above 0.9. However, the detection of Panax notoginseng powder adulterated with zedoary turmeric powder gives poor results: the correlation coefficients of the modeling set and the prediction set are low, and the RMSE values are large.
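The role of the number of principal factors can be illustrated with a compact PLS1 (single response) regression written with the NIPALS deflation scheme. This is a generic textbook sketch rather than the software used in this study, and the names `pls1_fit`, `pls1_predict`, and `rmse` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """PLS1 by NIPALS; returns regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean  # centered residual matrices
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f
        w /= np.linalg.norm(w)      # weight vector
        t = E @ w                   # scores
        tt = t @ t
        p = E.T @ t / tt            # X loadings
        c = f @ t / tt              # y loading
        E = E - np.outer(t, p)      # deflate X
        f = f - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def pls1_predict(X, coef, intercept):
    return X @ coef + intercept

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

To choose the number of factors, one would fit the model for a range of `n_factors` values and keep the value with the lowest RMSE on held-out samples, which is the kind of curve that Figures 9(a), 9(c), and 9(e) report.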

Model Evaluation
In qualitative analysis, the classification accuracy on the prediction set is used to evaluate the model. BPNN models combined with UVE and with SPA are established, respectively, and both achieve high classification accuracy. The UVE-BPNN model, whose classification accuracy is 0.95, is the better of the two. In quantitative analysis, the model is evaluated by its R and RMSE: the higher the R and the smaller the RMSE, the higher the accuracy of the model, and the closer RMSEC is to RMSEP, the better the stability of the model. The quantitative analysis models of the adulteration of Panax notoginseng powder are established by LS-SVM and PLS. For Panax notoginseng powder adulterated with zedoary turmeric powder, the best result is obtained by LS-SVM with the LIN kernel; the Rp is 0.9015, and the RMSEP is 0.0723. For Panax notoginseng powder adulterated with wheat flour, the best result is obtained by LS-SVM with the RBF kernel; the Rp is 0.9315, and the RMSEP is 0.0677. For Panax notoginseng powder adulterated with rice flour, the best result is obtained by PLS; the Rp is 0.9424, and the RMSEP is 0.0601.
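The evaluation criteria above can be written down in a few lines of plain Python. The sketch assumes that R is the Pearson correlation coefficient between reference and predicted concentrations, which is the usual convention but is not defined explicitly in the text; the helper names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed definition of R)."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_pred))
    return cov / (st * sp)

def evaluate(y_cal, pred_cal, y_val, pred_val):
    """Collect Rc, Rp, RMSEC, RMSEP and the RMSEC-RMSEP gap (stability)."""
    rmsec, rmsep = rmse(y_cal, pred_cal), rmse(y_val, pred_val)
    return {
        "Rc": pearson_r(y_cal, pred_cal),
        "Rp": pearson_r(y_val, pred_val),
        "RMSEC": rmsec,
        "RMSEP": rmsep,
        "gap": abs(rmsec - rmsep),  # smaller gap suggests a more stable model
    }
```

Concentrations are passed as fractions (for example 0.05 to 0.60), so the RMSE values are on the same scale as those reported in the text.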

Summary
The qualitative and quantitative analysis of the adulteration of Panax notoginseng powder with similar substances is conducted based on THz-TDS. Comparing the spectra of samples with the same adulterant at different concentrations, and the spectra of samples with different adulterants, shows that the THz spectra differ significantly both across concentrations of the same adulterant and across types of adulterant. In the qualitative analysis of the three types of adulteration, UVE and SPA are adopted to extract feature information, and BPNN qualitative analysis models are then established, respectively. The results show that the classification accuracy of the UVE-BPNN qualitative model on the prediction set is 95%, and that of the SPA-BPNN qualitative model is 92.5%.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.