Research on Partial Least Squares Method Based on Deep Confidence Network in Traditional Chinese Medicine

The partial least squares method has many advantages in multivariate linear regression modeling, but its internal cross-validation procedure can sharply reduce the number of principal components and thereby lower the accuracy of the regression equation, and traditional Chinese medicine (TCM) data are particularly sensitive to the selection of principal components. This paper proposes a partial least squares method based on deep belief nets. The method uses a deep learning model to extract upper-level features of the original data, feeds the extracted features into the partial least squares model for multiple linear regression, thereby evading the problem of selecting the number of principal components, and continuously adjusts the model parameters until a satisfactory accuracy condition is met. Experiments on Dachengqitang experimental data and data sets from the UCI Machine Learning Repository show that the partial least squares analysis method based on deep belief nets adapts well to TCM data.


Introduction
The mechanism of treating diseases with traditional Chinese medicine is that multiple components of TCM correspond to multiple targets of a disease, so TCM data contain many correlated variables. In the dose-effect relationship analysis of TCM prescriptions, the contents of the prescription's components are used to analyze the corresponding curative effects. To analyze the effects of the independent variables on the dependent variables, accurately selecting the number of independent variables is the key to establishing a good dose-effect relationship model and achieving a better dose-effect analysis.
Partial least squares (PLS) [1, 2] is a multivariate statistical data analysis method that integrates the basic functions of principal component analysis (PCA), canonical correlation analysis (CCA), and multiple linear regression (MLR). PLS simplifies the data structure by principal component extraction, selects the first several principal components by truncation, and uses only these principal components to obtain a model with better prediction performance. If the subsequent principal components can no longer provide meaningful information for interpreting the model, adopting too many principal components will instead reduce the precision of the regression fit. Therefore, the selection of the number of principal components is particularly important. However, TCM data contain many correlated variables, and the cross-validation method will drastically reduce the number of principal components and hence the accuracy of the regression equation. Deep belief nets (DBN) [3] are multilayer neural networks that fit the input data as closely as possible. They extract upper-level features of the original data through deep learning, avoid the problem of selecting the number of principal components, and convert the original m-dimensional input data into n-dimensional output data through the deep network; the output data are a deep nonlinear re-expression of the original input. In 2011, Zhou [4] proposed embedding a fuzzy neural network model into the iterative form of partial least squares, which avoids the problem of principal component selection while achieving a nonlinear mapping, but the model results are easily affected by the membership functions. In 2013, Qin [5] proposed a kernel partial least squares method, which combines intrinsic dimension estimation with kernel partial least squares and maps the data into a high-dimensional linear space through the kernel function.
Although that algorithm avoids the selection of principal components and reflects the nonlinear structure contained in the sample data, the selection of the kernel function is extremely difficult. In 2017, Zhu and Du [6] proposed a partial least squares method integrated with the restricted Boltzmann machine (RBM-PLS), which extracts the required upper-layer features through a neural network structure, avoiding the problem of principal component selection, but the model results are affected by the initial values.
Therefore, this paper integrates deep belief nets into partial least squares: the upper-level features of the original data are extracted through the deep learning model, the problem of selecting the number of principal components is avoided, multiple linear regression is then carried out, and the result is finally restored to a regression equation in the original variables.
Through verification on the test data, the parameters of the deep learning model and the number of hidden layers are adjusted in turn, finally improving the accuracy of the regression equation. The algorithm not only retains PLS's ability to eliminate the multiple linear correlations of traditional Chinese medicine data but can also establish a good regression model for data with a small sample size. While avoiding the problem of selecting the number of principal components, it compensates for the fact that traditional Chinese medicine data are particularly sensitive to that selection, thus establishing a regression model suited to the characteristics of traditional Chinese medicine.

Partial Least Squares
Partial least squares [7] is a multivariate statistical data analysis method. Different from traditional multivariate regression statistics, it mainly studies the regression modeling of multiple dependent variables on multiple independent variables [8]. Regression modeling can be carried out even when the number of sample points is less than the number of variables or when multiple correlations exist between the variables, and the regression coefficient of each variable remains easy to interpret. The basic idea of partial least squares regression is as follows: given a set of independent variables X = (x_1, x_2, x_3, ..., x_n) and dependent variables Y = (y_1, y_2, y_3, ..., y_n), the principal components t_1 and u_1 extracted from the independent and dependent variables are required to carry the variation information of their respective data tables to the maximum extent, var(t_1) → max and var(u_1) → max, while having the greatest possible correlation, r(t_1, u_1) → max. Multiple linear regression is then performed; if the regression equation reaches satisfactory accuracy, the algorithm terminates; otherwise, a second round of principal component extraction is carried out on the residual information, and so on until satisfactory accuracy is reached.

Deep Belief Nets
Hinton and Salakhutdinov [9] proposed in 2006 that restricted Boltzmann machines (RBMs) can be stacked and trained greedily, thus forming deep belief nets (DBN). A DBN is a probability generation model: through the weights between neurons learned during training, it makes the whole neural network generate the training data with maximum probability, thus forming high-level abstract features [10].
A DBN consists of two kinds of neurons, namely, visible layer neurons and hidden layer neurons.
The key component is the RBM; feature extraction of the input data is realized by combining multiple layers of RBMs. An RBM is composed of two layers of neurons, a visible layer and a hidden layer. The specific structure of an RBM is shown in Figure 1.
Assuming n stacked RBM layers, a DBN model that can abstractly extract features is constructed; v and h denote visible layer neurons and hidden layer neurons, respectively. The specific construction process is as follows: first, the weights and offsets of the first trained RBM are fixed, and the state of its hidden layer is taken as the input of the second RBM; the second RBM is trained and stacked on the first, and the process is repeated for the remaining layers. The final model is shown in Figure 2.
Since a DBN is formed by stacking multiple RBMs, training a DBN mainly amounts to training its RBMs. Define the number of neurons in the visible layer as m and the number of neurons in the hidden layer as n, with V the state vector of the visible layer neurons and H the state vector of the hidden layer neurons. For a given pair of state vectors, the energy of an RBM can be expressed by the following function [11]:

E(V, H) = −Σ_{i=1}^{m} a_i v_i − Σ_{j=1}^{n} b_j h_j − Σ_{i=1}^{m} Σ_{j=1}^{n} v_i W_ij h_j,

where the parameters of the model are θ = {a_i, b_j, W_ij}: a_i is the offset of visible layer neuron i, b_j is the offset of hidden layer neuron j, and W_ij is the weight between visible layer neuron i and hidden layer neuron j.
During RBM training, the values of the parameters θ are obtained through continuous iteration (the parameters are optimized by stochastic gradient descent), and the end of the iteration is determined by how well the training data are fitted.
For a trained DBN, with the parameters θ = {a_i, b_j, W_ij} known, the original data are preprocessed to obtain V = (v_1, v_2, ..., v_m) and then fed into the model, namely,

P(h_j = 1 | V) = sigmoid(b_j + Σ_{i=1}^{m} v_i W_ij).

The original data are thus mapped into hidden layer data H = (h_1, h_2, ..., h_n) through the sigmoid function. In this way, the original data are transformed into another deep expression, which achieves the purpose of extracting upper-level features.
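The forward mapping above can be sketched in a few lines of numpy. The dimensions, weights, and function name below are hypothetical, for illustration only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_hidden_features(V, W, b):
    """Map visible data V (n_samples x m) to hidden activations (n_samples x n).

    Implements P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W_ij), the forward
    pass described above; W is m x n and b holds the hidden-layer offsets.
    """
    return sigmoid(V @ W + b)

# Hypothetical dimensions: 4 samples, 6 visible units, 3 hidden units.
rng = np.random.default_rng(1)
V = rng.integers(0, 2, size=(4, 6)).astype(float)
W = 0.1 * rng.normal(size=(6, 3))
b = np.zeros(3)
H = rbm_hidden_features(V, W, b)
print(H.shape)   # (4, 3)
```

Each entry of H lies in (0, 1) and serves as the "deep expression" fed to the next layer.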

Partial Least Squares Construction Based on DBN
In the cross-validity test of PLS, extraction of the h-th component continues only while

√(PRESS_h) ≤ 0.95 √(SS_{h−1}),

where h represents the number of components, PRESS_h represents the sum of squared prediction errors of equation Y, and SS_h represents the sum of squared errors of Y fitted with h components. Obviously, the value of PRESS_h is greater than SS_h but less than SS_{h−1}. In fact, the square roots of the three differ little, making the termination condition √(PRESS_h) ≥ 0.95 √(SS_{h−1}) very easy to meet. This leads to a sharp decrease in the number of selected components, thereby reducing the accuracy of the regression equation.

Feature extraction in the DBN network is divided into two stages: pretraining and parameter tuning. First, the network is pretrained. The RBM is the basic structure of the DBN, so training proceeds layer by layer over the RBMs. The standardized data set is used as the input of the bottom RBM, and the first-layer feature h_1 of the data set is obtained through the trained parameters θ. Taking feature h_1 as the input of the next RBM layer yields feature h_2, which is more abstract than h_1.
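The component-selection rule can be sketched as a small helper. The function name and the PRESS/SS values below are hypothetical; they only illustrate how close error sums truncate the extraction early:

```python
import math

def select_components(press, ss):
    """Number of PLS components kept by the cross-validity rule.

    press[h-1] = PRESS_h for h = 1..H; ss[h] = SS_h, with ss[0] the total
    sum of squares of Y before any component is extracted. Component h is
    kept while sqrt(PRESS_h) <= 0.95 * sqrt(SS_{h-1}); the loop stops at
    the first component that fails the test.
    """
    kept = 0
    for h in range(1, len(press) + 1):
        if math.sqrt(press[h - 1]) <= 0.95 * math.sqrt(ss[h - 1]):
            kept = h
        else:
            break
    return kept

# Hypothetical error sums: PRESS_h and SS_{h-1} soon become nearly equal,
# which is exactly what cuts off TCM models after very few components.
press = [40.0, 30.5, 29.8]
ss    = [100.0, 38.0, 30.0, 29.5]
print(select_components(press, ss))   # 2: sqrt(29.8) > 0.95 * sqrt(30.0)
```

With these numbers the third component is rejected even though it still reduces the error slightly, mirroring the sharp truncation discussed above.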
Through iterative abstraction, the final feature h_n is obtained. At this point, h_n integrates the different features abstracted by the lower RBMs, making the features more representative of the variables in the original data.
At this point, regardless of whether strong correlations exist between the original variables, distinct features can be extracted through the DBN network. Furthermore, according to the different roles of the variables, they can be combined into more abstract, upper-level features that represent different types of variables; in the end, the contribution of each independent variable to the traditional Chinese medicine experiment is brought into full play. The DBN-PLS method proposed in this paper therefore uses the DBN to replace the principal component extraction part of PLS. The upper-level features of the original data extracted by the DBN are used as the principal components, the PLS external model is then used for regression modeling, and finally the regression equation in the original variables is restored. Verification on the test data in turn guides the adjustment of the deep learning model's parameters and number of hidden layers, finally achieving the goal of improving the accuracy of the regression equation.
Let the independent variable set be X = (x_1, x_2, x_3, ..., x_p) and the dependent variable set be Y = (y_1, y_2, y_3, ..., y_q); X is an n × p matrix, and Y is an n × q matrix.
The specific construction process is as follows: (1) Data preprocessing: standardize X and Y, respectively, to obtain E_0 and F_0.
(2) The standardized variables E_0 = (E_1, E_2, E_3, ..., E_p) and F_0 = (F_1, F_2, ..., F_q) are, respectively, put into the DBN model for training; the processing steps are as follows: (1) According to the number of characteristic attributes of the independent variables E_0 = (E_1, E_2, E_3, ..., E_p), the number of visible layer neurons in the DBN is determined as p. Since the main purpose is to reduce the dimension of the features, the number of hidden layer neurons is generally smaller than the number of visible layer neurons; denote it p_1 (p_1 < p).
(2) Randomly initialize the parameters θ = {a_i, b_j, w_ij}. The offset vectors of the visible layer and the hidden layer are a = (a_1, a_2, ..., a_p) and b = (b_1, b_2, ..., b_{p_1}), respectively, and the weight matrix is W of size p × p_1. (3) Taking the standardized data as the input of the visible layer, train each RBM layer by layer: the hidden layer of the previous RBM is the visible layer of the next RBM, and the output of the previous RBM is the input of the next. In the training process, the RBM of the previous layer must be fully trained before the RBM of the current layer, until the last layer. (4) After calculating the principal components in step (3), the principal components are put into the partial least squares external model for multiple linear regression analysis, and the coefficients are finally restored to a multiple linear regression equation of Y with respect to X. According to the two evaluation indexes of root mean square error (RMSE) and redetermination coefficient (R²), it is judged whether the model meets the precision requirement.
If the requirement is met, the algorithm stops; if not, the hyperparameters of the model are adjusted continuously until the requirement is met.
As shown in Algorithm 1, the DBN-PLS model constructed in this paper consists of two parts: feature extraction and nonlinear PLS regression. The DBN uses its layer-to-layer transmission to automatically extract features from the original data, and PLS is responsible for the regression on the extracted features. The model retains PLS's ability to handle the multiple correlations of traditional Chinese medicine data while avoiding the defect of PLS's cross-validity test. It first performs canonical correlation analysis and multiple linear regression and then restores the regression equation of the original variables. Through verification with the test data, the deep learning layers and parameter settings are adjusted in turn, finally improving the accuracy of the final regression equation. The overall structural model is shown in Figure 3.

Experiment Results and Analysis
The experimental data in this paper come mainly from the Dachengqitang experimental data (DCQT) of the Key Laboratory of Jiangxi University of Traditional Chinese Medicine and from the Housing, AirQuality, and CBM [12] data sets in the UCI repository.

Experimental Data Description.
The experimental data of Dachengqitang consist of 9 samples, namely the effects of active ingredients in rat plasma on pharmacological indexes at 9 different doses. The independent variables are the main active components in rat plasma: emodin, rhein, chrysophanol, aloe-emodin, emodin methyl ether, magnolol, honokiol, hesperidin, and hesperetin. The dependent variables are the pharmacological indexes examined: first defecation time, motilin, and vasoactive intestinal peptide. Some of the experimental data are shown in Table 1.
For the UCI data, the Housing, AirQuality, and CBM data sets are selected, with sample sizes of 506, 9,357, and 11,934, respectively. See http://archive.ics.uci.edu/ml/ for detailed descriptions.

Data Standardization.
Data standardization (normalization) is a necessary step before data mining. Different data have different dimensions and units, and this affects the results of data analysis. To eliminate the dimensional effects between the data, standardization is needed to make the data metrics comparable. For a comprehensive comparison and evaluation, the statistical data above are standardized separately; after standardization, the indicators are on the same scale and suitable for comparative evaluation [4]. The standardization method used in this paper is Z-score standardization, also called standard deviation standardization. The processed data conform to the standard normal distribution, that is, the mean is 0 and the variance is 1. The conversion function is

x′ = (x − μ)/σ,

where μ = (1/n) Σ_{i=1}^{n} x_i is the mean of each feature of the data X and σ = √((1/(n − 1)) Σ_{i=1}^{n} (x_i − μ)²) is the standard deviation of each feature of the data X.
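The transformation above is a one-liner in numpy; note the ddof=1 (sample standard deviation), matching the 1/(n − 1) formula. The function name and toy matrix are illustrative:

```python
import numpy as np

def zscore(X):
    """Column-wise Z-score: subtract each feature's mean and divide by its
    sample standard deviation (ddof=1, matching the 1/(n-1) formula above)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    return (X - mu) / sigma

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
Z = zscore(X)
print(Z.mean(axis=0))          # ~[0, 0]
print(Z.std(axis=0, ddof=1))   # [1, 1]
```

After the transform, both columns are on the same scale despite their original tenfold difference in magnitude.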
Taking the Dachengqitang data as an example, the data set is subjected to standardization; the results are shown in Table 2.

Algorithm 1 (DBN-PLS):
Step 1. Preprocess the data to obtain (E_0, F_0).
Step 2. Deep belief nets (DBN):
  Initialize the model parameters θ = {a_i, b_j, W_ij}, layersize = 1, hidden_layers_sizes = [4, 4, 4]
  while the number of RBM layers has not reached the precision condition:
    while the number of neurons per layer (layersize) has not reached the accuracy condition:
      for z in layersize:
        calculate the probability that the hidden layer neurons are activated, P_z(h_i | v_i)
        take a Gibbs sample: h_i ∼ P_z(h_i | v_i)
        reconstruct the visible layer with a Gibbs sample: v_{i+1} ∼ P_z(v_{i+1} | h_i)
        use v_{i+1} to calculate the probability of activated hidden layer neurons
Step 3. The features of the independent variables and the dependent variables are calculated by the DBN model, respectively.

Auxiliary Analysis.
The T² ellipse auxiliary analysis technique finds the singular points in the sample by analyzing the contribution rate of the sample points to the components. Define the contribution rate T²_hi of the i-th sample point to the h-th component t_h as

T²_hi = t²_hi / ((n − 1) s²_h),

where t_hi is the projection of the i-th sample point on the h-th principal axis a_h and s²_h is the variance of the component t_h. Then the cumulative contribution rate of the i-th sample point to the components t_1, t_2, ..., t_m is T²_i = Σ_{h=1}^{m} T²_hi. If

T²_i ≥ (m(n² − 1)/(n²(n − m))) F_{0.05}(m, n − m),    (7)

it is considered that, at the 95% test level, the cumulative contribution rate of the i-th sample point to the components t_1, t_2, ..., t_m is too large, and sample point i is called a singular point.
In particular, when m = 2, inequality (7) describes the region outside the ellipse (t²_1i/s²_1) + (t²_2i/s²_2) = c; in other words, the singular points are distributed outside the ellipse. The T² ellipse and a scatter plot of the data set can then be drawn in the t_1/t_2 plane coordinate system.
Those points that lie outside the T² ellipse are called singular points.
From equation (7), when the number of extracted components m is 2, the right side of the inequality defines an ellipse that can be drawn in the t_1/t_2 plane, so the existence of singular points can be reflected visually in the graph. Therefore, in this paper, two principal components are extracted separately for each of the four models, and the singular points among the sample points are identified according to discriminant condition (7). Four T² ellipses were created with the SIMCA-P software to show the distribution of the sample points, as shown in Figures 4-7.
From Figures 4-7, it can be concluded that there are no singular points in the DCQT data set, while there are 22, 544, and 645 singular points in the Housing, AirQuality, and CBM data sets, accounting for 4.3%, 5.8%, and 5.4% of the total samples, respectively. The sample points outside the ellipse take values far from the average level of the sample; these are the so-called singular points. The singular points are identified and eliminated, and the processed data are then used for the subsequent experimental analysis.
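Condition (7) is easy to apply numerically once the score matrix is available. The sketch below uses scipy's F quantile; the score matrix, the planted outlier, and the function name are hypothetical:

```python
import numpy as np
from scipy.stats import f as f_dist

def t2_singular_points(T, alpha=0.05):
    """Flag singular points in a score matrix T (n samples x m components).

    T2_i = sum_h t_hi^2 / ((n - 1) * s_h^2); sample i is singular when
    T2_i >= m (n^2 - 1) / (n^2 (n - m)) * F_alpha(m, n - m),
    the 95% bound of the T2 ellipse in condition (7).
    """
    n, m = T.shape
    s2 = T.var(axis=0, ddof=1)                       # variance of each component
    t2 = ((T ** 2) / ((n - 1) * s2)).sum(axis=1)     # cumulative contribution rates
    limit = m * (n ** 2 - 1) / (n ** 2 * (n - m)) * f_dist.ppf(1 - alpha, m, n - m)
    return np.flatnonzero(t2 >= limit)

rng = np.random.default_rng(0)
T = rng.normal(size=(50, 2))   # hypothetical scores on t1 and t2
T[0] = [8.0, -8.0]             # one planted outlier far outside the ellipse
print(t2_singular_points(T))
```

For m = 2 the flagged indices are exactly the scatter points falling outside the T² ellipse in the t_1/t_2 plane.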
There are also other common methods for the singularity elimination phase of partial least squares. The direct decomposition algorithm (TDDA) [13] directly decomposes the multidimensional data set along all of its dimensions into several one-dimensional data sets; the multidimensional data are thus mapped into one-dimensional spaces, achieving dimensionality reduction. The multidimensional scaling method (TMSM) [13] is a graphical method that represents the research objects in a low-dimensional space based on their pairwise similarities and performs clustering or dimensional analysis. The T² ellipsoid method [14] is a modification of the T² ellipse technique, extending the two-dimensional T² ellipse to a three-dimensional T² ellipsoid, which is then used to identify singular points. Taking the Housing data set as an example, the above methods are used to identify the singular points of the data set. Figure 8 shows the graphical result of the T² ellipsoid method, and the results of all methods are shown in Table 3.
Analyzing the comparison results, the Housing data set cleaned with the T² ellipse gives the highest PLS fitting value. TDDA may cause omissions when detecting the singular points of multidimensional data, so some singular points are missed. In TMSM, every dimension contributes equally to the Euclidean distance, but in practice the fluctuation ranges of the dimensions differ, and TCM data are highly correlated, so TMSM is not well suited here. The T² ellipsoid method flags some normal data as singular points, which lowers the redetermination coefficient (R²). It can therefore be judged that the T² ellipse analysis method performs best in recognizing singular points.

Experimental Process and Result Analysis.
To verify the feasibility and effectiveness of the DBN-PLS method, three deep belief net models are set up to compare the number of hidden layers: 3 in the first DBN model, 4 in the second, and 5 in the third. The number of neurons in each hidden layer of the three DBN models depends on the specific situation. Each DBN model is combined with PLS, and the three configurations are analyzed and compared with the original partial least squares (PLS) and the partial least squares integrated with the restricted Boltzmann machine (RBM-PLS) of [6]. Four groups of experimental data are used for comparison and verification; the data sets are described in Table 4.
In the experiment, the model is optimized by adjusting the model parameters, and the effects of the methods are compared at the same level of the learning and training sets. Root mean square error (RMSE) and the redetermination coefficient (R²) were examined, respectively. The experimental results are shown in Table 5.
According to the experimental results in Table 5, the RMSE and R² of DBN-PLS are better than those of PLS and RBM-PLS. When the number of hidden layers (i.e., the number of stacked RBMs) of the DBN-PLS method is 3, the effect is best; as the number of hidden layers increases further, the effect gradually declines. The specific analysis is as follows.
In the Dachengqitang experimental data, the RMSE of PLS is much larger than that of RBM-PLS and DBN-PLS, and the RMSE of RBM-PLS is larger than that of DBN-PLS. As the number of hidden layers of DBN-PLS increases, the RMSE increases, that is, the effect worsens. The RMSE values of the five methods are 0.9687, 0.6934, 0.2797, 0.3420, and 0.3479, respectively. The R² value of DBN-PLS is greater than those of PLS and RBM-PLS, and the R² value of RBM-PLS is greater than that of PLS. As the number of hidden layers of DBN-PLS increases, the R² value decreases, that is, the effect worsens. The R² values of the five methods are 0.6942, 0.7962, 0.9421, 0.9135, and 0.9105, respectively. When the number of RBMs stacked in DBN-PLS is 3, 4, and 5, the values of RMSE and R² are 0.2797 and 0.9421, 0.3420 and 0.9135, and 0.3479 and 0.9105, respectively. Based on these two evaluation indexes, DBN-PLS outperforms RBM-PLS and PLS on the Dachengqitang experimental data, and RBM-PLS outperforms PLS; the effect is most significant when the number of stacked RBMs is 3.
In the UCI standard data sets, Housing is a medium-sized sample, while AirQuality and CBM are large samples. On all of them, DBN-PLS performs best. In the Housing and AirQuality data sets, RBM-PLS performs about the same as PLS, or even slightly worse. For example, in the Housing data set, the RMSE of RBM-PLS is larger than that of PLS, 41.2391 versus 40.9841, and likewise the R² value of RBM-PLS is smaller than that of PLS, 0.1813 versus 0.1987. In the CBM data set, the RMSE of RBM-PLS is smaller than that of PLS, 0.0165 versus 0.0634, yet the R² value of RBM-PLS is still smaller than that of PLS, 0.5010 versus 0.6935. These effects arise because the results of the RBM-PLS model are affected by the initial values; different initial values give different effects, sometimes slightly worse than PLS. To sum up, DBN-PLS is the most significant on both evaluation indexes, whether on the Dachengqitang experimental data set or on the UCI standard data sets. The effect is best when the number of RBMs in the model is 3, and it decreases as the number of stacked RBMs grows. This shows that when DBN-PLS is used for feature extraction, stacking 3 RBMs gives the best effect.
To display the experimental results more intuitively, graphs are drawn to show the fluctuation of the root mean square error (RMSE) and the redetermination coefficient (R²). Because the RMSE and R² of the data sets have different orders of magnitude, the experimental results are centered and mapped to [0, 1] so that the fluctuations of RMSE and R² across data sets and methods can be compared conveniently; Figures 9 and 10 are drawn accordingly.
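One plausible reading of "mapped to [0, 1]" is min-max scaling of each metric series; the helper below sketches that under this assumption, using the Dachengqitang RMSE values from above as example input:

```python
import numpy as np

def to_unit_range(x):
    """Map a 1-D metric series onto [0, 1] by min-max scaling, one plausible
    way to make RMSE and R^2 series comparable before plotting."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# RMSE of the five methods on the Dachengqitang data (from Table 5).
rmse = [0.9687, 0.6934, 0.2797, 0.3420, 0.3479]
print(to_unit_range(rmse))   # best method maps to 0, worst to 1
```

After scaling, the per-method ordering is preserved while the different magnitudes of the data sets no longer dominate the plot.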
From Figures 9 and 10, it can be seen more intuitively that DBN-PLS improves every index significantly and performs better than RBM-PLS and PLS. When the number of hidden layers of DBN-PLS is 3, the effect is best; as the number of hidden layers increases, each index shows a slow downward trend. In the DCQT data set, RBM-PLS performs better than PLS, while in the UCI standard data sets, RBM-PLS performs slightly worse than PLS. This is because the results of the RBM-PLS model are affected by the initial values: different initial values yield different effects.
To sum up, among the four groups of experiments, DBN-PLS achieves the best multiple linear regression effect, indicating the best fit. As the number of hidden layers in DBN-PLS increases, the effect weakens, indicating that when DBN-PLS is used for feature extraction, the best effect is obtained with 3 stacked RBMs. Across the different data sets, the effects of RBM-PLS differ because its results depend on the initial values.

Conclusions
TCM data are modeled based on the partial least squares method. Beyond the processes shared with modeling in other fields, TCM data have features such as large data volume, multiple correlations, nonlinearity, and random distribution. The main issues are as follows: ① The modeling samples selected from massive data affect the accuracy of the PLS model. ② The partial least squares method has limited ability to identify and eliminate abnormal observation records. ③ The principal component extraction methods have limitations. The work of this paper therefore improves the partial least squares method in these respects: (1) In the experimental data of traditional Chinese medicine, there are often many variables related to the dependent variable, and the observation of some independent variables is costly, although they can now be collected by advanced equipment. Including all such variables in the regression model not only increases the computation and the application cost of the model; the sample data are also multidimensional, that is, each sample is characterized by multiple features whose dimensions and numerical ranges differ, making the model unstable. Therefore, this paper first applies the Z-score standardization method. Compared with min-max standardization [15], it does not require redefining the values of max and min, and it gives different features the same scale, so that when the parameters are learned, the features influence them to the same degree.
(2) High-quality data from drug trials are the basis for building analytical models. The experimental data are affected not only by the statistical distribution characteristics of the system parameters but also by the operating habits of the experimental personnel, so the numbers of occurrences of the various dispensing groups are extremely uneven, and the observation data must not contain large errors; otherwise, the results are unreliable. Therefore, the T² ellipse is used to observe the distribution of the sample points and their similarity structure on the t_1/t_2 plane and to find the singular points whose values lie far from the average level of the sample, achieving the goal of eliminating abnormal data. Compared with the TDDA, TMSM, and T² ellipsoid methods discussed by Sun Shuwei [16] and Cui Lizhen [17], the T² ellipse method better eliminates noise and improves the redetermination coefficient of PLS.
(3) In the partial least squares algorithm, if too few principal components are selected for the regression, some useful information in the original independent variable matrix is ignored, which hurts the fitting ability and prediction accuracy of the model; if too many are selected, the fitting accuracy improves, but overfitting may occur because irrelevant noise is introduced, reducing the prediction accuracy. The most common method for choosing the number of components in partial least squares is cross-validation (CV) [18], which examines the change in the model's predictive ability after a new component is added, but existing CV is insufficient: if the validation set is too small, overfitting occurs easily, and a criterion for determining the optimal validation set size still needs to be given.
This paper proposes a partial least squares method based on DBN, which makes full use of the joint probability density distribution of the DBN generative model to learn the upper-layer features of the data and avoids selecting the number of principal components. Combined with the characteristics of PLS, regression modeling remains possible even with a small sample size, the relationship between independent and dependent variables is captured to the maximum extent, and the advantages of both algorithms are brought into full play. Experiments on Chinese medicine data and UCI standard data sets show that DBN-PLS significantly improves the accuracy of the regression equation and the expression of nonlinear structure. Compared with the conventional partial least squares method, it agrees better with the theoretical values and better resolves the problem of determining the number of principal components in the partial least squares method.
Data Availability

The data in this article can be shared and used free of charge, except for the data not listed in this article, which come from the Key Laboratory of Traditional Chinese Medicine Preparation of Jiangxi University of Traditional Chinese Medicine; these involve patient privacy and business secrets and cannot be shared or used.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.