Assessing Dry Weight of Hemodialysis Patients via Sparse Laplacian Regularized RVFL Neural Network with L2,1-Norm

Dry weight is the normal weight of hemodialysis patients after hemodialysis. If too much water is removed during dialysis, the patient will experience hypotension and shock symptoms. Therefore, the correct assessment of the patient's dry weight is clinically important. Existing assessment methods all rely on professional instruments and technicians and are time-consuming and labor-intensive. To avoid this limitation, we apply machine learning to routinely collected patient data. This study collected demographic and anthropometric data of 476 hemodialysis patients, including age, gender, blood pressure (BP), body mass index (BMI), years of dialysis (YD), and heart rate (HR). Building on previous work, we propose a Sparse Laplacian regularized Random Vector Functional Link (SLapRVFL) neural network model. We compare the prediction performance of SLapRVFL with the Body Composition Monitor (BCM) instrument and with other models. The Root Mean Square Error (RMSE) of SLapRVFL is 1.3136, which is lower than that of the other methods. The SLapRVFL neural network model could be a viable alternative for dry weight assessment.


Introduction
Fluid overload in patients with chronic renal failure is closely related to poor cardiovascular outcomes [1,2]. Maintenance hemodialysis (HD) is the main treatment for patients with renal failure [3]. However, the accurate assessment of body water volume is still a concern [4]. At present, dry weight is used as an important indicator of fluid homeostasis in hemodialysis patients. Medical staff can use the patient's dry weight to estimate the amount of water to be removed during hemodialysis. The conventional clinical dry weight assessment method is time-consuming and labor-intensive [1]. Instrument-based methods for determining dry weight already exist, including bioelectrical impedance analysis (BIA) [5], the body composition monitor (BCM) [6], and lung ultrasound (LUS). However, all of these methods require special instruments and professional technicians. Alternatively, medical staff can use routine clinical data to build predictive models [7] that assess dry weight. Machine learning (ML) and deep learning have already been applied to many clinical problems in medicine, such as brain diseases [8][9][10], cancer analysis, and diabetes.
In our previous research, a Multiple Kernel Support Vector Regression (MKSVR) [39] predictor was proposed to assess the dry weight and obtained good predictive performance. Inspired by that work and by the baseline Random Vector Functional Link (RVFL) network [40], we propose a new dry weight assessment model, called the Sparse Laplacian regularized RVFL neural network with L2,1-norm (SLapRVFL), which considers the topological relationship between samples and more sparse connections between the input layer and the hidden layer.

Materials and Methods
2.1. Materials. This work collects demographic data, anthropometric data, and bioimpedance spectroscopy (BIS) measurements from historical records (September 2018 to September 2019) of Wuxi People's Hospital and Northern Jiangsu People's Hospital. This study was approved by the ethics committees of the hospitals (Nos. KYLLKS201813 and 2018KY-001). The collected patient data meet the following requirements: age greater than 18 years; end-stage renal disease (ESRD) for more than three months with maintenance hemodialysis [41]; no heart failure, metal implants, pregnancy, disability, infection, edema, or other diseases; and hemodialysis treatment 3 times a week, 4 hours each time. The final data set contains 476 hemodialysis patients. Dry weight (DW) is the normal body weight after clinical dialysis. DW is determined by a clinician under strict clinical supervision using a clinical scoring system (a trial-and-error method) [42,43].
We choose 7 features, including age, gender (binary feature), systolic blood pressure (SBP), diastolic blood pressure (DBP), body mass index (BMI), heart rate (HR), and years of dialysis (YD) to build our predictive model. Table 1 shows the information of the data set. BMI is measured before hemodialysis treatment.

2.2. Methods. The baseline RVFL was proposed for regression or classification. The schematic diagram of RVFL is shown in Figure 1. The basic information of the patient is put into the RVFL neural network model for processing, and the predicted dry weight is the output.
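Suppose there are N training samples $\{x_i, y_i\}$, $i = 1, 2, \ldots, N$. The output value is $y_i \in \mathbb{R}^{1 \times c}$ and the input data is $x_i \in \mathbb{R}^{1 \times d}$, where d denotes the dimension of $x_i$. As shown in Figure 1, RVFL randomly initializes all weights and biases between the input layer and the hidden layer. These parameters are fixed during training and do not need to be tuned. The output layer of RVFL is connected to both the input layer and the hidden layer, so that the model captures both the linear and the nonlinear relationships between the input and the output; only these output weights are obtained by training. The RVFL network with P hidden nodes is formulated as

$$H\beta = Y, \tag{1}$$

where $\beta$ denotes the output weight matrix; H is the concatenated matrix, which combines the input layer and the output of the hidden layer; and Y denotes the label matrix. H and $\beta$ can be represented as

$$H = [H_1 \;\; H_2], \tag{2}$$

$$H_1 = \begin{bmatrix} x_{11} & \cdots & x_{1d} \\ \vdots & \ddots & \vdots \\ x_{N1} & \cdots & x_{Nd} \end{bmatrix}, \tag{3}$$

$$H_2 = \begin{bmatrix} g(a_1 \cdot x_1 + b_1) & \cdots & g(a_P \cdot x_1 + b_P) \\ \vdots & \ddots & \vdots \\ g(a_1 \cdot x_N + b_1) & \cdots & g(a_P \cdot x_N + b_P) \end{bmatrix}, \tag{4}$$

$$\beta = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_{d+P} \end{bmatrix} \in \mathbb{R}^{(d+P) \times c}. \tag{5}$$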
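As a minimal sketch of the concatenated matrix H described above, the NumPy snippet below draws the input-to-hidden weights once at random, applies a Gaussian activation, and places the raw inputs next to the hidden-layer outputs. The function name `build_h`, the standard normal initialization, the seed, and the toy sizes are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def build_h(X, A, b):
    """Concatenate the raw inputs with the hidden-layer outputs.

    X: (N, d) inputs; A: (d, P) random input-to-hidden weights; b: (P,) biases.
    Returns H of shape (N, d + P).
    """
    hidden = np.exp(-(X @ A + b) ** 2)   # Gaussian activation g(z) = exp(-z^2)
    return np.hstack([X, hidden])        # direct links first, hidden nodes second

rng = np.random.default_rng(0)           # seed chosen only for reproducibility
N, d, P = 5, 7, 10                       # toy sizes; d = 7 matches the feature count
X = rng.normal(size=(N, d))              # made-up inputs, not patient data
A = rng.normal(size=(d, P))              # assumed initialization, never trained
b = rng.normal(size=P)
H = build_h(X, A, b)
print(H.shape)
```

Because `A` and `b` stay fixed, only the output weights acting on `H` remain to be learned, which is what keeps RVFL training a linear problem.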

BioMed Research International
In Equation (4), $a_j$ and $b_j$ are the weights and bias connecting the input layer to the jth hidden node, and c and P are the numbers of output and hidden layer nodes. In general, the activation function is a Gaussian function, $g(x) = e^{-x^2}$, which provides the nonlinear approximation. To capture the potential linear relationship between the input data and the output value, RVFL adds direct connection weights between the input layer and the output layer. RVFL therefore contains both linear and nonlinear approximations, which improves prediction performance. The optimal $\beta$ is obtained from a regularized least-squares problem:

$$\min_{\beta} \; \|H\beta - Y\|_2^2 + \lambda \|\beta\|_2^2, \tag{6}$$

where $\lambda$ is the regularization parameter. The solution of Equation (6) is found by setting its gradient to 0:

$$\beta = (H^{T}H + \lambda I)^{-1} H^{T} Y, \tag{7}$$

where I denotes the identity matrix. However, the RVFL network does not consider the topological relationship between samples, and each output node must be connected to every input and hidden node. To further improve the robustness of RVFL, we propose the Sparse Laplacian regularized RVFL neural network with L2,1-norm (SLapRVFL). Its objective function, Equation (8), adds to the fitting error a Laplacian regularization term weighted by $\lambda_1$ and an L2,1-norm term on $\beta$ weighted by $\lambda_2$, where $L \in \mathbb{R}^{N \times N}$ denotes the Laplacian matrix. Laplacian regularization captures the potential manifold structure between samples. It better describes the topological association between samples and thereby improves the generalization ability of the model.
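Since the third term, the L2,1-norm term on $\beta$, is not differentiable, we convert Equation (8) to the reweighted form of Equation (10), in which $G \in \mathbb{R}^{(d+P) \times (d+P)}$ denotes a diagonal matrix whose ith diagonal element is computed from the ith row of the current $\beta$. Taking the derivative of Equation (10) with respect to $\beta$ and setting it to zero gives the update of $\beta$ for a fixed G, so $\beta$ and G are updated alternately until the maximum number of iterations is reached.

Algorithm 1: SLapRVFL.
Require: Training set $\{x_i, y_i\}$, $i = 1, 2, \ldots, N$; test set $\{x^{te}_j\}$, $j = 1, 2, \ldots, M$; the number of hidden layer nodes P; the maximum number of iterations $t_{max}$; coefficients $\lambda_1$ and $\lambda_2$.
Ensure: The predicted values $\{y^{te}_j\}$, $j = 1, 2, \ldots, M$.
(1) Randomly initialize all weights and biases between the input layer and the hidden layer. Calculate the hidden layer output matrix H (training set) and the Laplacian matrix L by Equations (2), (12), and (13);
(2) Set t = 0 and estimate the initial $\beta^0$ using Equation (7);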
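The objective and its reweighted version can be sketched as the worked derivation below. This is a reconstruction under stated assumptions, not necessarily the authors' exact notation or scaling: the Laplacian term is taken as a trace penalty on the fitted outputs $H\beta$, the L2,1-norm is taken unsquared as the sum of the Euclidean norms of the rows $\beta_i$ of $\beta$, and G uses the standard reweighting for that norm.

```latex
% Assumed objective: fitting error + Laplacian smoothness + row sparsity
J(\beta) = \|H\beta - Y\|_F^2
         + \lambda_1 \,\mathrm{tr}\!\left(\beta^{T} H^{T} L H \beta\right)
         + \lambda_2 \|\beta\|_{2,1},
\qquad
\|\beta\|_{2,1} = \sum_{i=1}^{d+P} \|\beta_i\|_2 .

% Reweighted surrogate with the diagonal matrix G held fixed
\tilde{J}(\beta) = \|H\beta - Y\|_F^2
         + \lambda_1 \,\mathrm{tr}\!\left(\beta^{T} H^{T} L H \beta\right)
         + \lambda_2 \,\mathrm{tr}\!\left(\beta^{T} G \beta\right),
\qquad
G_{ii} = \frac{1}{2\,\|\beta_i\|_2} .

% Stationarity condition (L is symmetric) and the resulting update
\frac{\partial \tilde{J}}{\partial \beta}
  = 2 H^{T}(H\beta - Y) + 2\lambda_1 H^{T} L H \beta + 2\lambda_2 G \beta = 0
\;\Longrightarrow\;
\beta = \left(H^{T}H + \lambda_1 H^{T} L H + \lambda_2 G\right)^{-1} H^{T} Y .
```

With this choice of G, the gradient of the surrogate term, $2\lambda_2 G\beta$, equals the gradient of $\lambda_2\|\beta\|_{2,1}$ wherever every row of $\beta$ is nonzero, which is why alternating between G and $\beta$ is a common way to handle this penalty.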

We use the baseline RVFL solution of Equation (7) as the initial $\beta^0$. In addition, the Laplacian matrix is calculated as

$$L = D - S, \tag{12}$$

where D is a diagonal matrix with $D_{ii} = \sum_{j=1}^{N} S_{ij}$. The similarity matrix S is built with the Radial Basis Function (RBF) kernel, Equation (13). The process of SLapRVFL is listed in Algorithm 1.
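The training procedure described in this section can be sketched in NumPy as follows. It is an illustration under assumptions rather than the authors' implementation: the RBF similarity is taken as $\exp(-\|x_i - x_j\|^2 / (2\sigma^2))$ with a hypothetical width `sigma`, the reweighting uses $G_{ii} = 1/(2\|\beta_i\|_2)$, and the ridge parameter `lam`, the safeguard `eps`, the function names, and the toy data are all invented for the example.

```python
import numpy as np

def rbf_laplacian(X, sigma=1.0):
    """Graph Laplacian L = D - S from an RBF similarity (sigma is assumed)."""
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.exp(-dist2 / (2.0 * sigma ** 2))
    D = np.diag(S.sum(axis=1))           # D_ii = sum_j S_ij
    return D - S

def fit_slaprvfl(H, Y, L, lam1, lam2, t_max=10, lam=1.0, eps=1e-8):
    """Alternate between the reweighting matrix G and the output weights beta."""
    k = H.shape[1]
    HtH, HtY = H.T @ H, H.T @ Y
    HtLH = H.T @ L @ H
    beta = np.linalg.solve(HtH + lam * np.eye(k), HtY)       # ridge start
    for _ in range(t_max):
        row_norms = np.sqrt(np.sum(beta ** 2, axis=1)) + eps  # avoid 1/0
        G = np.diag(1.0 / (2.0 * row_norms))
        beta = np.linalg.solve(HtH + lam1 * HtLH + lam2 * G, HtY)
    return beta

# Toy data only; none of these numbers come from the study.
rng = np.random.default_rng(0)
N, d, P = 30, 3, 8
X = rng.normal(size=(N, d))
Y = X @ rng.normal(size=(d, 1)) + 0.1 * rng.normal(size=(N, 1))
A, b = rng.normal(size=(d, P)), rng.normal(size=P)
H = np.hstack([X, np.exp(-(X @ A + b) ** 2)])  # inputs plus Gaussian hidden nodes
L = rbf_laplacian(X)
beta = fit_slaprvfl(H, Y, L, lam1=2.0 ** -3, lam2=2.0 ** -2.5)
y_hat = H @ beta                               # predictions on the training inputs
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse; the system matrix is positive definite here because G has a positive diagonal and L is positive semidefinite.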

Results
We test our model on the benchmark data set and obtain the optimal parameters of the predictor through cross-validation. The SLapRVFL network is compared with other machine learning-based models and with the body composition monitor (BCM) device (Fresenius Medical Care, Bad Homburg, Germany).
3.1. Evaluation Measurements. The 10-fold cross-validation (10-CV) is employed to evaluate the robustness of methods. Root Mean Square Error (RMSE), R square, correlation coefficient (R), Bland-Altman analysis, and Empirical Cumulative Distribution Plot (ECDP) [44] are all used in our study.
Bland-Altman analysis evaluates the agreement between two different methods, that is, whether the two methods can be substituted for each other (equivalence). In other words, it answers the question, "Can these two methods replace each other?"
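The scalar measures named above can be sketched as follows. The function names, the fold-splitting helper, and the four made-up weights are assumptions for illustration only; R is computed as the Pearson correlation coefficient.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between truth and prediction."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def kfold_indices(n, k=10, seed=0):
    """Shuffle n sample indices and split them into k nearly equal folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

# Made-up dry weights in kg, used only to exercise the functions.
y_true = np.array([60.0, 62.0, 58.0, 70.0])
y_pred = np.array([61.0, 61.0, 59.0, 69.0])
print(rmse(y_true, y_pred), r_squared(y_true, y_pred), pearson_r(y_true, y_pred))

folds = kfold_indices(476, k=10)   # 476 is the data set size reported in the text
```

In a 10-fold run, each fold in turn serves as the test set while the remaining nine are used for training, and the metrics are averaged over the folds.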

Selection of Optimal Parameters.
To get the optimal parameters of the predictive method, we use a grid search. The parameters to be determined are the number of hidden layer nodes P, the maximum number of iterations, and the coefficients $\lambda_1$ and $\lambda_2$. To select P, we fix the maximum number of iterations at 50 and set $\lambda_1 = 1$ and $\lambda_2 = 1$. The value of P ranges from 10 to 140 in steps of 10. The results are shown in Figure 2. From 10 to 100, the more neurons in the hidden layer, the lower the RMSE; beyond 100, the RMSE gradually increases. We therefore choose P = 100, which gives the lowest RMSE. Next, with P = 100, $\lambda_1 = 1$, and $\lambda_2 = 1$, we gradually increase the number of iterations from 1 to 100 (Figure 3). Once the number of iterations reaches 10, the RMSE drops to its minimum and then oscillates slightly around that value. In our study, the maximum number of iterations is therefore set to 10.
Then, we use the selected number of hidden layer nodes and iterations to search for the best $\lambda_1$ and $\lambda_2$. Both parameters are searched from $2^{-5}$ to $2^{0}$, with the exponent increased in steps of 0.5. Figure 4 shows the results for the different parameter values. The RMSE is lowest when $\lambda_1 = 2^{-3}$ and $\lambda_2 = 2^{-2.5}$.
The results of the compared methods are listed in Table 2, which shows that SLapRVFL achieves the best RMSE (1.3136). Although the ECDP median value (peak) of MKSVR (0.0082) is closer to zero, Figure 5 shows that SLapRVFL has the least bias and much shorter tails than MKSVR (smaller width). The RMSE of BCM is 1.9694, which is larger than that of SLapRVFL.
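The search over $\lambda_1$ and $\lambda_2$ described in this section can be sketched as an exhaustive scan over a grid of powers of two. The helper `grid_search` and the scoring function are hypothetical; in particular, `toy_score` is only a stand-in for a cross-validated RMSE, and its minimum is placed at the reported optimum purely so that the example has a known answer.

```python
import itertools
import numpy as np

exponents = np.arange(-5.0, 0.5, 0.5)   # -5, -4.5, ..., 0
lambda_grid = 2.0 ** exponents          # 11 candidate values per coefficient

def grid_search(score_fn, grid1, grid2):
    """Return the (lambda1, lambda2) pair with the lowest score."""
    return min(itertools.product(grid1, grid2), key=lambda p: score_fn(*p))

def toy_score(lam1, lam2):
    # Placeholder for "mean 10-fold RMSE of a model trained with lam1, lam2".
    return (np.log2(lam1) + 3.0) ** 2 + (np.log2(lam2) + 2.5) ** 2

best_lam1, best_lam2 = grid_search(toy_score, lambda_grid, lambda_grid)
print(len(lambda_grid) ** 2, best_lam1, best_lam2)
```

With 11 values per coefficient the grid has 121 combinations, so a full scan stays cheap even when every point requires a 10-fold cross-validation.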

Bland-Altman Analysis.
The Bland-Altman plot is a useful tool for evaluating the agreement between predictive methods and clinical DW. In Table 3 and Figure 6, SLapRVFL, MKSVR, LR, ANN (BP), MKRR, and BCM are analyzed via Bland-Altman difference plots. SLapRVFL achieves the narrowest 95% confidence interval (-0.1133 to 0.2866) and the smallest standard deviation (2.2202). In addition, for every predictive model, fewer than 24 samples (5% of the 476 samples) fall outside the agreement interval, so the results of these models are clinically acceptable. SLapRVFL has the fewest samples (20) outside the agreement interval in Table 3. In Figure 6, the two red horizontal dotted lines denote the upper and lower 95% limits of agreement, and the middle blue solid line is the mean difference between the measurement method and clinical DW. When a measurement method agrees well with the clinical method, the two can be substituted for each other (equivalence). If 95% of the points of the data set lie within the agreement range, the measurement method (predictive model) is considered clinically acceptable. The evaluation results show that SLapRVFL can help clinicians assess DW at low cost.
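The quantities behind a Bland-Altman plot can be sketched as below. The function name, the returned keys, and the five made-up weights are assumptions for illustration; the limits of agreement use the conventional mean difference plus or minus 1.96 standard deviations, which the text does not spell out.

```python
import numpy as np

def bland_altman(method, reference):
    """Bias, SD of differences, 95% limits of agreement, and outlier count."""
    diff = method - reference
    bias = float(np.mean(diff))
    sd = float(np.std(diff, ddof=1))                 # sample standard deviation
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    outside = int(np.sum((diff < lower) | (diff > upper)))
    return {
        "bias": bias,
        "sd": sd,
        "lower": lower,
        "upper": upper,
        "n_outside": outside,
        "ratio_outside": outside / diff.size,
    }

# Made-up clinical and predicted dry weights in kg, not study data.
clinical = np.array([60.0, 62.0, 58.0, 70.0, 65.0])
predicted = clinical + np.array([1.0, -1.0, 1.0, -1.0, 0.0])
stats = bland_altman(predicted, clinical)
print(stats)
```

Under the acceptance rule stated above, a method would pass when `ratio_outside` does not exceed 0.05.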

Discussion
Because clinical and BCM measurements are limited by their time and cost, this study uses a machine learning method to assess the dry weight of hemodialysis patients. Based on the basic RVFL, we propose a Sparse Laplacian regularized RVFL network (SLapRVFL) model. SLapRVFL is compared not only with other machine learning methods (LR, MKRR, ANN with BP, and MKSVR) but also with the BCM device commonly used in hospitals. The model outperforms the BCM instrument in both RMSE and Bland-Altman analysis. These results indicate that a data-driven predictive model can provide a reference for clinical dry weight assessment.
BCM requires the patient's weight (before hemodialysis) and height. It is a portable, inexpensive, and noninvasive technology that has been used to measure DW [45,46]. In the Bland-Altman analysis, SLapRVFL has the fewest points (20) outside the agreement interval, whereas BCM has 30 of 476 points (6.30%) outside it. Our method therefore agrees better with the clinical method.

Conclusions
To further improve the robustness of RVFL, we introduce a sparse Laplacian regularization term with the L2,1-norm. During training, the graph topology information and the sparse output weight matrix are employed to improve the robustness of the RVFL. Our work provides a new approach to assessing patients' dry weight. Moreover, machine learning methods have helped solve many analysis tasks in biology [47][48][49][50][51][52][53][54][55][56][57], pharmacy [58], and medicine [12,59,60]. In future research, we will consider collecting more samples, introducing more patient information, and building a predictor based on a deep learning model to assess the dry weight of hemodialysis patients more accurately.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
This study was approved by the ethics committee of the hospital (ethical approval Nos. KYLLKS201813 and 2018KY-001). The experimental protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Human Ethics Committee (Wuxi

Consent
Written informed consent for publication was obtained from all participants.

Conflicts of Interest
The authors declare that they have no conflict of interest.