Assessing the Adequacy of Hemodialysis Patients via the Graph-Based Takagi-Sugeno-Kang Fuzzy System

Maintenance hemodialysis is the main treatment for end-stage renal disease in China. The Kt/V value is the gold standard of hemodialysis adequacy. However, Kt/V requires repeated blood drawing and evaluation, so it is hard to monitor dialysis adequacy frequently. To meet the need for repeated clinical assessments of dialysis adequacy, we seek a noninvasive way to assess it. We therefore collect clinically relevant data and develop a machine learning- (ML-) based model to predict dialysis adequacy for clinical hemodialysis patients. We collect data from 250 patients, including gender, age, ultrafiltration (UF), predialysis body weight (preBW), postdialysis body weight (postBW), blood pressure (BP), heart rate (HR), and blood flow (BF). An efficient graph-based Takagi-Sugeno-Kang Fuzzy System (G-TSK-FS) model is proposed to predict the dialysis adequacy of hemodialysis patients. The root mean square error (RMSE) of our model is 0.1578. The proposed G-TSK-FS model can be used as a feasible method to predict dialysis adequacy, providing a new way for clinical practice.


Introduction
Maintenance hemodialysis is the main treatment for end-stage renal disease in China. Adequate hemodialysis not only prolongs survival time [1][2][3] but also reduces dialysis complications, improves quality of life, and reduces mortality. Kt/V is the most commonly used indicator to assess the adequacy of hemodialysis. The British Society of Nephrology and the Kidney Disease Outcome Quality Initiative (K/DOQI) recommend a minimum Kt/V of 1.2. Computing Kt/V requires measuring the BUN level before and after dialysis, and the value is calculated by the Daugirdas formula (Kt/Vdau). This method requires repeated blood draws and evaluations, so it is difficult to monitor the adequacy of dialysis frequently. Some clinical researchers have used body composition monitor (BCM) measurement to calculate the Kt/V value. However, the BCM technology requires special equipment, and its operating procedure has not yet been standardized, so it cannot be widely adopted. Therefore, it is especially important to find a more convenient, simple, and effective method to assess the adequacy of dialysis.
In recent years, machine learning (ML) has been widely used in the medical field and has achieved good results. For example, neural networks [4] and the support vector machine (SVM) [5,6] were used to predict the dry weight (DW) of hemodialysis patients. In the field of bioinformatics, many ML techniques have been applied successfully to drug discovery [7][8][9], the study of protein function [10,11], and disease analysis [12].
ML-based predictive models can also be used to quickly estimate the adequacy of dialysis. This calculation method can provide a reference for clinical practice. Takagi-Sugeno-Kang Fuzzy Systems (TSK-FS) [13][14][15] are well known for good interpretability [16] and approximation accuracy [17,18]. In this study, we developed an effective graph-based Takagi-Sugeno-Kang Fuzzy System (G-TSK-FS) model to predict the adequacy of dialysis.

Patients.
From January 2018 to December 2020, this study collected the data of 250 patients from Wuxi People's Hospital, China. The inclusion criteria are (1) patients over 18 years old, (2) patients without severe infection or heart failure within 30 days, (3) patients receiving maintenance hemodialysis for more than three months, (4) patients with no history of mental illness, and (5) patients who were informed about the study and volunteered to participate. The exclusion criteria are (1) patients who withdrew midway and (2) patients with incomplete data.
All patients received hemodialysis (HD) or hemodiafiltration (HDF) on Fresenius machines. Each session lasted four hours, and the dialysate flow rate was fixed at 500 ml/min. Table 1 shows the gender distribution, average age, mean predialysis body weight (preBW), average ultrafiltration level (UF) (the difference between body weight before and after dialysis), average blood pressure, average heart rate, and average blood flow.

Blood Sampling.
Two blood samples are collected from each patient. (1) Before dialysis, a sample is collected from the vascular access vein without anticoagulant; for patients who use hemodialysis catheters as vascular access, 10 milliliters of blood is withdrawn before this sample is collected. (2) The other sample is obtained from the inlet of the extracorporeal circulation before the end of dialysis. When this sample is taken, the blood flow rate is slowed to 50 ml/min; the dialysate flow is stopped, and blood is collected after 15 seconds.
The Kt/V is used as the "gold standard". From the predialysis and postdialysis urea levels, it is calculated by the Daugirdas formula as

$$\mathrm{Kt/V} = -\ln\left(R - 0.008 \times T_{hd}\right) + \left(4 - 3.5 \times R\right) \times \frac{U_f}{BW},$$

where $U_f$ is ultrafiltration, $BW$ is postdialysis body weight, $T_{hd}$ is the duration of the dialysis session in hours, and $R = U_{post}/U_{pre}$ is the ratio of the postdialysis to the predialysis urea level.
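The Kt/V calculation can be illustrated with a short Python sketch. It assumes the second-generation Daugirdas formula, Kt/V = -ln(R - 0.008 × Thd) + (4 - 3.5 × R) × Uf/BW, with ultrafiltration in liters and postdialysis body weight in kilograms. The function name, argument names, and example values are illustrative and are not taken from the study data.

```python
import math

def kt_v_daugirdas(u_pre, u_post, uf_liters, post_bw_kg, t_hours):
    """Kt/V from pre/post urea, ultrafiltration, weight, and session length.

    Assumes the second-generation Daugirdas formula; units (liters, kg, hours)
    are assumptions of this sketch.
    """
    r = u_post / u_pre  # R = Upost / Upre
    return -math.log(r - 0.008 * t_hours) + (4.0 - 3.5 * r) * uf_liters / post_bw_kg

# Illustrative values only (not patient data from the study).
example = kt_v_daugirdas(u_pre=25.0, u_post=8.0, uf_liters=2.5,
                         post_bw_kg=60.0, t_hours=4.0)
print(f"Kt/V = {example:.3f}, adequate (>= 1.2): {example >= 1.2}")
```

Because only the ratio of the two urea values enters the formula, the urea concentrations can be given in any unit as long as both use the same one.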

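Graph-Based TSK Fuzzy System.
In this work, we use TSK-FS to predict the Kt/V of a hemodialysis patient. For a classic 1-order TSK fuzzy system with $d$ input variables and $K$ rules, the fuzzy inference rule $R^k$ is defined as follows:

$$\text{IF } x_1 \text{ is } A_1^k \wedge x_2 \text{ is } A_2^k \wedge \cdots \wedge x_d \text{ is } A_d^k, \text{ THEN } f^k(\mathbf{x}) = p_0^k + p_1^k x_1 + \cdots + p_d^k x_d, \quad k = 1, \ldots, K. \tag{1}$$

The output of the TSK-FS is

$$y = \sum_{k=1}^{K} \frac{\mu^k(\mathbf{x})}{\sum_{k'=1}^{K} \mu^{k'}(\mathbf{x})} f^k(\mathbf{x}) = \sum_{k=1}^{K} \tilde{\mu}^k(\mathbf{x}) f^k(\mathbf{x}), \tag{2}$$

where $\mu^k(\mathbf{x})$ and $\tilde{\mu}^k(\mathbf{x})$ are the fuzzy membership function and the normalized fuzzy membership function of fuzzy set $A^k$. $\mu^k(\mathbf{x})$ can be calculated by

$$\mu^k(\mathbf{x}) = \prod_{i=1}^{d} \mu_{A_i^k}(x_i), \tag{3}$$

where $\mu_{A_i^k}(x_i)$ is the fuzzy membership function of the $k$th rule under the $i$th input variable. In general, TSK-FS uses the Gaussian membership function:

$$\mu_{A_i^k}(x_i) = \exp\left(-\frac{\left(x_i - c_i^k\right)^2}{2\delta_i^k}\right), \tag{4}$$

where $c_i^k$ and $\delta_i^k$ are the two parameters of the $i$th variable for fuzzy set $k$. Fuzzy C-means (FCM) is employed to estimate these two parameters:

$$c_i^k = \frac{\sum_{j=1}^{N} u_{jk} x_{ji}}{\sum_{j=1}^{N} u_{jk}}, \qquad \delta_i^k = h \cdot \frac{\sum_{j=1}^{N} u_{jk} \left(x_{ji} - c_i^k\right)^2}{\sum_{j=1}^{N} u_{jk}}, \tag{5}$$

where $u_{jk}$ is the fuzzy membership of the $j$th sample under the $k$th fuzzy set obtained by FCM clustering, and $h$ denotes the scale parameter. When the premise (if-parts) of the TSK-FS is determined, let

$$\mathbf{x}_e = \left(1, \mathbf{x}^T\right)^T, \quad \tilde{\mathbf{x}}^k = \tilde{\mu}^k(\mathbf{x})\, \mathbf{x}_e, \quad \mathbf{x}_g = \left(\left(\tilde{\mathbf{x}}^1\right)^T, \ldots, \left(\tilde{\mathbf{x}}^K\right)^T\right)^T, \quad \mathbf{p}^k = \left(p_0^k, p_1^k, \ldots, p_d^k\right)^T, \quad \mathbf{p}_g = \left(\left(\mathbf{p}^1\right)^T, \ldots, \left(\mathbf{p}^K\right)^T\right)^T. \tag{6}$$

Then equation (2) (then-parts) can be formulated as

$$y = \mathbf{p}_g^T \mathbf{x}_g. \tag{7}$$

So, the problem of TSK-FS training can be regarded as solving a linear regression:

$$\min_{\mathbf{p}_g} \left\| \mathbf{y} - \mathbf{X}_g \mathbf{p}_g \right\|_2^2, \tag{8}$$

where $\mathbf{y} \in \mathbb{R}^{N \times 1}$ and $\mathbf{X}_g \in \mathbb{R}^{N \times K(d+1)}$ are the true values to be approximated and the features after fuzzy rule mapping, respectively. $N$ denotes the number of training samples, and $K \cdot (d+1)$ is the dimension after mapping by $K$ fuzzy rules. To improve the generalization performance of the model, we add the Laplace regularization term to equation (8):

$$\min_{\mathbf{p}_g} \left\| \mathbf{y} - \mathbf{X}_g \mathbf{p}_g \right\|_2^2 + \beta\, \mathbf{p}_g^T \mathbf{X}_g^T \mathbf{L} \mathbf{X}_g \mathbf{p}_g + \lambda \left\| \mathbf{p}_g \right\|_2^2, \tag{9}$$

where $\beta$ and $\lambda$ are the coefficients of the two regularization terms. Setting the derivative of formula (9) with respect to $\mathbf{p}_g$ to zero gives the solution

$$\mathbf{p}_g = \left( \mathbf{X}_g^T \mathbf{X}_g + \beta\, \mathbf{X}_g^T \mathbf{L} \mathbf{X}_g + \lambda \mathbf{I} \right)^{-1} \mathbf{X}_g^T \mathbf{y}, \tag{10}$$

where $\mathbf{L} \in \mathbb{R}^{N \times N}$ is the Laplacian matrix, which can be calculated as

$$\mathbf{L} = \mathbf{D} - \mathbf{S}, \tag{11}$$

where $\mathbf{D} \in \mathbb{R}^{N \times N}$ is a diagonal matrix with $D_{ii} = \sum_{j=1}^{N} S_{ij}$. The similarity matrix $\mathbf{S} \in \mathbb{R}^{N \times N}$ is built from the cosine similarity of each pair of feature vectors. We call this model graph-based TSK-FS (G-TSK-FS); the frame diagram of TSK-FS is shown in Figure 1. The least squares method is employed to solve the optimization problem of G-TSK-FS.

Computational and Mathematical Methods in Medicine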
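The training procedure described in this section can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian membership uses the form exp(-(x - c)^2 / (2δ)), the FCM routine is a plain textbook version, β is taken to weight the graph Laplacian term and λ the ℓ2 penalty, and the cosine similarity is computed on the original input features. All function names are hypothetical.

```python
import numpy as np

def fcm_memberships(X, K, m=2.0, iters=100, seed=0):
    """Plain fuzzy C-means; returns the N x K membership matrix u_jk."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], K))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1) + 1e-12
        U = d2 ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
    return U

def antecedent_params(X, U, h):
    """Centers c (K x d) and widths delta (K x d) from FCM memberships."""
    weight = U.sum(axis=0)[:, None]
    c = (U.T @ X) / weight
    diff2 = (X[:, None, :] - c[None, :, :]) ** 2          # N x K x d
    delta = h * np.einsum("nk,nkd->kd", U, diff2) / weight
    return c, delta + 1e-12

def fuzzy_map(X, c, delta):
    """Map inputs to x_g: normalized firing strength times (1, x), per rule."""
    log_mu = -(((X[:, None, :] - c[None]) ** 2) / (2.0 * delta[None])).sum(axis=-1)
    log_mu -= log_mu.max(axis=1, keepdims=True)           # numerical stability
    mu = np.exp(log_mu)
    mu_tilde = mu / mu.sum(axis=1, keepdims=True)         # N x K
    Xe = np.hstack([np.ones((X.shape[0], 1)), X])         # N x (d+1)
    return (mu_tilde[:, :, None] * Xe[:, None, :]).reshape(X.shape[0], -1)

def cosine_laplacian(X):
    """L = D - S with S the pairwise cosine similarity of the rows of X."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T
    return np.diag(S.sum(axis=1)) - S

def fit_gtsk(X, y, K=2, h=2.0, lam=2.0 ** -6, beta=2.0 ** -5):
    """Closed-form consequents; beta -> Laplacian term, lam -> l2 term (assumed)."""
    U = fcm_memberships(X, K)
    c, delta = antecedent_params(X, U, h)
    Xg = fuzzy_map(X, c, delta)
    L = cosine_laplacian(X)
    A = Xg.T @ Xg + beta * (Xg.T @ L @ Xg) + lam * np.eye(Xg.shape[1])
    p_g = np.linalg.solve(A, Xg.T @ y)
    return c, delta, p_g

def predict_gtsk(X, model):
    c, delta, p_g = model
    return fuzzy_map(X, c, delta) @ p_g
```

The defaults K = 2, h = 2, λ = 2^-6, and β = 2^-5 mirror the values reported in the parameter selection; on other data they would need to be tuned again. The linear system is solved with `np.linalg.solve` rather than an explicit matrix inverse, which is the usual numerically safer choice.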

Result
In this work, we test G-TSK-FS and other predictors on the dataset. Each model is evaluated with the root mean square error (RMSE) [5,19], R-squared, and adjusted R-squared [20,21]. In addition, Bland-Altman analysis is used to evaluate the agreement between the clinical method and each predictive model.
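For reference, the three regression metrics can be computed as in the following sketch. It uses the standard definitions, with adjusted R-squared equal to 1 - (1 - R²)(n - 1)/(n - p - 1) for n samples and p predictors. The function name is hypothetical, and the number of predictors used in the study is not stated here, so it is left as an argument.

```python
import numpy as np

def regression_metrics(y_true, y_pred, n_features):
    """Return (RMSE, R-squared, adjusted R-squared) using standard definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    rmse = np.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return rmse, r2, adj_r2
```

Adjusted R-squared penalizes models with more predictors, which is why it is always at most R-squared for a given fit.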

Selection of Parameters for the Model.
To obtain the best prediction performance, we use the grid search method to select the parameters of the model. G-TSK-FS has three parameters: K, λ, and β. The ranges of these parameters are set as $K \in \{1, 2, \ldots, 10\}$ and $\lambda, \beta \in \{2^{-10}, 2^{-9}, \ldots, 2^{0}\}$. First, we fix $\beta = 2^{0}$ to search for the best K and λ. The search results are shown in Figure 2. The RMSE reaches its minimum (0.1950) when $K = 2$ and $\lambda = 2^{-6}$. Then, K and λ are fixed at 2 and $2^{-6}$, and β is varied from $2^{-10}$ to $2^{0}$, doubling at each step (Figure 3). Finally, the best RMSE is obtained at $\beta = 2^{-5}$. In addition, the adjustable parameter of the kernel width of the Gaussian membership function is set to $h = 2$.
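The two-stage search described above can be sketched as follows. The callback `evaluate(K, lam, beta)` is hypothetical: it stands for whatever procedure returns the RMSE of a model trained with those parameters (the text does not specify how the RMSE is estimated, for example by cross-validation or on a held-out set). The toy objective at the end is only a stand-in to make the sketch runnable.

```python
import itertools
import math

def two_stage_search(evaluate, Ks, lams, betas, beta_init=1.0):
    """Stage 1: fix beta and search (K, lam). Stage 2: fix (K, lam), search beta."""
    K_best, lam_best = min(
        itertools.product(Ks, lams),
        key=lambda pair: evaluate(pair[0], pair[1], beta_init),
    )
    beta_best = min(betas, key=lambda b: evaluate(K_best, lam_best, b))
    return K_best, lam_best, beta_best, evaluate(K_best, lam_best, beta_best)

# Parameter grids as described in the text.
Ks = list(range(1, 11))                            # K in {1, ..., 10}
powers = [2.0 ** e for e in range(-10, 1)]         # {2^-10, ..., 2^0}

# Toy stand-in for the real RMSE evaluation (assumption for illustration only).
def toy_rmse(K, lam, beta):
    return (K - 2) ** 2 + (math.log2(lam) + 6) ** 2 + (math.log2(beta) + 5) ** 2

best = two_stage_search(toy_rmse, Ks, powers, powers)
print(best)
```

Searching in two stages evaluates 10 × 11 + 11 = 121 settings instead of the 1210 needed for a full three-dimensional grid, at the cost of possibly missing a joint optimum.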

Comparison to Other Predictive Models.
To evaluate the performance of our model, other predictive models are also tested on our dataset: linear regression (LR) [22,23], support vector regression (SVR) [24], an artificial neural network (ANN) [25] based on the back propagation algorithm, and the standard TSK-FS. Table 2 shows the results for RMSE, R-squared, and adjusted R-squared. In general, a smaller RMSE (close to 0) and a larger R-squared and adjusted R-squared (close to 1) indicate better prediction performance. The table shows that our method (G-TSK-FS) obtains the smallest RMSE (0.1578) and the largest R-squared (0.7523) and adjusted R-squared (0.7222). Compared with TSK-FS, G-TSK-FS improves R-squared by 0.0181 and adjusted R-squared by 0.0204. This shows that the model generalizes better after Laplace regularization. Figure 4 shows the distribution of the predicted values (all models) and the true Kt/V. From the 150th to the 160th sample, the predictions of every model fluctuate strongly, which may be caused by noise introduced during data collection.

Bland-Altman Analysis.
The Bland-Altman plot is a useful tool for evaluating the agreement between the predictive methods and the clinical method. Table 3 and Figure 5 show the results of the five models under Bland-Altman analysis. In general, the lower the average difference (closer to 0) and the smaller the error acceptance range (the 95% zone between −1.96 SD and +1.96 SD around the average difference), the better the agreement between the model and the clinical method. From Table 3, G-TSK-FS (−17.9686 to 16.3001) obtains a smaller range of agreement. Figure 5 shows that the errors of LR and ANN are very large for some points, with differences beyond ±50%. For LR, ANN, SVR, TSK-FS, and G-TSK-FS, the ratios of the disagreement interval are all close to 5%, which suggests that the prediction methods are comparable to the clinical method. Generally, when this value is less than 5%, the prediction model can be regarded as equivalent to the clinical method. The results of the evaluation show that G-TSK-FS has the potential to support clinical evaluation of Kt/V at low cost.
Figure 1: The frame diagram of TSK-FS.
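The quantities used in a Bland-Altman analysis can be computed as in the sketch below. It assumes that differences are expressed as a percentage of the mean of the two methods (the reported ranges and the ±50% remark suggest percentage differences, but the exact definition is an assumption), and that the "ratio of disagreement interval" is the fraction of samples falling outside the ±1.96 SD limits. The function name is hypothetical.

```python
import numpy as np

def bland_altman(reference, predicted, percent=True):
    """Return (mean difference, lower limit, upper limit, fraction outside limits)."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    diff = predicted - reference
    if percent:
        # Percentage difference relative to the mean of the two methods (assumed).
        diff = 100.0 * diff / ((predicted + reference) / 2.0)
    bias = diff.mean()
    sd = diff.std(ddof=1)                       # sample standard deviation
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    outside = float(np.mean((diff < lower) | (diff > upper)))
    return bias, lower, upper, outside
```

If the differences were normally distributed, about 5% of the samples would be expected to fall outside the limits, which is the benchmark the ratio is compared against.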

Discussion
The kinetics of urea removal is very complicated [26], and blood is usually drawn to calculate Kt/V. Moreover, strict blood collection procedures must be followed during dialysis. The measurement is affected by many factors, which directly influence the accuracy of the calculated Kt/V value [27]. In our research, we found that adequate dialysis is related to age, gender [28], ultrafiltration [29], dry weight, dialyzer surface area, blood flow [30], DBP, SBP, and heart rate before and after dialysis. This is consistent with a previous study [31] and indicates that these clinical features can be used to assess the adequacy of dialysis.
LR, ANN, and SVR are regression methods that have been widely used in many fields. In our work, the TSK-FS method achieves better results and is more suitable for our task. The results show that the value of Kt/V predicted by G-TSK-FS is close to that of the clinical approach. G-TSK-FS obtains the smallest RMSE (0.1578) and the largest R-squared (0.7523) and adjusted R-squared (0.7222). In addition, the smaller range of agreement (−17.9686 to 16.3001) and the ratio of the disagreement interval (close to 5%) show that it is a potential computational model to replace clinical methods.
Although clinical attention has been paid to the value of Kt/V in patients, few scholars have used G-TSK-FS together with patients' clinical characteristics to predict dialysis adequacy. In the field of precision medicine, more scholars are paying attention to clinical prediction models [32][33][34][35][36]. Assessing the adequacy of dialysis requires repeated blood tests, which increases patient costs. In addition, the results of the adequacy test are affected by many factors, such as the quality of blood sample collection, the time of blood sample submission, and the reliability of test results. Our approach applies machine learning to the clinical characteristics of patients. These data are convenient to collect in clinical practice, are obtained noninvasively, and do not increase the patient's costs, and we use them to calculate Kt/V.

Conclusions
Our method has made some progress in predicting Kt/V. However, we do not take noisy samples or the characteristics of the noise into account. In addition, the number of samples collected is still limited. In future work, we will introduce other machine learning techniques such as sample filtering and feature selection [37,38] to deal with various types of noise. We also plan to further expand the patient sample size.

Data Availability
The data used to support the findings of this study are available from the corresponding authors upon request.

Ethical Approval
This study has been approved by the ethics committee (KY21002).

Consent
Written informed consent has been signed by all participants.

Conflicts of Interest
The authors declare that they do not have any conflict of interest.