Urinary BA Indices as Prognostic Biomarkers for Complications Associated with Liver Diseases

Hepatobiliary diseases and their complications cause the accumulation of toxic bile acids (BA) in the liver, blood, and other tissues, which may exacerbate the underlying condition and lead to an unfavorable prognosis. To develop and validate prognostic biomarkers for the prediction of complications of cholestatic liver disease based on urinary BA indices, liquid chromatography-tandem mass spectrometry was used to analyze urine samples from 257 patients with cholestatic liver diseases during a 7-year follow-up period. The urinary BA profile and non-BA parameters were monitored, and logistic regression models were used to predict the prognosis of hepatobiliary disease-related complications. Urinary BA indices were applied to quantify the composition, metabolism, hydrophilicity, and toxicity of the BA profile. We have developed and validated the bile-acid liver disease complication (BALDC) model, a logistic regression model based on BA indices, to predict the prognosis of cholestatic liver disease complications including ascites. The mixed BA and non-BA model was the most accurate, providing a higher area under the receiver operating characteristic (ROC) curve and a smaller Akaike information criterion (AIC) value than both the non-BA and MELD (model for end-stage liver disease) models. Therefore, the mixed BA and non-BA model could be used to predict the development of ascites in patients diagnosed with liver disease at early stages of intervention. This will help physicians make better decisions when treating hepatobiliary disease-related ascites.


Introduction
Cholestatic liver diseases are a diverse group of hepatobiliary diseases associated with limited bile flow due to a failure of bile flow or an impairment in bile production [1]. Relatively common cholestatic liver diseases include primary biliary cirrhosis (PBC) [2], primary sclerosing cholangitis (PSC) [2], and alcoholic liver diseases [3].
Aspartate transaminase (AST), alanine transaminase (ALT), alkaline phosphatase (ALP), glutamyl transferase (GGT), serum creatinine, protime, and INR (international normalized ratio) are commonly used biomarkers for the diagnosis and prognosis of liver diseases [21-24]. However, these biomarkers are not specific to bile duct or liver injuries and may be related to nonhepatobiliary conditions [21]. Therefore, models with multivariate parameters/markers were developed to predict the prognosis of liver diseases with higher accuracy than individual parameters [25,26].
Models with multivariate parameters, such as the Child-Turcotte-Pugh (CTP) and the Mayo model for end-stage liver disease (MELD) scores, are used to predict survival in patients with hepatobiliary disease-related complications. The CTP score was originally used to determine the risk of shunt surgery for severity of liver disease and its complications, such as GI bleeding and encephalopathy [27,28]. The MELD score was originally used to estimate survival of liver patients undergoing the transjugular intrahepatic portosystemic shunt (TIPS) [29]. The MELD score is currently used to determine patients' eligibility for liver transplantation [30,31]. In addition, the MELD score is used as a predictor of liver disease complications, such as GI bleeding and portal hypertension [27,29]. Even though the CTP and MELD scores are widely used worldwide, they still have several limitations. The ascites and encephalopathy variables in the CTP score are easily affected by extraneous factors [29], while the MELD score performs poorly in evaluating patients with cholestatic liver disease-related complications, such as ascites and encephalopathy [25].
More recently, bile acids (BA) have been considered as potential biomarkers for the prognosis of hepatobiliary diseases [1,32,33]. BA are synthesized in the liver and excreted into bile, which flows to the small intestine via the bile duct [34]. BA have many physiological functions, such as fat absorption and cholesterol elimination [35]. In contrast to their physiological functions, BA also exhibit pathological effects at high concentrations. They are associated with necrotic effects on mitochondria, detergent effects on biological membranes, and cancer-promoting effects [36,37]. There is a plethora of human and animal studies illustrating the link between the accumulation of toxic BA in the liver, blood, and extrahepatic tissues and an unfavorable liver disease prognosis [1,32,38,39].
However, BA have not been widely used in the clinic as biomarkers for liver diseases due to several limitations. Both individual and total BA concentrations have high inter- and intraindividual variability under normal conditions due to several factors, including weight, gender, alcohol consumption, food ingestion, diurnal variation, and medication intake. Therefore, the normal baseline ranges are difficult to establish [40-44].
To address these limitations, we have established the concept of "BA indices." BA indices are ratios calculated from the absolute concentrations of individual BA and their metabolites [1,32,45,46]. BA indices have markedly low inter- and intraindividual variability and are more resistant to the above-mentioned cofactors than absolute BA concentrations. For example, the absolute total and individual BA concentrations increased more than 2-fold in individuals one hour after eating, while BA indices changed less than 10% in the same individuals [32]. Furthermore, we have demonstrated that urinary BA indices outperformed the currently used blood liver enzymes as biomarkers for cholestatic liver diseases [1,32,47]. In addition, we have recently developed a BA-based survival model (the BA score (BAS) model) to predict the prognosis of cholestatic liver diseases [48]. BAS had a higher true-positive and true-negative prediction of 5- and 3-year death and liver transplant than other non-BA models including MELD.
Multivariate markers and models are used to predict the survival of cholestatic liver diseases [49,50]. However, very few studies have addressed the prognosis of cholestatic liver disease-related complications. For example, the CTP score has widely been used in the prognosis of cirrhosis, but it does not provide clear guidance of prognosis for cirrhotic patients with complications [51]. Similarly, the MELD score has extensively been used to prioritize cirrhotic patients awaiting liver transplantation [52], but it does not correlate with cirrhosis-related complications, including encephalopathy and bacterial peritonitis [53]. Therefore, there is a critical need for markers/models to particularly predict complications of liver diseases.
In this study, we have expanded the application of BA indices to predict complications, especially ascites, in patients with liver diseases. The study focuses on developing prognostic models based on BA indices to predict the development of ascites in liver patients.

Materials and Methods
2.1. Study Participants. The study population was described in detail previously [1,32,46,48]. Briefly, patients with hepatobiliary conditions were diagnosed at the University of Nebraska Medical Center (UNMC) Hepatology Clinic (Omaha, NE, USA). Hepatobiliary conditions included chronic hepatitis C (64), chronic hepatitis B (15), Laennec's cirrhosis (105), primary biliary cholangitis (PBC) (12), primary sclerosing cholangitis (PSC) (15), alpha-1-antitrypsin deficiency (5), and cryptogenic cirrhosis (11). The following complications were diagnosed and monitored by the hepatologists: ascites (62), bacterial peritonitis (2), encephalopathy (36), GI bleeding (18), hepatobiliary carcinoma (15), hepatorenal syndrome (1), and portal hypertension (106). Two hundred fifty-seven patients with cholestatic liver diseases between the ages of 19 and 65 years (121 female and 136 male) were recruited into the study and treated at UNMC from November 2011 to December 2018. Thirty-milliliter urine samples were collected from patients on their first and follow-up visits to the hepatology clinic. All urine samples were stored at -80°C until BA analysis by liquid chromatography-tandem mass spectrometry (LC-MS/MS). The study was approved by the institutional review board (IRB) at UNMC, and written informed consent was provided by all participating subjects.

2.5. Calculation of BA Indices. The BA profile in urine was characterized using BA "indices," as we have described previously [1,32,46,48] (Table 1). 12α-OH BA are formed by CYP8B1 in the liver and include DCA, CA, Nor-DCA, and 3-dehydroCA. Therefore, CYP8B1 activity can be measured by the ratio of 12α-OH BA to all remaining BA (non-12α-OH BA). Another marker for CYP8B1 is the ratio of CA to CDCA because CA is formed by the 12α-hydroxylation of CDCA.
In the same way, the ratio of 12α-OH BA (DCA, CA, Nor-DCA, and 3-dehydroCA in all their forms) to non-12α-OH BA (HDCA, CDCA, UDCA, LCA, MDCA, MCA, HCA, 12-oxo-CDCA, 6-oxo-LCA, 7-oxo-LCA, 12-oxo-LCA, isoLCA, and isoDCA in all their forms) was calculated.
BA are primarily metabolized by sulfation, glycine (G), and taurine (T) amidation in the liver. The percentage of sulfation of individual BA was calculated as the ratio of the concentration of sulfated BA, in both the unamidated and amidated forms, to the total concentration of individual BA in all of their forms (unamidated, amidated, unsulfated, and sulfated). The percentage of amidation of individual BA was calculated as the ratio of the concentration of amidated BA, in both the unsulfated and sulfated forms, to the total concentration of individual BA in all of their forms (unamidated, amidated, unsulfated, and sulfated). In addition, percentages of amidation were divided into the percentages of BA existing as taurine (T) or as glycine (G) amidates.
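The index definitions above can be illustrated with a minimal Python sketch. The data layout (keys of BA name, amidation state, and sulfation flag) and all concentration values are hypothetical and chosen only for illustration; they are not study data, and the function names are not taken from the original analysis.

```python
# Hypothetical urinary concentrations (arbitrary units), keyed by
# (bile acid, amidation: None / "G" / "T", sulfated?). Illustrative values only.
conc = {
    ("CA", None, False): 2.0,
    ("CA", "G", False): 3.0,
    ("CA", "T", True): 5.0,
    ("CDCA", None, True): 4.0,
    ("CDCA", "G", True): 6.0,
}

TWELVE_ALPHA_OH = {"DCA", "CA", "Nor-DCA", "3-dehydroCA"}  # as listed in the text


def total(ba, pred=lambda amid, sulf: True):
    """Sum the concentrations of one BA over the forms selected by pred."""
    return sum(c for (name, amid, sulf), c in conc.items()
               if name == ba and pred(amid, sulf))


def pct_sulfation(ba):
    """Sulfated forms (amidated or not) as a percentage of all forms of this BA."""
    return 100.0 * total(ba, lambda amid, sulf: sulf) / total(ba)


def pct_amidation(ba, kind=None):
    """Amidated forms (sulfated or not) as a percentage of all forms.

    kind="G" or kind="T" restricts the numerator to glycine or taurine amidates.
    """
    def hit(amid, sulf):
        return amid is not None and (kind is None or amid == kind)
    return 100.0 * total(ba, hit) / total(ba)


def ratio_12a_oh():
    """Ratio of 12-alpha-OH BA to all remaining (non-12-alpha-OH) BA."""
    num = sum(c for (name, _, _), c in conc.items() if name in TWELVE_ALPHA_OH)
    den = sum(c for (name, _, _), c in conc.items() if name not in TWELVE_ALPHA_OH)
    return num / den
```

Because every index is a ratio of concentrations from the same sample, a uniform scaling of all concentrations (for example, by urine dilution) leaves the indices unchanged, which is consistent with the low variability described for BA indices.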
2.6. Statistical Analysis. To develop prognostic models, logistic regression was used to predict the prognosis of hepatobiliary diseases in terms of developing disease-related complications. Models were constructed to predict (i) various individual complications and (ii) all complications combined (pooled) in the entire liver-patient population as well as in the individual disease subtype populations (patient groups with specific disease subtypes). All statistical analyses were conducted using the Statistical Product and Service Solutions (SPSS) software, version 26 (IBM Corporation, Armonk, NY, USA).
We developed models with six different sets of predictors: (i) BA variables only, (ii) non-BA variables only, (iii) mixed BA and non-BA variables, (iv) the original model for end-stage liver disease (MELD), (v) MELD variables with coefficients from our data set, and (vi) the original MELD modified with BA and/or non-BA variables.
Individual BA and/or non-BA variables were analyzed as possible predictors in a univariate logistic regression analysis. Significant variables (P value < 0.05) were selected from the univariate analysis to include in the multivariate analysis. The backward elimination method was used to avoid multicollinearity and retain the statistically significant variables with retention criteria during the multivariate analysis.

International Journal of Hepatology
The estimated odds ratio (OR) of developing complications by BA and/or non-BA variables was obtained from the final multivariate logistic regression model for all subjects:

$$\mathrm{OR} = \frac{P}{1 - P} = e^{a + b_1 x_1 + \cdots + b_k x_k},$$

where $P$ is the probability of developing complications; $a$ is the estimated intercept; and $b_1, \ldots, b_k$ represent the estimated regression coefficients for the variables $x_1, \ldots, x_k$ [57].
The final multivariate logistic regression model provides the associations between significant BA and/or non-BA variables and the odds of developing complications. We then computed the predicted probability, which transforms the estimated odds of complications to a scale of 0 to 1, using the following equation:

$$P = \frac{e^{a + b_1 x_1 + \cdots + b_k x_k}}{1 + e^{a + b_1 x_1 + \cdots + b_k x_k}}.$$

Goodness-of-fit was assessed using the Hosmer-Lemeshow (HL) test for logistic regression models. This test compares the observed number of individuals to the expected number of individuals in each pattern, which shows how well the model fits the data [57]. In general, the HL test indicates a poor fit if the P value is less than 0.05.
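The step from regression coefficients to a predicted probability can be sketched in a few lines of Python. The intercept and coefficients below are hypothetical placeholders, not the fitted values of any model in this study, and the function names are illustrative.

```python
import math


def odds(intercept, coefs, xs):
    """exp(a + b1*x1 + ... + bk*xk), the quantity the text calls the estimated OR."""
    return math.exp(intercept + sum(b * x for b, x in zip(coefs, xs)))


def probability(intercept, coefs, xs):
    """Map the odds onto the 0-to-1 probability scale: P = OR / (1 + OR)."""
    o = odds(intercept, coefs, xs)
    return o / (1.0 + o)


# Hypothetical coefficients (NOT fitted values) and a hypothetical patient
a = -1.0
b = [0.02, -0.5]
x = [30.0, 1.0]
p = probability(a, b, x)
```

The linear predictor `a + b1*x1 + ... + bk*xk` is the log of the odds, so a score of 0 corresponds to a probability of 0.5 and negative scores to probabilities below 0.5.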
We used the Akaike information criterion (AIC) for model comparisons among logistic regression models with different sets of predictors [58]. Smaller AIC values represent a better goodness-of-fit [59]. The AIC values were calculated by

$$\mathrm{AIC} = 2K - 2\ln(L),$$

where $L$ is the likelihood evaluated at the maximum likelihood estimate and $K$ is the number of parameters in the model [60].
Bootstrapping was used to validate the models. Bootstrapping is a resampling technique used to estimate statistics on a population by sampling a data set with replacement [61]. The parameters included P value, bias, and standard error (SE) [62]. The bootstrap estimate of bias is the difference between the estimate computed using the original sample and the mean of the bootstrap estimates. The SE represents the standard deviation of the estimator and reflects how far our sample estimate deviates from the actual parameter [63]. The range of the regression coefficients (B) was defined as the 95% confidence interval of the bootstrap estimator. The acceptance criterion for P values was set at 0.05.
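The bootstrap quantities described above (bias, SE, and a 95% interval) can be sketched as follows. For brevity this example resamples a simple statistic, the sample mean, as a stand-in for a regression coefficient; the data are hypothetical, the percentile interval is one of several possible interval constructions, and the sign convention for bias is an assumption stated in the code.

```python
import random
import statistics


def bootstrap(data, stat, n_boot=1000, seed=0):
    """Resample data with replacement and summarize the bootstrap distribution of stat."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    original = stat(data)
    reps = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n_boot))
    se = statistics.stdev(reps)  # bootstrap standard error
    return {
        "estimate": original,
        # Assumed sign convention: mean of bootstrap estimates minus original estimate
        "bias": statistics.mean(reps) - original,
        "se": se,
        "rse_pct": 100.0 * se / abs(original),  # relative standard error, in percent
        # Percentile interval from the sorted bootstrap estimates
        "ci95": (reps[int(0.025 * n_boot)], reps[int(0.975 * n_boot) - 1]),
    }


# Hypothetical measurements, not study data
sample = [0.8, 1.1, 0.4, 2.3, 0.9, 1.7, 0.6, 1.2, 3.1, 0.5]
result = bootstrap(sample, statistics.mean)
```

In a model-validation setting, `stat` would refit the regression on each resampled data set and return the coefficient of interest; a skewed predictor or a small sample tends to widen the bootstrap distribution and inflate the relative standard error.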
We also performed receiver operating characteristic (ROC) curve analysis on the scores from the multivariate logistic regression models to determine their optimal cut-off values in differentiating patients with or without ascites. The cut-off values with optimum specificity vs. sensitivity were selected, and the area under the ROC curve (AUC) values were calculated. An AUC of 0.9 or greater is rarely seen, an AUC between 0.8 and 0.9 indicates excellent diagnostic accuracy, and any AUC over 0.7 may be considered clinically useful [54,57,64,65].
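A minimal sketch of the ROC calculations is given below. The AUC is computed as the probability that a randomly chosen patient with ascites has a higher score than a randomly chosen patient without ascites. The text does not name the criterion used to balance sensitivity against specificity, so Youden's J statistic is used here as an assumption. Scores and labels are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Cut-off maximizing Youden's J = sensitivity + specificity - 1.

    A patient is classified as positive when score >= cut-off.
    Youden's J is an assumed criterion, not one stated in the text.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if best is None or j > best[0]:
            best = (j, c)
    return best[1]


# Hypothetical model scores (log-odds scale) and ascites labels (1 = ascites)
scores = [-2.1, -1.5, -1.2, -0.9, -0.4, 0.3]
labels = [0, 0, 0, 1, 0, 1]
```

Lowering the cut-off classifies more patients as positive, which raises sensitivity at the cost of specificity; the selected cut-off is the point where the sum of the two is largest.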
The performance of the different models in predicting the occurrence of complications was compared using statistical outcomes from the HL test, AIC values, bootstrapping, and AUC values.

Results
Table 2 shows a summary of the demographics of the patients who participated in this study. During the 7-year follow-up period, there were 257 patients with cholestatic liver diseases. The development of the following liver disease-related complications was monitored: ascites (62), bacterial peritonitis (2), encephalopathy (36), GI bleeding (18), hepatobiliary carcinoma (15), hepatorenal syndrome (1), jaundice (7), peripheral edema (63), and portal hypertension (106).

Univariate Logistic Regression Analysis for Ascites Prediction in the Entire Liver-Patient Population.
Table 3 shows the results of univariate logistic regression analyses for ascites prediction by BA indices in the entire liver-patient population. The odds ratio (OR) quantifies the magnitude of the risk of developing ascites per one unit as well as [...]

We performed the same univariate logistic regression analysis for demographics and non-BA parameters as well (Table 4). For demographics, gender was the only statistically significant variable (P value < 0.05), with the odds of developing ascites being 1.3-fold higher in males than in females. For non-BA parameters, increasing levels of creatinine, INR, protime, AST, bilirubin, AST/ALT, and MELD significantly increased the odds of developing ascites, whereas decreasing levels of albumin and ALT significantly increased the odds of developing ascites. For example, for every 20% increase in the INR, the odds of developing ascites increased 1.4-fold (OR: 1.391; P value < 0.05). In contrast, for every 20% increase in albumin, the odds of developing ascites decreased to 0.23-fold, i.e., by about 77% (OR: 0.231; P value < 0.05).
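The arithmetic behind reading an odds ratio as a fold change or a percent change can be sketched as follows. The two OR values are the INR and albumin examples quoted above; the compounding over several increments assumes that the effect is multiplicative per increment, which is an assumption of this sketch rather than a statement from the analysis.

```python
def pct_change_in_odds(odds_ratio):
    """Percent change in the odds associated with one increment of the predictor."""
    return (odds_ratio - 1.0) * 100.0


def or_for_steps(odds_ratio, steps):
    """Odds ratio after several increments, assuming a multiplicative effect per increment."""
    return odds_ratio ** steps


print(round(pct_change_in_odds(1.391), 1))  # INR example from the text
print(round(pct_change_in_odds(0.231), 1))  # albumin example from the text
```

An OR above 1 therefore reads as an increase in the odds per increment, and an OR below 1 as a decrease.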

Multivariate Logistic Regression Analysis for Ascites Prediction in the Entire Liver-Patient Population
4.1. The BALDC Model. In multivariate logistic regression analysis, a backward elimination method was used to identify the statistically relevant BA variables from the univariate analysis. The only BA variables retained in the multivariate model were % MDCA and % primary BA, which were independently predictive of developing ascites (Table 5(a)). The estimated odds ratio (OR) of developing ascites as a function of BA variables (BA-d OR) for individual patients was calculated using this equation:

$$\text{BA-d OR} = e^{a + b_1 (\%\,\text{MDCA}) + b_2 (\%\,\text{primary BA})},$$

where $a$, $b_1$, and $b_2$ are the intercept and regression coefficients estimated for the BALDC model (Table 5(a)). The predicted probability (P) of ascites as a function of BALDC (BA-P) variables was then calculated using this equation:

$$\text{BA-P} = \frac{\text{BA-d OR}}{1 + \text{BA-d OR}}.$$

Figure 1(a) shows the probability of developing ascites (BA-P) as predicted by the BALDC score.
For example, for a patient with a % MDCA of 1% and a % primary BA of 30%, the estimated odds ratio (BA-OR) of developing ascites by BA variables is obtained by substituting these values into the BA-OR equation, and the predicted probability of developing ascites (BA-P) by BA variables then follows as BA-OR/(1 + BA-OR).
Furthermore, we tested the effect of the significant demographic variable from the univariate analysis, i.e., gender, on this BALDC multivariate model. Gender was retained in the multivariate analysis but with no to minimal improvement of model validation and comparison criteria including bootstrapping, AIC, and ROC-AUC. Therefore, we did not include gender in the multivariate logistic regression model.

The Non-BA Model.
We performed the same multivariate logistic regression analysis for non-BA parameters as well. Albumin level and MELD were the only significant predictive variables of developing ascites (Table 5(b)). The estimated odds ratio (OR) of developing ascites as a function of non-BA variables (non-BA-d OR) for individual patients was

$$\text{non-BA-d OR} = e^{a + b_1 (\text{albumin}) + b_2 (\text{MELD})},$$

where $a$, $b_1$, and $b_2$ are the intercept and regression coefficients estimated for the non-BA model (Table 5(b)). The predicted probability (P) of developing ascites as a function of non-BA variables (non-BA-P) was calculated using this equation:

$$\text{non-BA-P} = \frac{\text{non-BA-d OR}}{1 + \text{non-BA-d OR}}.$$

Figure 1(c) shows the probability of developing ascites as predicted by the mixed BA and non-BA score.

The Original MELD Model.
We also performed the same multivariate logistic regression analysis for the MELD parameter (Table 5). The predicted probability (P) of developing ascites as a function of original MELD variables was calculated using this equation:

$$P = \frac{\mathrm{OR}}{1 + \mathrm{OR}},$$

where OR is the odds ratio estimated from the original MELD model. Figure 1(d) shows the probability of developing ascites as predicted by the original MELD score.

Other Hybrid Models.
In addition, we used the same methodology to develop other models (Supplementary Table S1), including (i) MELD variables with coefficients from our data set, that is, a model with the original MELD variables but with model coefficients derived from our data set, and (ii) the original MELD modified with BA and/or non-BA variables. In the first of these models, creatinine and INR variables from the original MELD were not statistically significant.

For the BALDC, non-BA, and mixed BA and non-BA models, the HL test showed that the observed and expected values were not significantly different (P value > 0.05), indicating that the logistic regression of these models fit the data well. In contrast, for the original MELD model, the P value of the HL test was 0.029 (P value < 0.05), indicating that the logistic regression of the original MELD model did not fit the data well (Table 6).

Table 6 also shows the Akaike information criterion (AIC) for ascites prediction. AIC values were used to compare models with different error distributions. The AIC values for the BALDC, non-BA, mixed BA and non-BA, and original MELD models were 223.56, 170.81, 167.3, and 180.45, respectively. The BALDC model had a larger AIC value than the non-BA, mixed BA and non-BA, and original MELD models. This indicates that the logistic regression of the BALDC model did not fit the data as well as the other candidate models.

Table 7 describes the bootstrapping validation for ascites prediction. Bootstrapping validation results for all four models indicated that the regression coefficients (B) were within their 95% confidence intervals, and P values were statistically significant for all covariates (P value < 0.05). Bias values were relatively small (-0.056 to 0.016), which means the estimates calculated using the original sample and the mean of the bootstrap estimates were not significantly different. In contrast, standard error (SE) and relative standard error (RSE) (0.02% to 296.3%) values of the bootstrapping analysis were relatively high, which may indicate that our sample estimates deviate from the actual parameters (Supplementary Figure S1).
Figure 2 shows the receiver operating characteristic (ROC) curves of all four models for ascites prediction. The area under the ROC curve for the BALDC, non-BA, mixed BA and non-BA, and original MELD was 0.81, 0.87, 0.88, and 0.86, respectively.
We also calculated the sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV) from the ROC analysis (Table 6). For instance, in the BALDC model, the sensitivity and specificity were 33.90% and 88.30%, and the positive and negative predictive values were 48.80% and 80.20%, respectively.
Potential cut-off values of all four model scores to best differentiate patients with vs. without ascites were selected based on the optimum sensitivity vs. specificity from the ROC analysis. The ROC-optimum cut-off values for the BALDC, non-BA, mixed BA and non-BA, and original MELD models for ascites prediction were -0.99, -1.18, -1.06, and -1.09, respectively (Table 6).
Moreover, we tested whether patient populations with scores below vs. above these optimum cut-off values can be distinguished using ROC analysis. The P value of the AUCs was used to find statistically significant differences between the low- vs. high-score populations (Figure 3 and Table 8). The null hypothesis for the P value of the AUCs was AUC = 0.5.

Prediction for Other Complications.
We also followed the same approach to predict other complications of liver diseases including bacterial peritonitis, encephalopathy, GI bleeding, hepatobiliary carcinoma, hepatorenal syndrome, and portal hypertension. Supplementary Table S2 shows the ROC analyses, P values of the bootstrapping, HL tests, and AICs for the BALDC models. Supplementary Tables S3-S5 show similar results for the non-BA, mixed BA and non-BA, and original MELD models.

Discussion
In this study, we have examined the ability of BA indices to predict complications in patients with liver diseases. Logistic regression was used to predict the prognosis of hepatobiliary diseases in terms of developing disease-related complications. In addition to the BALDC model, we developed (i) a non-BA model; (ii) a mixed BA and non-BA model, to compare with the BA-only and non-BA-only models; (iii) a model with the original MELD variables but with coefficients derived from our data set; and (iv) the original MELD modified with BA and/or non-BA variables, to test whether the performance of the original MELD can be improved by adding significant BA and non-BA parameters from the univariate analysis. First, individual BA and non-BA variables were analyzed as possible predictors of developing ascites in a univariate logistic regression analysis. Then, multivariate models were built using backward elimination regression, where only the most significant variables from the univariate regression were retained.
The final multivariate logistic regression models were then validated using the bootstrapping method. Goodness-of-fit criteria also included the HL test, the AIC, and multivariate parameters from the receiver operating characteristic analyses.
For demographics, univariate logistic regression analysis showed that the odds of having ascites were significantly higher (1.3-fold) in males than in females. For non-BA parameters, creatinine, INR, protime, AST, bilirubin, AST/ALT, and MELD increased the odds of having ascites, whereas albumin and ALT decreased the odds of having ascites (Table 4). Using multivariate logistic regression analysis, we have constructed the final models for ascites prediction. Gender was the only significant demographic variable in univariate logistic regression analysis for all models (Table 4). However, it was not included in these models because it resulted in no to minimal improvement of model validation criteria including bootstrapping, AIC, and ROC-AUC.
Cholestatic diseases are associated with impaired bile flow to the intestine, which is expected to translate into reduced transformation of primary BA into secondary BA by intestinal bacteria. Therefore, an accumulation of primary BA and a decrease in secondary BA in the blood may indicate further impairment in bile flow and existing liver disease [1,66-69]. This was in agreement with the BALDC model, where increasing % primary BA and decreasing % MDCA (a secondary BA) were the final significant predictors of liver disease prognosis. Furthermore, we have previously demonstrated survival model development for death prediction using Cox regression analyses. Similar results were seen in that BA model, where increased % CDCA and % Tri-OH BA (both are primary BA) were the significant predictors of liver disease progression to death.
As shown in Figure 1, the probability of developing ascites increased as a function of the BALDC, non-BA, mixed BA and non-BA, and original MELD scores. In general, logistic regression analysis produces an S-shaped curve when the predicted probability is plotted against the explanatory score [70]. All four models produced such S-shaped curves except for the BALDC score. This is expected in the absence of extreme values of BALDC scores in our data set. However, with more subject enrollment in the future, more extreme BALDC score values, and therefore S-shaped curves, are expected.
The Hosmer-Lemeshow (HL) test was one of the criteria used to evaluate the goodness-of-fit of the logistic regression models. The HL test results supported the validity of the BALDC, non-BA, and mixed BA and non-BA models (P value > 0.05), but not the original MELD model (Table 6). The original MELD model was the only model with a P value < 0.05, which indicates that the expected and observed results were significantly different. As an alternative, we considered a probit regression analysis to model the original MELD (data not shown). Based on our findings, the MELD with the probit model showed a better performance compared to the logistic regression model; however, the probit model did not fit the BA and non-BA models well. Therefore, we used the logistic regression model for all analyses.
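The comparison of observed and expected events that underlies the HL test can be sketched as follows. This minimal Python example computes only the chi-square statistic over risk-ordered groups; the grouping into equal-sized groups and the default of 10 groups are assumptions of the sketch, and the P value (conventionally taken from a chi-square distribution with the number of groups minus 2 degrees of freedom) is not computed here. Predicted probabilities are assumed to lie strictly between 0 and 1.

```python
def hosmer_lemeshow_statistic(y, p, groups=10):
    """Chi-square statistic comparing observed and expected events across risk groups.

    y: observed 0/1 outcomes; p: predicted probabilities, strictly between 0 and 1.
    """
    pairs = sorted(zip(p, y))  # order patients by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)  # expected number of events
        observed = sum(yi for _, yi in chunk)  # observed number of events
        # Contributions from events and from non-events in this group
        stat += (observed - expected) ** 2 / expected
        stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat
```

A statistic near zero means that observed and expected counts agree in every risk group; large values correspond to small P values and therefore to a poor fit.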
We also used the Akaike information criterion (AIC) to compare the estimated out-of-sample prediction error of the multivariate logistic regression models. Smaller AIC values represent a better goodness-of-fit in model performance [59]. The AIC values of the BALDC, non-BA, and original MELD models were 233.56, 170.81, and 180.45, respectively, which were higher than the AIC value of the mixed BA and non-BA model (167.3) (Table 6).
Models were validated using the bootstrapping method (Table 7). Bootstrapping is a resampling technique used to estimate statistics on a population by sampling a data set with replacement [61]. Random samples were taken one at a time, with replacement, from our data set to create a series of 1000 new data sets. Statistics were calculated by comparing these data sets. In the BALDC model, the relative standard error was relatively large because the model parameter (% MDCA) has a high relative standard error (Supplementary Figure S1). This could be due to the fact that % MDCA was not normally distributed in the original data set and because the sample size was relatively small [71]. Despite the high relative standard error, the BALDC model could be considered to pass the bootstrapping validation given the relatively small sample size of our study. Overall, the bootstrapping validation results supported the validity of the BALDC, non-BA, mixed BA and non-BA, and original MELD models for ascites prediction.
ROC analysis was used to compare the models for their accuracy in predicting the progression of liver patients to complications such as ascites. The higher the area under the ROC curve, the greater the overall accuracy of the marker in distinguishing between groups. For prognostic models, an AUC of 0.9 or greater is rarely seen, an AUC between 0.8 and 0.9 indicates excellent accuracy, and any AUC over 0.7 may be considered clinically useful [72-74]. Therefore, all four models show high accuracy for ascites prediction.
ROC analysis was also performed to test sensitivity, specificity, and positive and negative predictive values (Table 6). The sensitivity is the proportion of true positive patients (patients who were predicted to have ascites and actually did have ascites) to the actual positive patient population (total number of patients who actually did have ascites). The specificity is the proportion of true negative patients (patients who were predicted not to have ascites and actually did not have ascites) to the actual negative patient population (total number of patients who actually did not have ascites). The positive predictive value is the proportion of true positive patients to the total number of predicted positive patients. The negative predictive value is the proportion of true negative patients to the total number of predicted negative patients. High sensitivity and specificity correspond to high positive and negative predictive values, and vice versa. Predictive values are more commonly used than sensitivity and specificity in clinical studies [70]. Higher positive and negative predictive values are preferred when comparing model performance. Based on that, we compared positive and negative predictive values for all four models. The non-BA model has higher positive and negative predictive values than the other models. In addition, the mixed BA and non-BA model has high predictive values close to those of the non-BA model. Therefore, both the non-BA and the mixed BA and non-BA models show better model performance than the others.
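The four definitions above map directly onto the cells of a 2x2 classification table, as the following minimal Python sketch shows. The counts are hypothetical and do not correspond to any model in Table 6.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from the counts of a 2x2 table.

    tp/fn: patients with ascites predicted positive/negative;
    tn/fp: patients without ascites predicted negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives / all actual positives
        "specificity": tn / (tn + fp),  # true negatives / all actual negatives
        "ppv": tp / (tp + fp),          # true positives / all predicted positives
        "npv": tn / (tn + fn),          # true negatives / all predicted negatives
    }


# Hypothetical counts, not study data
m = diagnostic_metrics(tp=20, fp=10, tn=90, fn=30)
```

Unlike sensitivity and specificity, the predictive values depend on how common the complication is in the population, which is one reason they are reported alongside the ROC results.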
Moreover, ROC analysis was used to determine potential cut-off values which quantify the normal range of biomarkers. The selection of optimum cut-off values is a tradeoff between sensitivity vs. specificity, where lower cut-off values are associated with higher sensitivity but lower specificity, and vice versa. Scores for the BALDC, non-BA, mixed BA and non-BA, and original MELD models were identified as cut-off values with optimum sensitivity vs. specificity, which were -0.99, -1.18, -1.06, and -1.09, respectively (Table 6). For example, a BALDC score of -0.99 was considered an optimum cut-off value in differentiating patients with vs. without ascites because it maintained a balance between sensitivity (74%) vs. specificity (74%).
These ROC-optimum cut-off values were used to split the overall patient population into two populations for every model. One population contained patients with model scores higher than the cut-off score, and the other contained patients with model scores lower than the cut-off score. The P value of the AUCs from the two populations for every model was then used to find statistically significant differences (Figure 3 and Table 8). A P value smaller than 0.05 leads to the rejection of the null hypothesis, indicating that the AUC is above the reference line (AUC = 0.5), and vice versa. Only the ROC-optimum cut-off for the BALDC score (-0.99) resulted in statistically significantly different AUCs based on their P values; therefore, it was able to distinguish high- vs. low-score patient populations.
In addition to ascites, we attempted to develop similar models for the prediction of other common liver disease complications including bacterial peritonitis, encephalopathy, GI bleeding, hepatobiliary carcinoma, hepatorenal syndrome, and portal hypertension (Supplementary Tables S3-S5). None of these complications were as accurately predicted as ascites by any of the BALDC and non-BA models. In general, models for the prediction of other complications had lower sensitivity, lower specificity, lower AUC values, and higher AIC values. This could be due to the fact that other complications were less common than ascites (except for portal hypertension) in our study. Overall, improving prediction accuracy for these other complications would require a larger study population.

Conclusions
We have developed and validated a prognostic model based on BA indices to predict the development of liver disease complications such as ascites. Other models, including non-BA, mixed BA and non-BA, and original MELD models, were also developed to compare their performance with our BALDC model. Overall, the mixed BA and non-BA model was the most accurate based on AIC and ROC analyses. The mixed BA and non-BA model had the lowest AIC value, indicating a smaller error of distribution and a better trade-off between goodness-of-fit and degrees of freedom (Table 6). Moreover, the mixed BA and non-BA model had the highest AUC value, indicating higher accuracy than the other models (Figure 2). Therefore, the mixed BA and non-BA model could be used to predict the development of ascites in patients diagnosed with liver disease at early stages of intervention, such as liver transplantation. This will assist in supply allocation and physician decisions when treating liver diseases.

Supplementary Materials
Supplementary Figure S1: histograms for the variables of the BALDC, non-BA, mixed BA and non-BA, and original MELD models. Std. Dev represents the standard deviation of each variable. N is the population of each variable. SE is the standard error of each variable. RSE represents the relative standard error of each variable. Supplementary Table S2: prediction of other liver disease complications using BALDC models. Supplementary Table S3: prediction of other liver disease complications using non-BA models. Supplementary Table S4: prediction of other liver disease complications using mixed BA and non-BA models. Supplementary Table S5: prediction of other liver disease complications using original MELD models.