The Impact of Positive Fluid Balance on Sepsis Subtypes: A Causal Inference Study

Introduction. Sepsis is a leading cause of death in hospitalized patients globally. This study used causal inference to examine the varying effects of positive fluid balance on sepsis subtypes. Methods. Data were drawn from the eICU database, and 35 features were extracted for patients with sepsis. Fluid balance during the ICU stay was the treatment, and ICU mortality was the primary outcome. Data preprocessing was performed to satisfy the linearity assumptions of logistic regression. The effect of binarized positive fluid balance on mortality was estimated using logistic regression in DoWhy, while continuous fluid balance was analyzed with a random forest T-learner. The average treatment effect (ATE) served as the primary metric. Results. Septic patients with higher fluid balance had worse mortality outcomes, with an ATE of 0.042 (95% CI: (0.034, 0.047)) using logistic regression and an ATE of 0.0340 (95% CI: (0.028, 0.040)) using the T-learner. In the pulmonary sepsis subtype, higher mortality was associated with increased fluid balance, with an ATE of 0.047 (95% CI: (0.037, 0.055)) using logistic regression and an ATE of 0.28 (95% CI: (0.22, 0.34)) with the T-learner. Conversely, urinary sepsis patients had improved mortality with higher fluid balance, with an ATE of −0.135 (95% CI: (−0.024, −0.0035)) using logistic regression and an ATE of −0.28 (95% CI: (−0.34, −0.22)) with the T-learner. Conclusion. Our research implies that the impact of fluid balance on ICU mortality differs among sepsis subtypes. Positive fluid balance raises mortality in sepsis overall and in pulmonary sepsis but may be protective in urinary sepsis. Further trials are needed to confirm these findings.


Introduction
Sepsis is a dysregulated host immune response to active infection that can lead to end-organ damage. It is a leading cause of death in hospitalized patients, with an estimated 28.9 million cases and 6.3 million deaths worldwide each year [1]. Although sepsis has been thoroughly studied, few novel treatments have been developed over the past few decades. Sepsis is a complex and multifaceted syndrome, with variations in the causative pathogen, site of infection, and degree of dysregulated host response [2]. Studying a heterogeneous disorder such as sepsis under the assumption of uniformity reduces the probability of identifying treatments that may be promising for larger subtypes [3]. This may explain why many therapies with biological plausibility do not translate into clinical results. The challenges of designing sepsis trials may be attributed to this heterogeneity, underscoring the need to identify specific subtypes or "phenotypes" of the disease that may benefit from targeted treatments. Identifying these sepsis phenotypes may improve our understanding of the underlying mechanisms of the disease and enhance the design of clinical trials to achieve more successful treatment outcomes. Recent studies using unsupervised machine learning have shown promising results in characterizing sepsis phenotypes, offering hope for more effective sepsis trial design and treatment [4].
The goal of fluid resuscitation in sepsis treatment is to restore blood pressure and improve organ perfusion [5,6]. However, recent studies have shown that excessive or prolonged fluid administration can increase mortality [7,8]. Nevertheless, it is plausible that some sepsis subtypes may benefit from positive fluid balance, while others may be harmed. Fluid administration based on sepsis subtypes has not yet been investigated. Figure 1 illustrates the current method for studying sepsis as a monolithic entity. However, we propose that larger subtypes should be studied separately, given the heterogeneity of sepsis, with the hope of providing individualized and precision medicine.
Randomized controlled trials remain the gold standard for estimating treatment effects because randomization limits confounding. However, they are resource-intensive and expensive and can have methodological problems. Therefore, there has been increased interest in statistical and machine-learning methods for estimating causal effects, collectively called causal inference [9,10]. Statistical causal inference determines the causal effect of a treatment or exposure on an outcome by using statistical methods. One example is inverse probability of treatment weighting (IPTW). IPTW uses the probability of receiving treatment to balance baseline characteristics between the treated and untreated groups, thereby allowing for a more accurate estimate of the causal effect of the treatment on the outcome [11]. Using causal inference, we aimed to retrospectively discern the varying effects of positive fluid balance on sepsis subtypes. We employed a simple mental model that separates sepsis subtypes according to the site of infection. We hypothesized that positive fluid balance would negatively impact sepsis mortality.

Ethics Statement.
This study analyzed a publicly available anonymized database with approval from a preexisting institutional review board.

Sample Selection.
The eICU Collaborative Research Database is a multicenter intensive care unit (ICU) database with data from over 200,000 ICU admissions monitored by eICU programs [12]. The eICU database comprises 200,859 patient unit encounters for 139,367 unique patients admitted between 2014 and 2015 to 208 hospitals located throughout the US. Adult patients in eICU were included (Figure 2). In addition, patients diagnosed with urinary or pulmonary sepsis were queried from the dataset. Thirty-five features were extracted from the eICU database. The primary outcome was ICU mortality, whereas the primary treatment was the total fluid balance in the ICU.

Figure 1: The current approach to studying sepsis treatment, which diminishes the heterogeneity of the syndrome. The second image provides a more nuanced approach as it separates large subtypes within sepsis and may allow for more individualized treatment discovery.

Critical Care Research and Practice

Features with more than 40 percent missing values were dropped, and scaling and transformations were applied to meet the linearity assumptions of logistic regression [1]. Based on an initial visual analysis of outliers in the fluid balance feature using violin plots, we excluded patients with a fluid balance greater than 15 liters or less than −15 liters. The Sklearn robust scaler was used [13]. The data remained highly skewed; therefore, a further transformation using the Yeo-Johnson transformer was applied [13]. Outliers were removed until the skewness of every feature lay between −0.5 and 0.5. Three sepsis groups were compared in this analysis: the entire sepsis cohort was compared with the pulmonary and urinary sepsis groups. Our primary analysis method was logistic regression in the DoWhy library, which requires binarization of the treatment feature. The net fluid balance of each sepsis subtype was binarized using the Sklearn binarizer, with the mean of the feature (after the initial transformation toward a normal distribution) as the cutoff. Using logistic regression as the primary model and AUC as the metric, recursive feature selection was performed using the Feature-engine library.
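The preprocessing chain described above (outlier window, robust scaling, Yeo-Johnson transformation, binarization at the mean) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the variable names, the gamma-distributed stand-in for fluid balance, and the `skewness` helper are assumptions made for the example.

```python
import numpy as np
from sklearn.preprocessing import Binarizer, PowerTransformer, RobustScaler


def skewness(values):
    """Sample skewness (third standardized moment); hypothetical helper."""
    v = np.asarray(values, dtype=float).ravel()
    return float(np.mean((v - v.mean()) ** 3) / v.std() ** 3)


rng = np.random.default_rng(0)

# Synthetic, right-skewed stand-in for net ICU fluid balance in liters (NOT eICU data).
fluid_l = rng.gamma(shape=2.0, scale=2.0, size=5000) - 2.0

# 1) Drop extreme values outside the +/-15 L window described in the text.
fluid_l = fluid_l[(fluid_l >= -15.0) & (fluid_l <= 15.0)]
X = fluid_l.reshape(-1, 1)  # scikit-learn transformers expect a 2-D array

# 2) Robust scaling: center on the median and scale by the interquartile range.
X_scaled = RobustScaler().fit_transform(X)

# 3) Yeo-Johnson power transform to pull the distribution toward symmetry.
X_yj = PowerTransformer(method="yeo-johnson").fit_transform(X_scaled)

# 4) Binarize at the mean of the transformed feature: 1 = above-mean fluid balance.
cutoff = float(X_yj.mean())
treated = Binarizer(threshold=cutoff).fit_transform(X_yj).ravel().astype(int)

print(f"skew before: {skewness(X):.2f}, after: {skewness(X_yj):.2f}")
print(f"share labelled as positive fluid balance: {treated.mean():.2f}")
```

Binarizing at the mean only splits the cohort sensibly once the feature is roughly symmetric, which is why the skew check precedes the cutoff in this sketch.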
We utilized the DoWhy library in Python to perform causal inference on the data in four steps: model, identify, estimate, and refute [14]. Our primary metric for evaluating the degree of causality was the average treatment effect (ATE). In the model step, we drew on the domain expertise of four intensivists to identify confounders and effect modifiers and to construct a formal causal model; nine features were identified as effect modifiers and confounders in our directed acyclic graph (Figure 3). Second, the causal estimand was identified, which in our analysis was given by the back-door criterion [9]. Third, the causal effect was estimated via inverse propensity weighting with logistic regression or via a metalearner with a random forest model [15,16]. Finally, the estimate was challenged with two refutation methods, adding a random common cause and re-estimating on a subset of the data, on the assumption that the ATE should not vary significantly from the original value if our results are valid.
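For reference, the two quantities named in these steps can be written out in standard notation. The symbols are assumptions introduced here rather than taken from the study: $T$ is the binarized fluid balance, $Y$ is ICU mortality, $Y(t)$ is the potential outcome under treatment level $t$, and $Z$ is a set of covariates satisfying the back-door criterion.

```latex
\begin{align}
\mathrm{ATE} &= \mathbb{E}\bigl[\,Y(1) - Y(0)\,\bigr] \\
P\bigl(Y \mid do(T = t)\bigr) &= \sum_{z} P\bigl(Y \mid T = t,\, Z = z\bigr)\, P(Z = z)
\end{align}
```

With mortality coded as 0/1, a positive ATE corresponds to a higher probability of death under positive fluid balance, and a negative ATE to a lower one.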
Our primary model for inferring the ATE of the binarized treatment was inverse probability of treatment weighting (IPTW) with logistic regression. IPTW is a method used to estimate the causal effect of a binary treatment on an outcome variable. It involves weighting each individual in the sample by the inverse of their probability of receiving the treatment they actually received and then estimating the treatment effect using a weighted regression model.
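The weighting idea can be illustrated with a short sketch on simulated data. It is not the DoWhy estimator used in the study: the function `iptw_ate`, the single confounder `severity`, and all simulation coefficients are hypothetical. The difference of weighted means computed here equals the treatment coefficient of a weighted regression of the outcome on treatment alone.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def iptw_ate(X, t, y, clip=0.01):
    """ATE via inverse probability of treatment weighting (normalized weights)."""
    # Propensity score: estimated probability of treatment given the covariates.
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    ps = np.clip(ps, clip, 1.0 - clip)  # guard against extreme weights
    w_treated = t / ps                  # weight 1/ps for treated patients
    w_control = (1 - t) / (1.0 - ps)    # weight 1/(1-ps) for controls
    mean_treated = np.sum(w_treated * y) / np.sum(w_treated)
    mean_control = np.sum(w_control * y) / np.sum(w_control)
    return float(mean_treated - mean_control)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Simulated cohort (assumption): sicker patients get more fluid AND die more often.
rng = np.random.default_rng(42)
n = 20_000
severity = rng.normal(size=n)               # hypothetical confounder
X = severity.reshape(-1, 1)
t = rng.binomial(1, sigmoid(severity))      # treatment depends on severity
p0 = sigmoid(-2.2 + 0.8 * severity)         # death risk if untreated
p1 = sigmoid(-1.7 + 0.8 * severity)         # death risk if treated
y = rng.binomial(1, np.where(t == 1, p1, p0))

true_ate = float(np.mean(p1 - p0))          # known only because the data are simulated
naive = float(y[t == 1].mean() - y[t == 0].mean())
ate_hat = iptw_ate(X, t, y)
print(f"true ATE {true_ate:.3f} | naive difference {naive:.3f} | IPTW {ate_hat:.3f}")
```

In this simulation the unadjusted difference overstates the harm because severity drives both treatment and death; reweighting by the propensity score removes most of that confounding.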
We further validated our findings by analyzing the continuous treatment values of positive fluid balance using a machine-learning method called the T-learner from the EconML library [17]. A T-learner fits a separate response model for each treatment group and combines their predictions to calculate the causal effect of treatment on outcomes. It can be implemented using various machine-learning algorithms, including the random forest model. Random forest is a machine-learning algorithm that builds a collection of decision trees and aggregates their predictions. One of the strengths of this method is its ability to handle both continuous and binary treatments. Considering the robustness of tree-based models to outliers and feature scales, minimal preprocessing of the data was performed for the T-learner analysis beyond the initial outlier removal by visual inspection.
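A hand-rolled sketch of the T-learner idea with random forests is shown below. It makes two simplifying assumptions that differ from the study: the treatment is binary rather than continuous, and the models are fitted directly with scikit-learn rather than through EconML. The function name, covariates, and simulation coefficients are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def t_learner_ate(X, t, y, seed=0):
    """T-learner: one outcome model per treatment arm, then average the difference."""
    params = dict(n_estimators=100, min_samples_leaf=50, random_state=seed)
    model_treated = RandomForestClassifier(**params).fit(X[t == 1], y[t == 1])
    model_control = RandomForestClassifier(**params).fit(X[t == 0], y[t == 0])
    # Predicted death risk for every patient under each arm, then the difference.
    cate = model_treated.predict_proba(X)[:, 1] - model_control.predict_proba(X)[:, 1]
    return float(cate.mean()), cate


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Simulated cohort (assumption): two covariates confound treatment and mortality.
rng = np.random.default_rng(7)
n = 10_000
X = rng.normal(size=(n, 2))
risk_score = 0.6 * X[:, 0] + 0.4 * X[:, 1]
t = rng.binomial(1, sigmoid(risk_score))
p0 = sigmoid(-2.0 + risk_score)      # death risk if untreated
p1 = sigmoid(-1.0 + risk_score)      # death risk if treated
y = rng.binomial(1, np.where(t == 1, p1, p0))

true_ate = float(np.mean(p1 - p0))   # known only because the data are simulated
ate_hat, cate = t_learner_ate(X, t, y)
print(f"true ATE {true_ate:.3f} | T-learner ATE {ate_hat:.3f}")
```

The per-patient differences in `cate` are what make metalearners attractive for subtype questions, since they can be averaged within any subgroup of interest.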

Results
This study aimed to evaluate the causal effect of fluid balance on ICU mortality in a sample of patients admitted to the ICU. Summary statistics stratified by mortality are presented in Table 1, which shows a statistically significant difference in all features between surviving and expired patients. The mortality outcome was markedly imbalanced, with only ten percent of the patients having died. Expired patients had a higher mean fluid balance than the surviving cohort.
To estimate the causal effect of fluid balance on mortality, we utilized two models: an IPTW logistic regression model with binarized treatment and a random forest T-learner model with continuous treatment. The ATE was calculated for each sepsis subtype (Table 2). The logistic regression analysis with binarized treatment showed that sepsis overall had worse mortality outcomes with positive fluid balance, with an ATE of 0.042 (95% CI: (0.034, 0.047)). In addition, the T-learner model that utilized continuous fluid balance values demonstrated an ATE of 0.034 (95% CI: (0.028, 0.040)). This suggests that patients with sepsis who had a higher fluid balance had worse mortality outcomes.
When looking at specific subtypes of sepsis, pulmonary sepsis had a worse outcome than the total sepsis group, with an ATE of 0.047 (95% CI: (0.037, 0.055)) in the IPTW model.
Furthermore, we performed a refutation analysis using the DoWhy library. We chose two refutation methods provided by DoWhy: adding a random common cause (a random confounder) and re-estimating on only a subset of the data. Refutation analysis in causal inference adds robustness to the results by challenging the validity of the causal assumptions. In doing so, it assesses the sensitivity of the causal effect estimates to the choice of variables and data. If the estimates are robust to these challenges, confidence in the validity of the causal inference results is increased. Our refutation findings, summarized in Table 3, demonstrated minimal differences in ATE for both methods of refutation.
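The logic of the two refutation checks can be mimicked by hand, as in the sketch below. This is an analogue written for illustration, not DoWhy's implementation: the helper names, the 80 percent subset fraction, and the simulated cohort are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def iptw_ate(X, t, y):
    """ATE via normalized inverse probability of treatment weights."""
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)
    w1, w0 = t / ps, (1 - t) / (1 - ps)
    return float(np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0))


def refute_random_common_cause(X, t, y, rng):
    """Re-estimate after adding a pure-noise covariate as a fake confounder."""
    noise = rng.normal(size=(X.shape[0], 1))
    return iptw_ate(np.hstack([X, noise]), t, y)


def refute_data_subset(X, t, y, rng, fraction=0.8):
    """Re-estimate on a random subset of the rows (sampled without replacement)."""
    idx = rng.choice(X.shape[0], size=int(fraction * X.shape[0]), replace=False)
    return iptw_ate(X[idx], t[idx], y[idx])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Simulated cohort (assumption), with one confounder driving treatment and death.
rng = np.random.default_rng(123)
n = 20_000
severity = rng.normal(size=n)
X = severity.reshape(-1, 1)
t = rng.binomial(1, sigmoid(severity))
y = rng.binomial(1, sigmoid(-2.2 + 0.5 * t + 0.8 * severity))

original = iptw_ate(X, t, y)
with_noise = refute_random_common_cause(X, t, y, rng)
on_subset = refute_data_subset(X, t, y, rng)
print(f"original {original:.3f} | random common cause {with_noise:.3f} | subset {on_subset:.3f}")
```

If either re-estimate moved far from the original value, that would suggest the estimate is sensitive to the adjustment set or to the particular sample, which is the failure these checks are meant to expose.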

Discussion
Our study shows that the ATE of positive fluid balance does vary among sepsis subtypes. We used causal inference techniques to discern the effect of positive fluid balance on sepsis subtypes. The results indicate that positive fluid balance negatively impacts sepsis overall, but there was significant heterogeneity within the subtypes. The outcomes of pulmonary sepsis were significantly worsened by positive fluid balance, whereas those of urinary sepsis improved. We hypothesized that a higher fluid balance would have negative effects in pulmonary sepsis, and we expected negative effects in urinary sepsis as well, which was not the case. This relationship was consistent in both our binarized-treatment IPTW model and the continuous-treatment T-learner model.
Sepsis can take many different forms and is characterized by marked heterogeneity. One approach to understanding this heterogeneity is to classify patients into different phenotypes or subtypes based on their clinical characteristics and response to treatment [4]. Sepsis phenotypes can be defined by various factors, including the underlying source of the infection, the presence of organ dysfunction, and the degree of inflammation, to name a few. Prior studies have even phenotyped sepsis based on genomics [18]. Our study employed a simpler but likely practical phenotyping of sepsis by the site of infection. One possible mechanism for the negative effect of positive fluid balance in sepsis and pulmonary sepsis is that it may lead to fluid overload in the setting of known sepsis-related glycocalyx damage [19]. This is empirically supported by prior research, which suggests that fluid overload is associated with increased mortality in critically ill patients with sepsis [20-22]. Pulmonary sepsis appears to be especially sensitive to positive fluid balance, as demonstrated in prior studies [23]. Seethala et al. reported that patients with pneumonia as the primary site of sepsis had an odds ratio of 2.31 for progression to acute respiratory distress syndrome (ARDS) in the setting of positive fluid balance, and those who progressed to ARDS had higher mortality [24]. Patients with pulmonary sepsis may be at an increased risk of progression to ARDS with positive fluid balance because increased capillary permeability leads to pulmonary edema. It is unclear why positive fluid balance appeared to have a protective effect on urinary sepsis in our study, and this finding needs to be validated prospectively.
One of the key strengths of this study is the use of causal inference techniques, specifically IPTW and T-learner models, to estimate the causal effect of fluid balance on ICU mortality in sepsis subtypes. We employed multiple models with both binary and continuous treatment values, which we believe adds credibility to our assertions. Compared to traditional observational methods, this approach provides a more robust and accurate estimate of causal effects. Our study also employed refutation analysis, which is a robust way to test causal assumptions, adding further confidence in the validity of the results. In addition, although our study was retrospective, the sample size was relatively large. We believe that the simplicity of our phenotyping method is a strength, as it provides a simple mental model that can potentially be applied with relative ease. Our study adds to the current evidence by demonstrating that the effect of positive fluid balance on sepsis outcomes may vary depending on the subtype of sepsis, which, to the best of our knowledge, has not been studied. Our results suggest that positive fluid balance may harm patients with sepsis overall, with some subtypes faring worse while others appear to benefit. One potential implication of our findings may be the refinement of sepsis fluid expansion guidelines to account for the sepsis subtype as a factor in resuscitation. By phenotyping sepsis into subtypes based on the site of infection, our study provides a more nuanced understanding of the effects of fluid balance on sepsis outcomes.
A practical application of our findings is to adjust fluid resuscitation strategies in sepsis according to the suspected site of infection based on clinical presentation, cultures, and imaging. For example, a reduced initial fluid bolus could be considered for patients presenting with pulmonary sepsis as compared to urinary sepsis, given the increased risk of harm with positive fluid balance that our study found in pulmonary sepsis. If the patient with pulmonary sepsis remains hypotensive after an initial conservative fluid bolus, earlier initiation of vasopressor therapy may be preferred over additional fluid boluses to avoid worsening pulmonary edema and progression to ARDS. In contrast, patients presenting with urinary sepsis may benefit from a more liberal initial fluid bolus if they remain hypotensive, given the potential mortality benefit seen with higher fluid balance in this subtype in our analysis. As additional microbiology and imaging data become available, the working diagnosis of the sepsis source may be refined and fluid management adjusted accordingly. While our retrospective analysis provides a foundation, prospective clinical trials are needed to validate optimal individualized fluid strategies based on the sepsis source.
Our study has several limitations. This was a retrospective analysis, and the conclusions must be validated prospectively. We utilized causal inference techniques with the aim of inferring causality, but the RCT remains the gold standard. Our primary causal inference model can only handle binary treatments, but fluid balance data are continuous, which requires a significant amount of preprocessing and transformation, with the possibility of information loss in these transformations. We supplemented the binarized analysis with a concurrent analysis of the continuous data to allay this limitation. In addition, the T-learner model requires minimal preprocessing, reducing the concern that the significant preprocessing required for the logistic regression model limits our inferences. The visual approach to outlier removal could introduce biases, but our intention was to avoid large impacts from extreme outliers. We mitigated this by also using the Sklearn robust scaler and Yeo-Johnson transformer to minimize the impact of extreme values. While choosing the mean as the threshold for binarization presents a limitation, we mitigated this effect by reducing the data skew to less than 0.5, ensuring a more symmetric distribution and making the mean a more appropriate measure of central tendency for our analysis.

Conclusions
In conclusion, our study showed a variation in the effect of positive fluid balance on sepsis subtypes. We used causal inference techniques to estimate the causal effect of fluid balance on ICU mortality in sepsis subtypes and found that positive fluid balance negatively impacted sepsis overall but with significant heterogeneity within the subtypes. However, our study was retrospective, and the conclusions must be validated prospectively.

Figure 2: A total of 139,360 patients were available in the eICU database, of whom 17,480 had an initial diagnosis of sepsis and were older than 18 years. The two largest subgroups were identified as pulmonary and urinary sepsis.

Figure 3: A directed acyclic graph (DAG) is a graphical representation of a set of variables and the relationships between them. Effect modifiers (light blue) and confounders (blue) are two important concepts in causal inference. Effect modifiers are variables that modify the effect of a treatment or exposure on an outcome. Confounders, on the other hand, are variables that are associated with both exposure and outcome and thus may bias estimates of the treatment effect.

Table 2: Average treatment effect (ATE) for sepsis subtypes using inverse probability of treatment weighting (IPTW) with logistic regression and T-learner machine-learning models.

Table 3: Refutation results from the DoWhy library for sepsis subtypes: comparison of estimated effects with new effects and associated P values. P values >0.05 indicate no statistically significant difference between the original and refutation ATE estimates, supporting the validity of the original estimates.