Thermodilution and Fick cardiac outputs differ: Impact on pulmonary hypertension evaluation

Pulmonary arterial hypertension (PAH) is a debilitating and fatal disease if not detected and treated early (1). A right heart catheterization (RHC) is required to diagnose PAH, and the cardiac output (CO) must be determined to calculate the pulmonary vascular resistance (PVR). CO is also a critical value in the evaluation of PAH patients (2). It is an important prognostic hemodynamic parameter both in the initial evaluation of a patient and in follow-up, and it has been identified in some series as an important prognostic factor in PAH patients. There are multiple methods to determine CO, but the two most commonly used are the bolus thermodilution (TD) and the indirect (ie, based on assumed rather than directly measured oxygen consumption) Fick methods. In clinical practice, it is not known whether the TD or the Fick method is more appropriate. It is also not clear whether these two methods of CO determination yield consistent measurements in all patient populations.
Stouffer, et al. Thermodilution and Fick cardiac outputs differ: Impact on pulmonary hypertension evaluation. Respir 2012;19(4):261-266.
BACKGROUND: The relationship between thermodilution and indirect Fick cardiac output determination methods has not been well described.
OBJECTIVE: To describe the relationship between these two cardiac output determination methods in patients evaluated for pulmonary hypertension and to highlight potential clinical implications.
METHODS: A retrospective review of charts of all adult patients who underwent a right heart catheterization (RHC) between January 1, 2007 and November 10, 2010, and participated in the pulmonary hypertension program of the pulmonary division at an academic institution was conducted. For validation, the charts of all patients who underwent RHC during the same period within the cardiology division were reviewed.
RESULTS: A total of 198 patients who underwent 213 RHCs were included, 79 (40%) of whom had pulmonary arterial hypertension. Forty-three per cent of patients had a >20% difference between thermodilution and Fick. In a multivariable analysis, the thermodilution-Fick difference increased with age (P=0.001).
DISCUSSION: The presence of such discrepancy in 36% of patients evaluated for heart failure and/or heart transplant validated the results. In total, 37% of the 1315 procedures (213 performed by pulmonologists and 1102 performed by cardiologists) had a difference of >20% between thermodilution and Fick.
CONCLUSION: Significant discrepancy exists between thermodilution and indirect Fick methods. This discrepancy potentially impacts pulmonary arterial hypertension prognostication and diagnosis, and is independent of TRJ.

Multiple studies from the 1960s and 1970s found a strong correlation between TD (iced and room temperature measurements), direct Fick and indirect Fick, with correlation coefficients (r) ranging from 0.78 to 0.96 (4-6). However, more recent studies are challenging these traditionally accepted correlations (7,8). Our literature review found only a handful of small studies (<30 to 40 patients/study) that directly compared TD with indirect/assumed Fick (9). It is important to note that most of the early studies used correlation coefficients for such comparisons. Unfortunately, such correlations have significant limitations and, thus, may be misleading in such instances (10). Their use may reflect the common misconception that strong correlation is equivalent to strong agreement, which is not necessarily true. In our study, we used the more appropriate Bland-Altman analysis to compare TD and Fick (11) in a more contemporary patient population and with contemporary catheterization laboratory methodologies.
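The distinction between correlation and agreement can be illustrated with a minimal sketch. The paired values below are hypothetical and are not study data, and the function names are ours. In this constructed example, one method reads exactly 1.5 times the other, so the correlation is perfect while the two methods never agree.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_difference(x, y):
    """Mean of the paired differences x - y (the bias between methods)."""
    return sum(a - b for a, b in zip(x, y)) / len(x)

# Hypothetical paired CO values in L/min (illustration only).
fick = [3.0, 4.0, 5.0, 6.0, 7.0]
td = [1.5 * value for value in fick]  # TD proportional to Fick by construction

r = pearson_r(td, fick)           # perfect linear relationship
bias = mean_difference(td, fick)  # yet TD exceeds Fick in every pair
print(f"r = {r:.2f}, mean TD - Fick difference = {bias:.2f} L/min")
```

A correlation coefficient measures how well the pairs follow a straight line, not whether that line is the line of identity, which is why an agreement analysis is needed in addition.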
The present study aimed to describe the concordance between TD and Fick methods for CO determination using contemporary commercially available catheters, oximeters and software in a patient population undergoing evaluation for pulmonary hypertension under real-world, nonexperimental conditions. Because of the cumulative inaccuracies of the multiple underlying assumptions made in each of these methods (Appendix A), we hypothesize that TD and indirect Fick methods produce different data that have crucial clinical implications such as in establishing PAH diagnosis and prognosis.

Study design and population
In the present retrospective cross-sectional study of medical records, the electronic charts of all consecutive adult patients (>18 years of age, regardless of sex or race) evaluated in the PAH centre at the University of North Carolina at Chapel Hill (UNC, North Carolina, USA) were reviewed. The patients were identified by reviewing the UNC PAH physicians' catheterization and clinic schedules. Patients who were seen between January 1, 2007 and November 10, 2010, and underwent an RHC were included. For validation purposes, the medical charts of adult patients who were evaluated by UNC cardiologists for heart failure or heart transplantation and underwent an RHC within the same time period were also reviewed. Patients whose medical records were missing either TD or Fick CO data were excluded from the analysis. Data, including demographics, pertinent medical history, echocardiographic findings and heart catheterization measurements, were recorded for patients meeting the inclusion/exclusion criteria. The measurement techniques were standardized (Appendix B). The present study was approved by the Institutional Review Board at UNC (IRB study 09-1672) and a waiver of informed consent was granted.

Statistical analysis
Sample size estimation: Based on an estimated mean CO of 5.5 L/min for both TD and Fick, and an SD of the difference between measurements of 2.75 L/min, the sample size calculated a priori to detect a clinically meaningful difference of 20% in CO measurements with a two-sided alpha of 0.025 and 90% power was 94. The sample size of 213 dual measurements available for analysis well exceeded this requirement.
Statistical considerations: The proportion of patients at each level of discrepancy between TD and Fick (TD − Fick) was expressed as a percentage. The agreement between the methods was analyzed as described by Bland and Altman (12). Agreement was expressed as the mean of the differences obtained by the different methods (reflecting bias) ±SD unless otherwise specified. Relative precision was expressed as the mean of the differences ±2SD (Bland-Altman limits of agreement). Additionally, Pearson's correlation coefficient was calculated for comparative purposes and a multiple linear regression model was constructed. All variables in the linear regression model were chosen before data collection. Based on a sample size of 213 measurements and to ensure the stability of the regression model, only 10 variables were allowed into the initial model. Interaction between variables in the regression model was assessed first, before checking for significant associations between the variables and the outcome (ie, TD − Fick difference). Variables that were not statistically significant in the initial model (ie, P>0.05) were dropped from the model after confirming with a partial F test (also with P>0.05).
The Student's paired t test or the Wilcoxon signed-rank test was used as appropriate to compare TD and Fick within each group of patients. The Student's t test was used to compare the CO differences between different subgroups of patients. A two-sided alpha of 0.025 was considered to be statistically significant for all tests without correcting for multiple testing, except as noted above. All statistical analyses were performed using STATA version 11 IC (Stata Corp, USA) for Windows (Microsoft Corporation, USA).
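The agreement measures described above (bias as the mean of the paired differences, and limits of agreement as bias ±2SD) can be sketched as follows. This is a simplified illustration and not the analysis code used in the study. In particular, it treats every pair as independent and does not implement the repeated-measures variant of the Bland-Altman method. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(td, fick, k=2.0):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences TD - Fick
    limits = bias +/- k * SD of the differences (k = 2 as in the text)
    Assumes independent pairs; repeated measures are not modelled.
    """
    if len(td) != len(fick) or len(td) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [t - f for t, f in zip(td, fick)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - k * sd, bias + k * sd

# Hypothetical paired CO values in L/min (illustration only).
td_values = [4.0, 5.0, 6.0, 5.0]
fick_values = [5.0, 5.0, 5.0, 5.0]
bias, lower, upper = bland_altman(td_values, fick_values)
print(f"bias = {bias:.2f} L/min, limits of agreement = {lower:.2f} to {upper:.2f} L/min")
```

A bias near zero with wide limits, as in this toy example, indicates that the methods agree on average but can differ substantially for an individual patient.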

Study population
Data on 256 RHC measurements in the PAH program were available. One set was excluded because the patient was <18 years of age. TD data were missing for 37 patients, Fick data were missing for two patients, and three other patients had both TD and Fick data missing. Therefore, the final study PAH program population was 213 RHC measurements in 198 patients (Table 1). Twelve patients underwent RHC twice and one patient underwent RHC four times. These RHC data were entered as separate data points because a sensitivity analysis using only the first measurement among those with multiple measurements did not change the results.
Data on 2629 RHC measurements in the cardiology programs were available. One data set was excluded because the patient was <18 years of age. TD data were missing in 863 patients, Fick data were missing in 146 patients, 513 were cardiac procedures (such as cardiac biopsy) without hemodynamic measurements and four were excluded because of duplicate reports. Therefore, the final study cardiology population was 1102 procedures in 774 patients (of which 384 were procedures in 99 patients who underwent a heart transplant).

TD versus Fick CO measurements
PAH program population:
In the PAH program population, 68% of procedures had a difference of ≥10% between TD and Fick, and 43% of procedures had a difference of ≥20%. Using the Bland-Altman analysis for repeated measures (n=213), the bias between TD and Fick for the group as a whole was −0.39 L/min and the limits of agreement were between −4.44 L/min and +3.66 L/min (Figure 1). There was a nonuniform and nondirectional discrepancy between TD and Fick over a wide range of CO. The bias was similar between those with PAH and those without PAH (−0.25 L/min versus −0.49 L/min, respectively [P=0.40]) (Table 2). The relationship between TD and Fick was nonlinear, although the statistical software would still generate a best-fit line for the linear regression model, which is shown for comparative purposes (Figure 2).
There was no significant difference in the discrepancy (bias or variability) between TD and Fick in the subgroup of patients with low versus high TRJ (Table 2). Using a multivariable linear regression model adjusted for age, sex, race, body mass index (BMI), left ventricular ejection fraction, hemoglobin, PAH status and transpulmonary gradient, only age showed a statistically significant association with the difference between TD and Fick (beta coefficient=−0.03; P=0.001). When adjusted for the above-mentioned covariables, with each 10-year increase in age, the difference between TD and Fick decreased by 0.35 L/min. A sensitivity analysis including TRJ (which was missing in 66 of 213 RHC measurements) in the multivariable analysis did not significantly change the results. This variability was consistent over time (Figure 3). Using PVR >3 Wood units as a diagnostic criterion for PAH, 27 patients (13±2%) would have inconsistency in PAH diagnosis between TD and Fick (Figure 4).

Cardiology program population:
The RHCs performed by cardiologists showed a similar pattern of discrepancy, with 401 (36%) of the cardiology population (141 [37%] of the heart transplant population) having a discrepancy of >20% between TD and Fick.
When combining all of the RHCs performed by pulmonologists (n=213) and cardiologists (n=1102), 37% of these 1315 procedures had a difference of >20% between TD and Fick. The percentage of observations with >20% difference between TD and Fick was similar regardless of the patient population (PAH, non-PAH [in the PAH program], cardiology and heart transplant populations; P not significant).

Number of procedures may include multiple procedures performed on the same patient. *Thermodilution − Fick CO (in L/min); †Cardiac index (CI) based on thermodilution; ‡66 patients did not have a TRJ reported on their echocardiogram, mostly because it was not detected. PAH Pulmonary arterial hypertension

DISCUSSION
To our knowledge, the present study is the largest to describe the discrepancy between TD and indirect Fick CO determination methods. Our data show significant variation for individual patients, as indicated by a high dispersion between TD and Fick, despite moderate correlation as a group. CO has been identified in some series as an important prognostic factor in PAH patients. Our data indicate that the two most commonly used CO methods often differ. This individual variation may not be as well appreciated at first glance from the linear regression plot of TD versus Fick CO (Figure 2), but the Bland-Altman analysis is more demonstrative (Figure 1). The effect of discrepant CO on PVR calculation may have significant clinical implications. The difference between these two methods of CO determination was comparable among patients with PAH, without PAH, with cardiovascular disease and status post-heart transplant, and also between the RHCs performed by pulmonologists and those performed by cardiologists. Thus, we believe that an inherent discrepancy exists between TD and Fick, and that this discrepancy is independent of the characteristics of the population studied or the hemodynamics of the right heart. Age was the only variable that was significantly associated with the difference between TD and Fick in the PAH program population (PAH and non-PAH patients). The association between age and the difference between TD and Fick in the cardiology population validates our results (data not shown).
We believe that a 20% difference between these two methods of CO determination is clinically meaningful. This cut-point of 20% is reasonable; for example, a measured CO of 4.5 L/min versus 3.6 L/min could make a clinical difference with regard to inotrope use or optimization of other medications. This difference has also been used in multiple other studies investigating agreement between different CO measurements (12-14). Because there is no gold standard to determine CO (15), our data strongly suggest that measuring CO using only one method might not be adequate. In view of our study design, we are not able to recommend one method over the other. These data also suggest that in certain patients, especially those with borderline diagnostic criteria for PAH, direct measurement of oxygen consumption to obtain a direct Fick might be worthwhile to obtain a more valid estimate of PVR.
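The 20% cut-point discussed above can be made concrete with a short sketch. The text does not state which value serves as the denominator of the percentage difference, so the choice is exposed here as a parameter. The default (the mean of the two methods) is our assumption, and the function names and sample pairs are hypothetical.

```python
def relative_difference(td, fick, reference="mean"):
    """Absolute TD-Fick difference as a fraction of a reference value.

    reference: "mean" (average of both methods, our assumed default),
               "td" or "fick". The denominator used in the study is not
               stated in the text, so this choice is an assumption.
    """
    if reference == "mean":
        denominator = (td + fick) / 2
    elif reference == "td":
        denominator = td
    elif reference == "fick":
        denominator = fick
    else:
        raise ValueError("reference must be 'mean', 'td' or 'fick'")
    return abs(td - fick) / denominator

def fraction_discordant(pairs, threshold=0.20, reference="mean"):
    """Fraction of (td, fick) pairs whose relative difference exceeds threshold."""
    flags = [relative_difference(td, fick, reference) > threshold for td, fick in pairs]
    return sum(flags) / len(flags)

# The example from the text: 4.5 L/min versus 3.6 L/min.
print(round(relative_difference(4.5, 3.6, reference="td"), 3))  # relative to 4.5 L/min
print(round(relative_difference(4.5, 3.6), 3))                  # relative to the mean

# Hypothetical paired values (illustration only).
pairs = [(4.5, 3.6), (5.0, 5.1), (6.0, 4.0), (3.0, 3.2)]
print(fraction_discordant(pairs))
```

The same pair can fall on either side of a cut-point depending on the denominator, so any comparison across studies should confirm how the percentage was defined.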
Based on our data, the PVR would vary significantly depending on which measurement was used. Such PVR variation could impact whether a PAH diagnosis is made. PVR variation could also impact prognostication and, thus, the aggressiveness of therapy, with the associated adverse effects, at times, possibly being life threatening (16). PVR was part of the predictive Registry to EValuate Early And Longterm pulmonary arterial hypertension disease management (REVEAL) registry PAH risk score (17); thus, an erroneous CO would lead to an erroneous PVR calculation and would impact prognostication. Other potentially relevant measures and prognostic variables, such as pulmonary vascular compliance (which equals stroke volume/pulse pressure), could also be influenced by CO measure variability.
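How a CO discrepancy propagates into PVR and pulmonary vascular compliance can be sketched as follows. The PVR formula used here is the standard one (transpulmonary gradient divided by CO, in Wood units), and compliance is stroke volume divided by pulse pressure, as stated above. The patient values are hypothetical, the function names are ours, and the 3 Wood unit threshold is the diagnostic criterion mentioned in the Results.

```python
def pvr_wood_units(mpap_mmhg, pcwp_mmhg, co_l_min):
    """PVR in Wood units: (mPAP - PCWP) / CO."""
    if co_l_min <= 0:
        raise ValueError("cardiac output must be positive")
    return (mpap_mmhg - pcwp_mmhg) / co_l_min

def pa_compliance(co_l_min, heart_rate_bpm, pa_systolic_mmhg, pa_diastolic_mmhg):
    """Pulmonary vascular compliance in mL/mmHg: stroke volume / pulse pressure."""
    pulse_pressure = pa_systolic_mmhg - pa_diastolic_mmhg
    if heart_rate_bpm <= 0 or pulse_pressure <= 0:
        raise ValueError("heart rate and pulse pressure must be positive")
    stroke_volume_ml = co_l_min * 1000 / heart_rate_bpm
    return stroke_volume_ml / pulse_pressure

# Hypothetical patient (illustration only): mPAP 30 mmHg, PCWP 12 mmHg.
PVR_THRESHOLD = 3.0  # Wood units, criterion used in the Results
pvr_td = pvr_wood_units(30, 12, co_l_min=5.0)    # CO by thermodilution
pvr_fick = pvr_wood_units(30, 12, co_l_min=6.5)  # CO by indirect Fick
print(f"PVR (TD) = {pvr_td:.2f} WU, above threshold: {pvr_td > PVR_THRESHOLD}")
print(f"PVR (Fick) = {pvr_fick:.2f} WU, above threshold: {pvr_fick > PVR_THRESHOLD}")
```

In this constructed case the same pressures yield a PVR on opposite sides of the threshold, which is the kind of diagnostic inconsistency described in the text.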
TD measurement was initially proposed by Fegler in the 1950s (18), and it became commercially available after the introduction of the pulmonary artery catheter in 1971 (19). The Fick measurement was initially proposed by Adolf Fick in 1870 (20). The direct Fick CO measurement may be as close to the gold standard as one can get; it was used by Fegler to validate the TD (18). The direct Fick typically involves directly measuring the patient's oxygen consumption in the catheterization laboratory rather than using an assumed oxygen consumption index, as is the case with the indirect Fick method (21). The drawback to using the direct Fick CO method is that it is not practical to perform in most catheterization laboratories and, consequently, is rarely used.
In the early days of CO measurements, strong correlations were established between the TD and the direct Fick on one side (4-6,22,23), and between direct and indirect Fick on the other side, with correlation coefficient (r) values >0.9 (24). However, few studies have directly compared TD with indirect Fick measurements. Those early studies with small numbers of patients (average sample size n=25 to n=35 patients) showed a strong correlation between TD and Fick as a whole group. The evidence of intra-individual correlation or concordance is lacking.
Multiple theories exist regarding the appropriateness of TD or Fick in certain populations. The published data overall have been conflicting (25,26). A few small studies (combined n=78) consistently showed that TD overestimated CO in low CO states (27). These data introduced the possibility of a systematic difference between these two methods in low CO states. On the other hand, other evidence showed strong correlations between TD and Fick, even in low CO patients (n=33) (22). One possible reason for the difference between our results and these studies is that the body surface area may not track assumed oxygen uptake above the normal BMI range. The average BMI of our cohort was almost 30 kg/m². The average BMI of the patients in most of the published studies comparing TD and Fick has not been provided. With the significant increase in obesity and overweight prevalence in the United States in the past 30 years (28), our cohort, whose BMI tracks the current national average, may not be comparable with older cohorts. In addition, more than two-thirds of our patients were women, which may also not be comparable with older studies.

Figure 4) Discrepancy in pulmonary vascular resistance (PVR) based on thermodilution (TD) versus PVR based on Fick cardiac output (CO) (n=213 patients/measurements in the pulmonary arterial hypertension program population)

Our findings, showing lack of effect of TRJ severity on the discrepancy between TD and Fick, are similar to previous smaller studies (22,29,30), but different from others (25,31), suggesting ongoing controversy, which may be resolved by larger future studies.
Limitations of our study include obtaining only one Fick measurement rather than checking multiple mixed venous oxygen saturations and averaging them, as is the case with TD measurements. However, obtaining only one Fick CO measurement better reflects real-world practice. This being a single-centre study limits the generalizability of our findings, although commonly accepted standard procedures and meticulous techniques were followed. Another potential limitation is that the curve for TD was not available for our review. We based the numbers on the RHC report. Typically, the curve is reviewed in real time on a video monitor, not as a hard copy that could be reviewed at a later time. The 37 patients who were missing TD measurements and thus were excluded from the analysis could, in theory, be different in some way from the rest of the study population. Unfortunately, a proper and informative sensitivity analysis to address this possibility could not be performed in view of the retrospective nature of the present study. One potential systematic difference could be that this group of patients had worse hemodynamics and, thus, a special and 'stiffer' pulmonary artery catheter that lacks TD measurement equipment was used. Because the pulmonologists performing these procedures followed a standardized protocol for measuring both the TD and the Fick in all patients, this should, in principle, remove subjectivity in 'selecting' which patients undergo either of the two CO methods.
There was not a consistent evaluation for potential cardiac or pulmonary shunts; however, we do not expect this to be a major contributor to the discrepancy we observed in this population (32), especially in the absence of a consistent directionality of the discrepancy. The original studies identifying tricuspid regurgitation as impacting TD were based on the volume of the tricuspid regurgitation, while echocardiographic velocity is a marker of pulmonary pressures and not amount of tricuspid regurgitation. We included TRJ and not tricuspid regurgitation volume in our analysis because we expected a significant amount of subjectivity in the reported assessment of the tricuspid regurgitation volume, especially in the setting of a retrospective study. Some might also consider that there are enough convincing data already available in the literature that strongly suggest that TD and indirect Fick have a relatively good correlation if properly performed. However, we believe that 'group' correlation between these two methods is inappropriate and may yield a false sense of individual patient correlation (11). In addition, many studies in the literature appropriately used direct Fick for comparison with either TD or indirect Fick CO, and clinicians have assumed that, by substitution, contemporary indirect Fick and TD would be equivalent, which, as our data show, is not necessarily true.
The commercialization of multiple CO measurement devices and the current trend of noninvasive measurements have introduced even more variation into clinical practice. In line with the United States National Institutes of Health's relatively recent focus (33), such methods need to be validated in real-world populations rather than under ideal experimental conditions. We suggest that it may be time to return to directly measuring CO rather than relying on multiple layers of assumptions, at least in certain subpopulations. It is reassuring that the CO is not the only physiological measurement that we obtain during RHC on patients being evaluated for PAH. A significant proportion of our clinical decision making is also based on other measures such as right atrial pressure and central mixed venous oxygen saturation.

CONCLUSION
Our study showed significant differences between TD and indirect Fick in individual patients being evaluated for PAH. In the absence of a gold standard to measure CO or definite evidence to suggest superiority of either of the two most commonly used methods of CO determination, we recommend obtaining both TD and Fick CO data. Larger studies are needed to further refine the subgroups that need either or both methods performed, and possibly to identify the subgroup of patients that may need to have oxygen consumption directly measured in the catheterization laboratory to obtain a direct Fick CO.

FUNDING: Dr Fares had been funded by NIH grant # 5T32HL007106-34.
DISCLOSURES: Dr Fares is the principal investigator on a research study funded by Gilead that is unrelated to this manuscript ($28,844 in unconditional grant support). Dr Ford, Dr Rosamond, Ms Blanchard, Dr Stouffer, Dr Chang and Dr Aris have no conflicts of interest related to this article to disclose.

APPENDIX A
Cardiac output determination basis and assumptions
TD determines the CO based on how fast a set amount of fluid (typically at room temperature) moves from the right atrium (injected into the proximal port of the pulmonary artery [PA] catheter) to the PA (detected by a thermosensor close to the tip of the distal part of the PA catheter). These data are entered into computer software that makes a few generalized assumptions (see below) and then generates a CO, the TD CO. Fick CO, on the other hand, is based on the body's oxygen consumption divided by the arteriovenous differential oxygen content (21). Oxygen consumption, in general, is based on the metabolic rate, which, among other factors, depends on age, sex, body surface area and heart rate. In real-world clinical practice, a standard oxygen consumption index (usually between 125 mL O₂/min and 140 mL O₂/min) is used based on a subjective estimate of the metabolic rate, taking into consideration the above-mentioned factors. The most commonly used value in our population was 133 mL O₂/min.
Selected assumptions include the following:
• The patient who is about to undergo a procedure is at his/her basal state of metabolism.
• The CO is at a steady state during the different CO measurements.
• Lack of respiratory variation and thus lack of right ventricular preload variation.
• The pulse oximeter is an accurate reflection of the arterial saturation.
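The indirect Fick calculation described in this appendix can be sketched as follows. This is an illustration under stated assumptions, not the software used in the catheterization laboratory. We assume that the oxygen consumption index is expressed per square metre of body surface area, that body surface area is estimated with the Mosteller formula, that each gram of hemoglobin carries 1.36 mL of oxygen (some laboratories use 1.34), and that dissolved oxygen is ignored. All function names and patient values are hypothetical.

```python
import math

def bsa_mosteller(height_cm, weight_kg):
    """Body surface area in m^2 (Mosteller formula; our assumed choice)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)

def indirect_fick_co(vo2_index, bsa_m2, hb_g_dl, sao2, svo2, o2_capacity=1.36):
    """Indirect Fick cardiac output in L/min.

    vo2_index   : assumed oxygen consumption index (assumed per m^2 of BSA)
    sao2, svo2  : arterial and mixed venous saturations as fractions (0 to 1)
    o2_capacity : mL O2 carried per g of hemoglobin (assumed 1.36)
    Dissolved oxygen is ignored in this simplified sketch.
    """
    vo2_ml_min = vo2_index * bsa_m2
    # Arteriovenous O2 content difference in mL O2 per litre of blood;
    # the factor 10 converts hemoglobin from g/dL to g/L.
    av_difference = hb_g_dl * o2_capacity * (sao2 - svo2) * 10.0
    if av_difference <= 0:
        raise ValueError("arterial saturation must exceed venous saturation")
    return vo2_ml_min / av_difference

# Hypothetical patient (illustration only).
bsa = bsa_mosteller(height_cm=180, weight_kg=80)
co = indirect_fick_co(vo2_index=125, bsa_m2=bsa, hb_g_dl=15, sao2=0.95, svo2=0.70)
print(f"BSA = {bsa:.2f} m^2, indirect Fick CO = {co:.2f} L/min")
```

Each input carries its own assumption from the list above: the assumed oxygen consumption, the finger oximeter reading used for arterial saturation, and the single mixed venous sample all feed directly into the result.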

APPENDIX B
Right heart catheterization procedure and measurements
All technical aspects and methodologies for RHC and CO determinations were standardized and consistently followed. Oral sedation was sometimes used based on clinical setting. Most RHCs performed by the pulmonologists had a right internal jugular venous access placed under real-time ultrasound guidance, while RHCs performed by the cardiologists had either a femoral venous access or jugular venous access. The arterial oxygen saturation was obtained via a finger oximeter. No blood samples were obtained from the systemic arteries for arterial oxygen saturation measurements (except in a small subpopulation of the cardiology patients who underwent a left heart catheterization at the same time). Venous oxygen saturation was obtained by running a sample from the PA via an oximeter present in the catheterization laboratory. Hemoglobin level was typically checked on the same day of the procedure and it was again simultaneously checked with the PA saturation in the catheterization laboratory.
PA oxygen saturation was checked first, followed by at least a triplet of TD measurements (with satisfactory curves) using 10 mL of normal saline at room temperature.Any outlier measurement (ie, not within 10% to 15% of the other three TD measurements) was typically discarded.
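The averaging of TD measurements with outlier rejection can be sketched as follows. The text does not specify the exact rule, so this sketch makes two assumptions: each measurement is compared with the median of all measurements, and a measurement is discarded when it deviates from that median by more than a tolerance (15% by default, the upper end of the range given above). The function name and sample values are hypothetical.

```python
from statistics import mean, median

def average_td(measurements, tolerance=0.15):
    """Average TD cardiac outputs after discarding outliers.

    A measurement is kept when it lies within `tolerance` (as a fraction)
    of the median of all measurements. Comparing against the median is our
    assumption; the exact rule used in the laboratory is not stated.
    Returns (average of kept values, list of kept values).
    """
    if len(measurements) < 3:
        raise ValueError("at least three TD measurements are expected")
    reference = median(measurements)
    kept = [m for m in measurements if abs(m - reference) / reference <= tolerance]
    if not kept:
        raise ValueError("no measurements within tolerance of the median")
    return mean(kept), kept

# Hypothetical TD readings in L/min (illustration only).
average, kept = average_td([4.8, 5.0, 5.1, 6.5])
print(f"kept = {kept}, TD CO = {average:.2f} L/min")
```

In this example the 6.5 L/min reading is discarded and the remaining three are averaged.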

Figure 1) Bland-Altman plot of the difference between thermodilution (TD) and Fick cardiac output (CO) over mean of TD and Fick CO

Table 1 Patient demographics, characteristics, hemodynamic data and comorbidities (n=198)
Data presented as mean ± SD unless otherwise indicated. *Seven patients had mPAP <25 mmHg due to therapy and, thus, low pulmonary vascular resistance (PVR); †Thirty-one patients had mPAP >25 mmHg due to pulmonary venous hypertension (WHO group 2). CO Cardiac output; COPD Chronic obstructive pulmonary disease; LVEF Left ventricular ejection fraction; mPAP Mean pulmonary artery pressure; PAH Pulmonary arterial hypertension; PCWP Pulmonary capillary wedge pressure
using only the first measurement among those with multiple measurements did not change the results. Most of the RHCs were