The feasibility and reliability of transient elastography using Fibroscan®: A practice audit of 2335 examinations

OBJECTIVES: To examine the feasibility and reliability of liver stiffness measurement (LSM), and to identify patient and operator characteristics predictive of poorly reliable results.
METHODS: The present retrospective study investigated the frequency and determinants of poorly reliable LSM (interquartile range [IQR]/median LSM [IQR/M] >30% with median liver stiffness ≥7.1 kPa) using the FibroScan (Echosens, France) over a three-year period. Two experienced operators performed all LSMs. Multiple logistic regression analyses examined potential predictors of poorly reliable LSMs including age, sex, liver disease, the operator, operator experience (<500 versus ≥500 scans), FibroScan probe (M versus XL), comorbidities and liver stiffness. In a subset of patients, medical records were reviewed to identify obesity (body mass index ≥30 kg/m²).
RESULTS: Between July 2008 and June 2011, 2335 patients with liver disease underwent LSM (86% using the M probe). LSM failure (no valid measurements) occurred in 1.6% (n=37) and was more common using the XL than the M probe (3.4% versus 1.3%; P=0.01). Excluding LSM failures, poorly reliable LSMs were observed in 4.9% (n=113) of patients. Independent predictors of poorly reliable LSM included older age (OR 1.03 [95% CI 1.01 to 1.05]), chronic pulmonary disease (OR 1.58 [95% CI 1.05 to 2.37]), coagulopathy (OR 2.22 [95% CI 1.31 to 3.76]) and higher liver stiffness (OR per kPa 1.03 [95% CI 1.02 to 1.05]), including presumed cirrhosis (stiffness ≥12.5 kPa; OR 5.24 [95% CI 3.49 to 7.89]). Sex, diabetes, the underlying liver disease and FibroScan probe were not significant. Although reliability varied according to operator (P<0.0005), operator experience was not significant. In a subanalysis including 434 patients with body mass index data, obesity influenced the rate of poorly reliable results (OR 2.93 [95% CI 0.95 to 9.05]; P=0.06).
CONCLUSIONS: FibroScan failure and poorly reliable LSM are uncommon. The most important determinants of poorly reliable results are older age, obesity, higher liver stiffness and the operator, the latter emphasizing the need for adequate training.

Because numerous clinical decisions are made based on the results of liver stiffness measurement (LSM) using transient elastography (TE), it is vital to understand the reliability of this tool and the factors that influence its accuracy. Traditionally, TE examinations with <10 valid measurements, a success rate <60% and/or a ratio of the interquartile range (IQR) of liver stiffness to the median value (IQR/M) >30% have been classified as unreliable (18). According to this definition, approximately 15% of LSMs are unreliable, with higher rates among older patients, women and patients with diabetes, hypertension, and higher body mass index (BMI) and waist circumference. Limited operator experience (<500 versus ≥500 examinations) may also influence reliability (18). However, this definition has been criticized because no studies have ever demonstrated that reliable results are more accurate than unreliable results. As such, Boursier et al (19) recently proposed a new definition of poorly reliable TE results (IQR/M >30% and median liver stiffness ≥7.1 kPa). In that study, poorly reliable LSMs, which occurred in 9% of patients, were less accurate than reliable LSMs according to multiple indicators of diagnostic test performance (19). The factors that influence the rate of poorly reliable results according to this definition have not been explored.
Accordingly, the objective of the present study was to examine the feasibility and reliability of TE using this revised definition of reliability in a large cohort of patients with various liver diseases and severities to reflect routine clinical practice. We aimed to identify patient and operator characteristics associated with poorly reliable results to better inform our interpretation of LSMs using TE.

Study population
In the present retrospective study, 2437 consecutive patients who underwent LSM using TE (FibroScan) at the University of Calgary Liver Unit (UCLU; Calgary, Alberta) between July 2008 and June 2011 were identified. The UCLU is the major referral centre for patients with liver disease who reside in Southern Alberta and serves a catchment population of approximately 1.5 million. LSM using TE is performed routinely in all patients who attend the UCLU without overt evidence of hepatic decompensation. In patients with multiple examinations, only the first LSM was considered to eliminate selection bias (ie, repeated examinations on patients who are easier to scan). For our analysis of the prevalence of FibroScan failure and reliability (see definitions below), the entire study cohort was examined. To identify predictors of poorly reliable LSM, only patients for whom linkage with Alberta administrative databases was possible (n=2335) were included to permit identification of the patients' underlying liver diseases and comorbidities. Therefore, for these analyses, 36 nonresidents of Alberta, 32 patients with invalid provincial health numbers and 34 patients for whom linkage with the administrative data was unsuccessful were excluded. These patients had demographic characteristics and liver stiffness results similar to the remainder (data not shown). The Conjoint Health Research Ethics Board at the University of Calgary approved the study protocol.

LSM
Two experienced operators (OP1 and OP2) performed all FibroScan examinations as per the manufacturer's recommendations. OP2 was employed after OP1 and performed her first LSM in September 2010. Between July 2008 and July 2009, the FibroScan M probe was used in all patients; thereafter, the FibroScan XL probe was used in obese patients (BMI ≥30 kg/m²). Briefly, with the patient lying in the dorsal decubitus position and the right arm in maximal abduction, the tip of the FibroScan transducer probe was placed on the skin between the ribs over the right lobe of the liver. Assisted by a sonographic image, a portion of the liver at least 6 cm thick and free of large vascular structures was identified, and an attempt was made to collect at least 10 valid LSMs. The median liver stiffness value (in kPa) was considered to be representative of the elastic modulus of the liver. As an indicator of LSM variability, the IQR/M was calculated.
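As a rough illustration of how IQR/M and the two reliability definitions cited in this paper (references 18 and 19) could be computed from a set of valid measurements, consider the following Python sketch. The function names are hypothetical, and the quartile method (`statistics.quantiles` with `method="inclusive"`) is an assumption; the FibroScan device may derive the IQR differently.

```python
from statistics import median, quantiles


def iqr_over_median(measurements):
    """Return IQR/M for a list of valid stiffness measurements (kPa).

    Assumption: quartiles use Python's 'inclusive' interpolation, which
    may not match the method implemented in the FibroScan software.
    """
    q1, _, q3 = quantiles(measurements, n=4, method="inclusive")
    return (q3 - q1) / median(measurements)


def is_poorly_reliable(measurements):
    """Revised definition (19): IQR/M >30% AND median stiffness >=7.1 kPa."""
    return iqr_over_median(measurements) > 0.30 and median(measurements) >= 7.1


def is_unreliable_traditional(valid_shots, total_shots, iqr_m):
    """Traditional definition (18): <10 valid shots, success rate <60%
    and/or IQR/M >30%."""
    success_rate = valid_shots / total_shots
    return valid_shots < 10 or success_rate < 0.60 or iqr_m > 0.30


# Hypothetical examinations, ten valid measurements each (kPa)
stiff = [8, 9, 10, 12, 14, 16, 18, 20, 22, 24]  # high and variable
soft = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8]           # low but variable

print(is_poorly_reliable(stiff))  # variable and median >= 7.1 kPa
print(is_poorly_reliable(soft))   # variable but median < 7.1 kPa
print(is_unreliable_traditional(10, 12, iqr_over_median(soft)))
```

The second hypothetical examination shows why the two definitions diverge: a variable reading in a patient with low stiffness is discarded under the traditional rule but retained under the revised one.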

Administrative data sources
The present study used three Alberta administrative databases to identify the underlying liver disease etiologies and comorbidities of study participants via linkage using their unique personal health number. These databases have been used to examine the epidemiology (20-22), outcomes (22,23) and coding accuracy (20,24-28) of a variety of medical conditions.
1. Physician Claims Database. This database includes claims submitted for payment by Alberta physicians for services provided to registrants of the Alberta Health Care Insurance Plan, a universal plan that covers >99% of Alberta residents (29). Each record in the database includes the service provided, the date and up to three diagnosis fields.
[...] (34), a well-validated algorithm for predicting outcomes in patients with hepatic (35) and nonhepatic disorders (34,36). Liver diseases were excluded from this algorithm. Also examined was the type of FibroScan probe because use of the XL probe could be considered a surrogate marker for obesity, which is not available in the administrative databases. In a subset of patients, medical records were reviewed to extract the BMI on the day of the FibroScan examination.

Statistical analyses
Between-group comparisons were made using Fisher's exact and χ² tests for categorical variables, and Wilcoxon rank-sum and Kruskal-Wallis tests for continuous variables. Univariate logistic regression was used to identify predictors of poorly reliable FibroScan examinations including age, sex, underlying liver disease, liver stiffness, comorbidities, the FibroScan probe, the operator, operator experience and year. Variables that were significant in univariate analyses were included in stepwise-forward, multiple logistic regression models in which variables with P<0.1 were retained in the models. In the subset of patients with available BMI data from medical record review, an additional multivariate analysis was performed including BMI. All analyses were performed using Stata version 11.0 (StataCorp, USA); a two-sided P<0.05 was considered to be statistically significant.
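To make the categorical comparison concrete, the sketch below applies a two-sided Fisher's exact test to the obese subgroup reported in the Results (poorly reliable LSM in two of 36 patients scanned with the XL probe versus nine of 70 scanned with the M probe). This is a standard-library Python illustration, not the authors' Stata code; the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about which variant was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: XL probe, M probe; columns: poorly reliable, not poorly reliable
p_value = fisher_exact_two_sided(2, 34, 9, 61)
print(f"P = {p_value:.2f}")
```

Under these assumptions, the result agrees with the P=0.33 reported for this comparison.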

Patient characteristics
A total of 2335 patients underwent LSM using TE at the UCLU between July 2008 and June 2011 and met the study inclusion criteria; their characteristics are outlined in Table 1. The median age was 50 years (IQR 40 to 58 years) and 56% were male. The majority of patients had chronic hepatitis C virus (36%) or hepatitis B virus (HBV; 27%) infection, while 7% had nonalcoholic fatty liver disease and 6% had autoimmune liver disease. Among 444 patients with BMI data available from medical record review, the median BMI was 25.5 kg/m²; 25% of patients were obese (BMI ≥30 kg/m²). Twelve percent of the cohort had a history of diabetes mellitus, 34% had hypertension, 35% had depression, and 13% and 10% had a history of alcohol and drug abuse, respectively.

LSM results
The majority of LSMs (86%) were obtained using the FibroScan M probe, 47% were performed by OP1 and 76% were performed during

FibroScan reliability
According to the reliability criteria of Boursier et al (19) and excluding LSM failures, FibroScan examinations were classified as very reliable in 29% (n=659), reliable in 66% (n=1526) and poorly reliable in 4.9% (n=113) of patients (Table 1). According to previously recommended criteria for reliability (18), 15% of these examinations (n=343) would have been classified as unreliable (valid shots <10, success rate <60% and/or IQR/M >30%). Based on the updated definitions of Boursier et al (19), 41% of these 'unreliable' examinations would have been classified as very reliable, 22% as reliable and 37% as poorly reliable.
[...] Using the XL probe, 5.6% (two of 36) of obese patients had poorly reliable results versus 13% (nine of 70) measured using the M probe (P=0.33). The combined influence of obesity and presumed cirrhosis on the risk of poorly reliable LSM according to FibroScan probe is illustrated in Figure 1. In patients with obesity and cirrhosis, the risk of poorly reliable results was high (24% to 25%) with both probes compared with only 0% to 13.3% in patients with none or only one of these risk factors.
[...] P=0.08), operator experience, year of LSM and patient sex were not statistically significant. In a supplementary analysis, in which liver stiffness was categorized rather than examined as a continuous variable, presumed cirrhosis was associated with a fivefold higher risk of poorly reliable results (OR 5.28 [95% CI 3.50 to 7.95]); the remainder of the results were largely unchanged (data not shown). Finally, in an analysis including BMI from medical record review, obesity was associated with nearly threefold odds of poorly reliable results (OR 2.93 [95% CI 0.95 to 9.05]; P=0.06) after adjustment for FibroScan probe and the other factors described above.
[...] [18]) were observed in 15% of patients. This difference is relevant because the latter 'unreliable' results have never been shown to be less accurate than reliable results.
In fact, use of this outdated definition of reliability may have led to the needless discarding of approximately 10% of FibroScan results, with potentially important implications in clinical practice and in research studies. Importantly, approximately two-thirds of these results would have been classified as very reliable or reliable according to the revised definitions. The prevalence of poorly reliable results observed in our study is lower than that of Boursier et al (19) (4.9% versus 9.1%). This likely reflects differences in the study populations, particularly the lower prevalence of advanced fibrosis in our study (median liver stiffness 6.3 kPa versus 8.1 kPa in Boursier et al [19]) because higher liver stiffness (≥7.1 kPa) is a criterion in the definition for reliability (see below). Moreover, liver stiffness was measured using the XL probe in 25% of our patients, whereas only the M probe was used in the French study. Because liver stiffness measured with the XL probe is consistently lower than with the M probe, a lower rate of poorly reliable results could also be anticipated in our study. TE is increasingly being used in clinical decision making, and poorly reliable results are less accurate than reliable examinations; it is therefore important to understand the factors that influence its reliability. Specifically, in the study by Boursier et al (19), poorly reliable results were only 70% accurate for the diagnosis of cirrhosis compared with 86% and 90% in patients with reliable and very reliable results, respectively. Corresponding AUROCs for cirrhosis were 0.82, 0.90 and 0.97, respectively. With these facts in mind, we examined several patient- and operator-related characteristics as potential predictors of unreliable examinations. As previously reported, older age was associated with an increased risk of unreliable LSM (18).
The exact reasons for this finding have never been identified; however, we speculate that age-related alterations in the chest wall are involved. It is known that chest wall compliance decreases with age due to structural changes of the intercostal muscles, intercostal joints and rib-vertebral articulations. In addition, age-associated osteoporosis may increase kyphosis, resulting in changes in the geometry of the thorax (37). On a related note, we identified a 1.6-fold increase in the risk of poorly reliable results among patients with chronic lung disease, predominantly chronic obstructive pulmonary disease (data not shown). This novel finding may also relate to structural changes in the chest wall (eg, pulmonary hyperinflation) or technical difficulties with the FibroScan procedure due to deep respirations in these patients. Because an increased risk of LSM failure or unreliable results has not been reported in cohorts with cystic fibrosis (38), additional studies are necessary to confirm this finding and to elucidate potential mechanisms.

Figure 1) Prevalence of poorly reliable FibroScan (Echosens, France) examinations according to obesity (body mass index [BMI] ≥30 kg/m²), presumed cirrhosis (F4; liver stiffness ≥12.5 kPa) and FibroScan probe (M versus XL). This analysis is limited to 434 patients with available BMI data and successful liver stiffness measurement (ie, failures excluded).
Although previous studies have reported that women have a higher rate of unreliable FibroScan examinations compared with men (18), we did not observe a significant impact of sex using the updated definition of reliability. As previously reported (18), obesity was associated with a nearly threefold risk of unreliable results in a subanalysis of patients with available BMI data. The importance of obesity as a predisposing factor for poorly reliable results is supported by the borderline effect of XL probe use (P=0.08), considered a surrogate marker for obesity in the absence of BMI data in all patients. Presumably, subcutaneous and prehepatic adipose tissue in obese patients interferes with transmission of the mechanical shear wave and/or the measurement of its propagation by the FibroScan device (7).
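The borderline nature of the obesity estimate reported in the Results (OR 2.93 [95% CI 0.95 to 9.05]; P=0.06) can be checked by back-calculating a Wald statistic from the published interval. The Python sketch below assumes that the interval is a Wald-type 95% CI, symmetric on the log-odds scale with a critical value of 1.96; the function name is hypothetical and this is not the authors' Stata output.

```python
from math import erfc, log, sqrt


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE(log OR), the z statistic and the two-sided P value.

    Assumption: the CI was built as exp(log OR +/- z_crit * SE).
    """
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p


se, z, p = wald_from_or_ci(2.93, 0.95, 9.05)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.2f}")
```

Under these assumptions, the recovered P value is close to 0.06, consistent with a lower confidence bound that falls just below 1.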
Previous studies by Lucidarme et al (39) and Myers et al (40) have demonstrated an impact of fibrosis stage on the rate of discordance between fibrosis estimated by LSM and liver biopsy; however, an impact on poorly reliable results has not been reported. In the current study, elevated liver stiffness was an independent predictor of poorly reliable LSM. Presumed cirrhosis was the most important risk factor, with fivefold higher odds of poorly reliable results in cirrhotic patients. This finding is not surprising because an LSM ≥7.1 kPa is one criterion in the definition of poor reliability (19). The other factor, IQR/M, also likely played a role because LSM variability tends to be greater in patients with cirrhosis (data not shown). The reason for this is unclear, but may relate to the broader range of potential LSMs in cirrhotic patients (ie, approximately 12.5 kPa to 75 kPa) compared with those who have lower liver stiffness (ie, 2.5 kPa to 12.4 kPa) (31). Interestingly, coagulopathy was associated with a twofold risk of poorly reliable results. Because there is no clear physical explanation for this finding, we suspect it reflects more severe liver disease and, therefore, higher liver stiffness in coagulopathic patients. The majority of these patients had diagnosis codes for thrombocytopenia as opposed to hereditary or acquired coagulation defects (data not shown). Because the effect of coagulopathy on poorly reliable results was independent of liver stiffness, it likely reflects the imperfect sensitivity of FibroScan for the diagnosis of cirrhosis.
A strength of our study was our analysis of the impact of specific liver conditions and comorbidities on FibroScan reliability using administrative data. In addition to uncovering a novel association between chronic lung disease and poorly reliable results, this approach revealed several other associations. First, patients with HBV had a lower likelihood of unreliable results. Because HBV-infected patients in our practice tend to be Asian and of smaller stature, this finding is not unexpected. In fact, we previously reported no FibroScan failures with the M probe in a study examining the value of the pediatric (S2) probe in this patient population (41). Second, hypertension and diabetes, components of the metabolic syndrome that have been associated with unreliable LSMs in previous studies (18), were significant in unadjusted, but not adjusted, analyses. Similarly, patients with congestive heart failure or arrhythmias had a twofold higher risk of poorly reliable LSM in univariate analyses. These findings may relate to hepatic congestion due to cardiac dysfunction, a well-described cause of liver stiffness overestimation (42). Because the power of our multivariate analysis may have been limited due to a small number of poorly reliable results (n=113), studies that use this novel definition of reliability in larger patient populations will be necessary to confirm or refute these findings.
In addition to patient-related predictors of FibroScan reliability, we examined the impact of procedural characteristics including the operator and operator experience. Castera et al (18) reported a lower risk of unreliable results among seven operators who had performed at least 500 examinations, challenging previous assertions that a novice can consistently obtain reliable results after a short training period of only 50 examinations. In our study, however, there was no difference in the proportion of poorly reliable results between the first 500 and subsequent examinations by our two operators. On the contrary, OP1 was twice as likely to produce a poorly reliable LSM as OP2. This finding supports the importance of adequate operator training and ongoing quality control when using the FibroScan in clinical decision making. It is important to note, however, that these results were confounded by the availability of the FibroScan XL probe only during the final two years of the study. Therefore, OP1, who was employed before OP2, scanned many obese patients using the M probe during the early part of the study. This likely led to an overestimation of her true rate of unreliable results. In fact, OP1 was twice as likely to have FibroScan failure as OP2, presumably for the same reason.
Our study has several limitations that warrant discussion. First, we did not have histological data to confirm whether poorly reliable LSMs were less accurate for staging fibrosis than reliable examinations. Second, because this was a retrospective study, we did not prospectively collect data regarding BMI and other anthropometric measures (eg, waist circumference, thoracic perimeter, skin-capsular distance) that may have influenced FibroScan reliability. Similarly, we relied on administrative data to define patient comorbidities and hepatic diagnoses. Although many of these codes have been validated by medical record review (20,24-28), additional validation is necessary.
DISCLOSURES: Dr Myers was supported by a salary support award from the Canadian Institutes for Health Research (CIHR). Dr Kaplan is supported by salary support awards from the CIHR and Alberta Innovates-Health Solutions (AI-HS). Dr Swain is supported by the Cal Wenzel Family Foundation Chair in Hepatology. This study was supported, in part, by grants from AI-HS, CIHR (#84371) and the Canadian Liver Foundation. This study was based, in part, on data provided by Alberta Health. The interpretation and conclusions contained herein are those of the researchers and do not necessarily represent the views of the Government of Alberta. Neither the Government nor Alberta Health express any opinion in relation to this study. The authors have no financial disclosures or conflicts of interest to declare.