A health technology assessment of transient elastography in adult liver disease

1Institute for Public Health, Department of Community Health Sciences; 2Division of Gastroenterology, Department of Medicine, University of Calgary, Calgary, Alberta
Correspondence: Dr Fiona Clement, University of Calgary, 3rd Floor Teaching Research and Wellness Building, 3280 Hospital Drive Northwest, Calgary, Alberta T2N 4Z6. Telephone 403-210-9426, e-mail fclement@ucalgary.ca
Received for publication July 27, 2012. Accepted August 12, 2012

An estimated one in 10 Canadians has some form of liver disease (1). In adults, liver scarring (ie, fibrosis) is commonly caused by the hepatitis B virus (HBV), hepatitis C virus (HCV), nonalcoholic fatty liver disease (NAFLD), cholestatic liver diseases and complications following liver transplantation (2). Over time, progressive fibrosis can lead to cirrhosis, in which hepatic blood flow becomes disrupted and liver function may become impaired. Cirrhosis can lead to portal hypertension, liver failure and hepatocellular carcinoma (HCC) (3). Cirrhosis and HCC are now among the top 10 causes of death worldwide, with cirrhosis being one of the top five causes of death in middle-aged populations in developing countries (4,5). Early diagnosis and an accurate assessment of a patient’s fibrosis stage are vital in establishing an effective course of treatment. Presently, the reference standard for the assessment of liver fibrosis is biopsy; however, there are risks associated with the procedure including pain, hemorrhagic complications and death (6). Transient elastography (TE) is an emerging ultrasound-based method for the staging of liver fibrosis (7). It is performed noninvasively and without the risks associated with liver biopsy (7). To date, no health technology assessment (HTA) evaluating the clinical and cost effectiveness of TE, compared with liver biopsy, has been conducted. The objective of the present study was to complete an HTA of TE compared with liver biopsy in adult patients with chronic liver disease. The present study included a synthesis of the clinical evidence and an economic evaluation to inform the optimal scope of use of TE in this patient population.

(Echosens, France) and "fibrosis" (see Appendix I for the detailed search strategy).

Inclusion and exclusion criteria
Studies were included if the study population was older than 18 years of age and had liver disease, TE was used, liver biopsy was the comparator, the design was a cohort study, and the study reported test sensitivity and specificity or negative and positive predictive values, or reported sufficient data to calculate these measures of diagnostic test performance. Liver histological results were required to be reported using the METAVIR or a similar classification system. Studies were excluded if they were nonhuman, duplicate publications or preliminary reports, did not report sufficient data to formulate a contingency table, or did not use METAVIR or a similar system. Language was restricted to English or French.
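The measures of diagnostic test performance named above all follow from a 2x2 contingency table of TE results against liver biopsy. The following is a minimal sketch of that arithmetic; the class name `ContingencyTable` and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2x2 table of TE results against liver biopsy (the reference standard)."""

    tp: int  # TE positive, biopsy positive
    fp: int  # TE positive, biopsy negative
    fn: int  # TE negative, biopsy positive
    tn: int  # TE negative, biopsy negative

    def sensitivity(self) -> float:
        # Proportion of biopsy-positive patients that TE detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of biopsy-negative patients that TE clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of TE-positive patients who are biopsy positive.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of TE-negative patients who are biopsy negative.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts, for illustration only.
table = ContingencyTable(tp=40, fp=10, fn=5, tn=45)
print(f"sensitivity={table.sensitivity():.2f}, specificity={table.specificity():.2f}")
print(f"PPV={table.ppv():.2f}, NPV={table.npv():.2f}")
```

A study that reports only predictive values and group sizes can be entered into the same structure once the four cell counts have been back-calculated.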

Data abstraction
Data were extracted by two independent reviewers and any discrepancies were resolved by consensus. A standardized data abstraction form was used to collect information on the study population (age, sex, clinical condition and sample size), methods (randomized controlled trial [RCT] or cohort), interventions (TE with liver biopsy as the comparator), outcomes (reported in kilopascal [kPa] and/or fibrosis stage [F]) and complications. Included studies were assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) quality assessment tool (Appendix II Table 1). The QUADAS tool consists of 14 questions used to determine the quality and accuracy of studies included in systematic reviews of diagnostic accuracy (8).

Statistical analysis
The three primary outcomes of interest were diagnostic test performance of TE for the differentiation of mild (F≤1) from moderate liver disease (F≥2), severe (F≥3) from moderate (F≤2), and cirrhosis (F=4) versus absence of cirrhosis (F≤3) compared with the reference standard of liver biopsy. Patients were classified based on reported fibrosis stage regardless of the kPa threshold used. Threshold values for each outcome were described using the mean, SD and range. The primary meta-analysis was an overall analysis of all liver disease etiologies. A subgroup analysis was conducted for each of the five clinical subgroups defined a priori: HBV, HCV, NAFLD, cholestatic liver disease and post-liver transplantation. Sensitivity and specificity scores were extracted from each study and synthesized using the summary ROC curve (sROC) with confidence and prediction contours. Summary estimates of sensitivity, specificity and area under the sROC (AUROC) were calculated. Diagnostic accuracy was graded as follows: excellent 0.9 to 1.0; strong 0.8 to 0.9; good 0.7 to 0.8; sufficient 0.6 to 0.7; poor 0.5 to 0.6; and test not useful <0.5. Statistical analysis was performed using the MIDAS program with Stata (StataCorp, USA), which estimates the summary statistics using an exact binomial rendition of the bivariate mixed-effects regression model (9,10). Heterogeneity was assessed using forest plots and Galbraith plots, and quantified using the I² statistic, which is defined as the percentage of total variation across studies attributable to heterogeneity beyond that from chance (11,12). Publication bias was assessed using funnel plots and Egger's and Begg's tests (13). Informed by the clinical literature, several potential sources of heterogeneity were examined including mean age, percentage of TE failures, mean body mass index, mean biopsy length, fibrosis prevalence, study size, year of publication and fibrosis stage threshold. An individual metaregression was completed for each of these parameters and those that were significant (P<0.10) were included in the multivariate model. Variables were manually entered in a stepwise approach and retained in the model if significant (P<0.05).
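Two of the quantities described above can be written down directly. The sketch below maps an AUROC value to the grading bands listed in the text and computes I² from Cochran's Q using the conventional formula I² = 100% × (Q − df)/Q. Both function names are hypothetical, the treatment of boundary values (lower bound inclusive) is an assumption because the published bands overlap, and the MIDAS routine may compute its heterogeneity statistics differently.

```python
def grade_accuracy(auroc: float) -> str:
    """Map an AUROC value to the qualitative grades listed in the text.

    Assumption: each band includes its lower bound, so 0.9 grades as
    'excellent'. The text does not say how boundary values are handled.
    """
    if not 0.0 <= auroc <= 1.0:
        raise ValueError("AUROC must lie between 0 and 1")
    bands = [
        (0.9, "excellent"),
        (0.8, "strong"),
        (0.7, "good"),
        (0.6, "sufficient"),
        (0.5, "poor"),
    ]
    for lower_bound, label in bands:
        if auroc >= lower_bound:
            return label
    return "test not useful"


def i_squared(q: float, df: int) -> float:
    """I^2 (%) from Cochran's Q and its degrees of freedom, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# AUROC values reported in the meta-analysis for F>=2, F>=3 and F=4.
for auroc in (0.88, 0.92, 0.94):
    print(auroc, grade_accuracy(auroc))

# Hypothetical Q statistic and degrees of freedom, for illustration only.
print(f"I^2 = {i_squared(q=40.0, df=10):.0f}%")
```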

Cost-effectiveness analysis
A primary economic evaluation was completed using a simple decision model to assess the cost per correct diagnosis of TE compared with liver biopsy (Figure 1). According to this model, a patient would undergo either TE or liver biopsy. Fibrosis prevalence was used to represent the likelihood that the patient had liver fibrosis. Based on the diagnostic accuracy of TE, the patient was classified as a true positive, false positive, true negative or false negative. True positives and true negatives were considered to be correct diagnoses. In the base case scenario, patients who undergo TE do not continue to liver biopsy because the model only considers cost per 'correct' diagnosis. The impact of sequential liver biopsy was explored in a threshold analysis.
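The decision model described above can be sketched as a split of a hypothetical cohort of 1000 patients into the four diagnostic outcomes. The names `CohortOutcomes` and `classify_cohort` are illustrative, and the prevalence of 0.20 is a hypothetical input; the sensitivity and specificity are the NAFLD F=4 summary estimates reported in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortOutcomes:
    """Expected number of patients in each branch of the decision model."""

    true_positive: float
    false_positive: float
    true_negative: float
    false_negative: float

    @property
    def correct(self) -> float:
        # Only true positives and true negatives count as correct diagnoses.
        return self.true_positive + self.true_negative


def classify_cohort(prevalence: float, sensitivity: float,
                    specificity: float, cohort: int = 1000) -> CohortOutcomes:
    """Split a cohort by disease status, then by test result."""
    diseased = cohort * prevalence
    healthy = cohort - diseased
    return CohortOutcomes(
        true_positive=diseased * sensitivity,
        false_positive=healthy * (1 - specificity),
        true_negative=healthy * specificity,
        false_negative=diseased * (1 - sensitivity),
    )


# Prevalence is hypothetical; sensitivity and specificity are the NAFLD F=4 estimates.
te = classify_cohort(prevalence=0.20, sensitivity=0.92, specificity=0.95)
# Liver biopsy is assumed to be perfectly accurate in the model.
biopsy = classify_cohort(prevalence=0.20, sensitivity=1.0, specificity=1.0)
print(f"TE correct per 1000: {te.correct:.0f}")
print(f"Biopsy correct per 1000: {biopsy.correct:.0f}")
```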

Target population, comparators, perspective and time horizon
The economic model compared the number of correct diagnoses using TE versus liver biopsy. Following recommended guidelines, the perspective adopted was that of the health care payer (14). The time horizon was from screening to result of the test because only the cost per correct diagnosis was considered. The therapeutic and treatment outcomes for long-term care were not considered because it was unlikely that the use of TE or liver biopsy would affect these outcomes. No discounting was used due to the short time frame. The diagnostic accuracy and prevalence of fibrosis vary with each disease state; therefore, 15 target populations were identified: five clinical subgroups (HBV, HCV, NAFLD, cholestatic liver disease and liver transplant) combined with three fibrosis stages (F≥2, F≥3 and F=4). The input values varied with each subgroup.

Clinical inputs: diagnostic accuracy and fibrosis prevalence
The economic model assumed that the sensitivity and specificity of liver biopsy were 1.0 (perfect accuracy). Both the prevalence of fibrosis according to disease and the diagnostic accuracy of TE were informed by the clinical meta-analysis. For each article, the prevalence of fibrosis was estimated by dividing the number of diseased patients by the total number tested. A weighted average was then calculated for each subgroup. The clinical meta-analysis provided the sensitivity and specificity for each subgroup.
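The prevalence calculation above can be illustrated as follows. The text does not state which weights were used, so this sketch assumes weighting by study sample size, which is equivalent to pooling the counts across studies. The function name and the study counts are hypothetical.

```python
from typing import Iterable, Tuple


def weighted_prevalence(studies: Iterable[Tuple[int, int]]) -> float:
    """Sample-size-weighted mean prevalence.

    Each study is a (number diseased, number tested) pair. Weighting each
    study's prevalence by its sample size is an assumption; it reduces to
    total diseased divided by total tested.
    """
    studies = list(studies)
    total_tested = sum(tested for _, tested in studies)
    weighted_sum = sum((diseased / tested) * tested for diseased, tested in studies)
    return weighted_sum / total_tested


# Hypothetical studies: per-study prevalence of 0.30 and 0.15.
example_studies = [(30, 100), (45, 300)]
print(f"Weighted prevalence: {weighted_prevalence(example_studies):.4f}")
```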

Resource use and costs
All costs are reported in 2010 Canadian dollars. Costs were inflated using the Statistics Canada general consumer price index. Only direct health care costs were considered. Societal costs may differ between TE and liver biopsy. For example, there are an additional physician visit, prescreening bloodwork and additional time off work associated with liver biopsy. However, these costs were excluded from the analysis. Therefore, the overall cost of liver biopsy was underestimated. The cost of liver biopsy was $461.30 based on the available Canadian literature (7). For TE, the cost of the device, annual maintenance costs and the physician cost were included. In the base case, the cost of the device is amortized over an anticipated lifetime of seven years, with an annual utilization rate based on the 2010 average of three Canadian centres performing liver biopsy (7). The cost of TE was estimated to be $99.44 based on the assumptions outlined in Appendix II Table 2. Finally, the economic model assumed that all liver biopsies and TE procedures would be completed within the existing infrastructure; therefore, no capital costs were included in the model (ie, cost of operating room for liver biopsy, cost of maintaining the operating room, cost of room for TE device, etc).
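The structure of the per-procedure TE cost can be sketched as below, assuming straight-line amortization of the device with no discounting. The actual inputs are those in Appendix II Table 2 and are not reproduced here; every number in the example is a hypothetical placeholder, so the output is not expected to match the reported $99.44.

```python
def te_cost_per_procedure(device_cost: float, lifetime_years: float,
                          annual_maintenance: float, procedures_per_year: float,
                          physician_fee: float) -> float:
    """Per-procedure TE cost under straight-line amortization (an assumption)."""
    annual_device_cost = device_cost / lifetime_years
    fixed_cost_per_procedure = (annual_device_cost + annual_maintenance) / procedures_per_year
    return fixed_cost_per_procedure + physician_fee


# Hypothetical placeholder inputs, NOT the values from Appendix II Table 2.
inputs = dict(device_cost=120_000.0, annual_maintenance=5_000.0,
              procedures_per_year=1_000.0, physician_fee=50.0)

# Vary the device lifetime as in the sensitivity analysis (5, 7 and 10 years).
for years in (5, 7, 10):
    cost = te_cost_per_procedure(lifetime_years=years, **inputs)
    print(f"lifetime {years} years: ${cost:.2f} per procedure")
```

Under this structure, a shorter device lifetime or lower annual utilization raises the cost per procedure, which is the direction of effect explored in the sensitivity analyses.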

Variability and uncertainty
Various sensitivity analyses were completed to explore the impact of the assumptions on the cost per correct diagnosis. The published Canadian cost of liver biopsy was substantially lower than that reported in other countries. Thus, the costs of liver biopsy were varied to represent costs in the United States and Europe. The cost of the ultrasound machine was amortized over five and, subsequently, 10 years to explore the impact of varying the lifetime of a TE device. The annual utilization of TE was varied to reflect the impact of increased utilization over time. In addition, a threshold analysis was conducted to determine the required likelihood of a patient undergoing liver biopsy after undergoing TE for TE to become the less economically attractive option (ie, the same cost as liver biopsy alone, but less clinically effective). Finally, because sensitivity, specificity and prevalence are linked concepts and cannot be varied independently, a probabilistic sensitivity analysis was performed. Normal distributions were used for each of the three variables and 95% CIs for the cost per correct diagnosis were reported.
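A probabilistic sensitivity analysis of this kind can be sketched with a simple Monte Carlo loop. The example below draws sensitivity, specificity and prevalence from normal distributions, varies all three simultaneously, and reports a percentile interval for the incremental cost per correct diagnosis. Several elements are assumptions not stated in the text: the SDs, the prevalence mean, the clipping of draws to the range 0 to 1, the absence of correlation between the draws, and the percentile method for the interval. The costs and the HCV F=4 sensitivity and specificity are taken from this paper.

```python
import random

COST_BIOPSY = 461.30  # 2010 Canadian dollars, from the text
COST_TE = 99.44       # 2010 Canadian dollars, from the text


def draw_probability(rng: random.Random, mean: float, sd: float) -> float:
    """Normal draw clipped to [0, 1] (the clipping is an assumption)."""
    return min(1.0, max(0.0, rng.gauss(mean, sd)))


def simulate_icers(n_draws, sens, spec, prev, seed=0):
    """Return sorted draws of the incremental cost per correct diagnosis.

    sens, spec and prev are (mean, sd) pairs.
    """
    rng = random.Random(seed)
    icers = []
    for _ in range(n_draws):
        s = draw_probability(rng, *sens)
        sp = draw_probability(rng, *spec)
        p = draw_probability(rng, *prev)
        accuracy_te = p * s + (1 - p) * sp
        missed = 1.0 - accuracy_te  # correct diagnoses gained per patient by biopsy
        # If TE misses nothing, biopsy adds cost without benefit: TE is dominant.
        icers.append(float("inf") if missed <= 0 else (COST_BIOPSY - COST_TE) / missed)
    return sorted(icers)


def percentile_interval(sorted_values, lower=0.025, upper=0.975):
    """Simple percentile interval from already sorted draws."""
    n = len(sorted_values)
    return sorted_values[int(lower * (n - 1))], sorted_values[int(upper * (n - 1))]


# HCV F=4 estimates from the text; SDs and prevalence are hypothetical.
draws = simulate_icers(2000, sens=(0.85, 0.03), spec=(0.91, 0.03), prev=(0.20, 0.03), seed=1)
low, high = percentile_interval(draws)
print(f"95% interval: ${low:,.0f} to ${high:,.0f} per correct diagnosis gained")
```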

Literature search
The literature search yielded 1753 abstracts, 130 of which were considered for full-text review. Fifty-seven articles were included for analysis (Figure 2). Table 1 provides an overview of the characteristics of each included study according to clinical condition. Most studies were of high quality, with 78% of studies scoring 14/14 using the QUADAS tool (Appendix II Table 1). The lowest score was 10/14.

Meta-analysis
The AUROC values of TE according to fibrosis classification across all liver disease categories were 0.88 (95% CI 0.84 to 0.91) for F≥2 (n=45 studies), 0.92 (95% CI 0.89 to 0.94) for F≥3 (n=35) and 0.94 (95% CI 0.91 to 0.96) for F=4 (n=49) (Table 2). The sROC plots for each [...]
[Table 1: characteristics of the included studies according to clinical condition]
The summary sensitivity and specificity estimates for TE compared with liver biopsy for each clinical condition and fibrosis stage are presented in Table 3 (an insufficient number of cholestatic liver disease studies were identified for meta-analysis). Diagnostic accuracy for F≥2 was good for HBV (sensitivity 0.77; specificity 0.72), HCV (sensitivity [...]
For the two clinical conditions assessed in the F≥3 category (HBV and HCV), diagnostic accuracy was strong, with sensitivities of 0.83 and 0.88, and specificities of 0.81 and 0.91, respectively. The diagnostic accuracy for F=4 was sufficient for HBV (sensitivity 0.67; specificity 0.87) and strong to excellent for HCV (sensitivity 0.85; specificity 0.91) and NAFLD (sensitivity 0.92; specificity 0.95), respectively. In individual metaregression models, biopsy length, study size, year of publication and fibrosis stage cut-off were not statistically significant predictors of heterogeneity in any of the analyses. In the multiple metaregression model for the F≥2 subgroup, mean age (P=0.005) and percentage of failures (P=0.012) were simultaneously statistically significant predictors. In the F≥3 subgroup, only mean age was statistically significant (P=0.024) and, in the F=4 subgroup, no variables were significant at P<0.05.

Economic evaluation
Cost-effectiveness results: Liver biopsy is more expensive, albeit more effective, than TE in all disease and fibrosis stage subgroups (Table 4). Because liver biopsy is considered to be the reference standard, the model assumed it correctly diagnosed 100% of patients (1000 of the 1000 hypothetical cohort). On average, liver biopsy costs $362 more per procedure than TE. The additional cost per correct diagnosis using liver biopsy compared with TE varied from $1,427 to $7,030 depending on the disease group considered.
Sensitivity analysis: One-way sensitivity analysis was completed on the cost of liver biopsy and TE. As the cost of liver biopsy increased, the cost per correct diagnosis increased. As the cost of TE increased due to either decreased utilization or decreased life span of the device, the cost per correct diagnosis of liver biopsy decreased. Similarly, as the cost of TE decreased, the cost per correct diagnosis of liver biopsy increased. However, none of the incremental cost-effectiveness ratios varied significantly with any of the variables explored.
Threshold analysis: In a scenario analysis, the likelihood of undergoing liver biopsy after TE was considered. If the probability of undergoing a liver biopsy, regardless of TE result, was greater than 78%, liver biopsy became the dominant option (ie, liver biopsy costs the same as TE, but gains greater clinical benefit).
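The headline figures above can be checked with a few lines of arithmetic. The sketch below uses the two reported per-procedure costs to recover the $362 difference and the 78% threshold, under the assumption that each follow-up biopsy adds the full biopsy cost to the cost of TE. The function names and the example count of 900 correct TE diagnoses per 1000 are hypothetical.

```python
COST_BIOPSY = 461.30  # 2010 Canadian dollars, from the text
COST_TE = 99.44       # 2010 Canadian dollars, from the text

INCREMENTAL_COST = COST_BIOPSY - COST_TE  # extra cost of biopsy per procedure


def incremental_cost_per_correct_diagnosis(te_correct_per_1000: float) -> float:
    """Extra cost of biopsy per additional correct diagnosis it yields.

    Biopsy is assumed to diagnose all 1000 patients correctly, so the gain
    is the number of patients TE misclassifies.
    """
    gained_per_1000 = 1000 - te_correct_per_1000
    return INCREMENTAL_COST * 1000 / gained_per_1000


def breakeven_biopsy_probability() -> float:
    """Probability p of biopsy after TE at which both strategies cost the same.

    Solves COST_TE + p * COST_BIOPSY = COST_BIOPSY for p.
    """
    return 1 - COST_TE / COST_BIOPSY


print(f"Incremental cost per procedure: ${INCREMENTAL_COST:.2f}")
print(f"Break-even probability of follow-up biopsy: {breakeven_biopsy_probability():.1%}")
# Hypothetical subgroup in which TE correctly diagnoses 900 of 1000 patients.
print(f"Cost per correct diagnosis gained: ${incremental_cost_per_correct_diagnosis(900):,.2f}")
```

Because the incremental cost is fixed, the ratio depends only on how many patients TE misclassifies: the fewer TE misses, the more each additional correct diagnosis from biopsy costs.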

Probabilistic sensitivity analysis:
The 95% CIs resulting from the probabilistic sensitivity analysis of sensitivity, specificity and prevalence of fibrosis are presented in Table 4. As expected, all three variables affect the resulting cost per correct diagnosis, producing wide CIs. Of note, the CIs for NAFLD F=4 (95% CI 509 to dominant) and cholestatic liver disease F=4 (95% CI 514 to dominant) included TE as the dominant option, meaning that it was less expensive than liver biopsy and equally effective.

Discussion
The overall results of the meta-analysis suggest that TE, compared with liver biopsy, had summary sensitivities and specificities greater than 80%, with AUROC values close to 0.9 for all three fibrosis categories.
Although the results of the subgroup analysis were similar, most of the present research focused on HCV. There was an insufficient number of studies to assess the efficacy of TE in hepatitis A, cholestatic liver disease and for fibrosis stages F≥3 and F=4 in liver transplant; therefore, additional validation should be considered for these groups. Subgroup analyses indicated heterogeneity across the different disease categories and fibrosis stages. Metaregression indicated that mean age (P=0.005) and percentage of failures (P=0.012) were statistically significant predictors of heterogeneity in the F≥2 subgroup, whereas, in the F≥3 subgroup, only mean age was statistically significant (P=0.024) and, in the F=4 subgroup, no variables were significant at P<0.05.
The estimated cost of liver biopsy used in our models was $461 per procedure. This is an additional $362 per procedure when compared with TE. The additional cost per correct diagnosis using liver biopsy compared with TE varied from $1,427 to $7,030 depending on the subgroup considered. The results were robust to plausible variations in all variables considered.
Four meta-analyses and five scanning reports identified through our search reported findings similar to our own (7,15-22). However, the previous meta-analyses were limited by the subgroups considered and the date of the searches. Our work included five major clinical subgroups (HBV, HCV, NAFLD, cholestatic liver disease and post-transplantation) and the most current literature available. The present HTA was novel in that it assessed both the diagnostic accuracy of TE and its cost effectiveness. Previous work had focused on either the clinical effectiveness of TE or the economic value separately. The present analysis of the clinical application of TE compared with liver biopsy is consistent with previous systematic reviews: TE demonstrated strong diagnostic accuracy for F≥2 with an AUROC value of 0.88 (95% CI 0.84 to 0.91); and excellent diagnostic accuracy with AUROC values of 0.92 (95% CI 0.89 to 0.94) for F≥3 and 0.94 (95% CI 0.91 to 0.96) for F=4.
The diagnostic accuracy of TE for F≥2, F≥3 and F=4 makes it a cost-effective alternative to liver biopsy. Liver biopsy costs $362 more per procedure than TE, with the cost per correct diagnosis ranging from $1,427 to $7,030 depending on the clinical condition. These cost savings were lost if more than 78% of TE procedures were followed up with liver biopsy. Furthermore, the cost effectiveness of TE was reduced by underutilization or if the lifespan of the TE device was less than seven years.
The present HTA does have some limitations. Despite the comprehensive search strategy that was used, we were limited by the available literature. An example of this is the preponderance of HCV studies; therefore, the validation of TE in other liver diseases, such as hepatitis A and cholestatic liver diseases, is required. Another potential limitation was that intention to treat was not assessed as a quality parameter; therefore, the results of some studies may have been biased toward patients with desired outcomes. The economic model, as with all models, was also limited by the available data. Of note was the use of observational data to inform the diagnostic accuracy and prevalence estimates. Ideally, these estimates would be taken from an RCT to minimize selection bias. However, in this case, an RCT is unlikely to be performed; hence, we were limited to cohort data. In addition, the economic model does not consider operational costs required to perform liver biopsies or TE (ie, operating room costs, nursing salaries, office space for gastroenterologists, etc). However, exclusion of these costs is likely to underestimate the cost of liver biopsy, making TE an even more economically attractive option. Furthermore, our model did not include societal costs or patient preferences. Again, these exclusions are likely to bias the results in favour of liver biopsy, which requires more patient time and is less preferable due to patient discomfort, risks and invasiveness.
Future research should consider investigating the efficacy of TE versus liver biopsy in monitoring fibrosis progression. The common practice in Alberta is to use TE to assess a patient with fibrosis every year, and liver biopsy every three to five years. If liver biopsy maintains its diagnostic accuracy, will TE still be considered the more cost-effective option over longer-term horizons?

Conclusions
TE is an accurate and cost-effective technology for diagnosis in patients with moderate fibrosis or cirrhosis. Although TE is less effective than liver biopsy, it is also less expensive, less invasive and safer than liver biopsy. Based on our results, systemic implementation of TE should be considered for the noninvasive assessment of liver fibrosis.

Disclosures: Supported by a financial contribution from Alberta Health through the Alberta Health Technologies Decision Process: the Alberta model for health technology assessment and policy analysis. Alberta Health had no involvement in the design, data collection or interpretation of the findings. The views presented here do not represent the views of Alberta Health. Dr Myers is supported by salary support awards from the Canadian Institutes for Health Research and Alberta Heritage Foundation for Medical Research (now Alberta Innovates - Health Solutions).

Figure 1) Decision model based on the diagnostic accuracy of transient elastography. Patients were classified as true positive (+), false positive, true negative (-) or false negative. True positives and true negatives were considered to be correct diagnoses

FN False negative; FP False positive; NPV Negative predictive value; PPV Positive predictive value; ref Reference; S Sensitivity; Sp Specificity; TN True negative; TP True positive; yrs Years

*Calculations did not converge; †Insufficient number of studies for analysis. HBV Hepatitis B virus; HCV Hepatitis C virus; NAFLD Nonalcoholic fatty liver disease

TABLE 4 Cost per correct diagnosis using liver biopsy compared with transient elastography (TE)
Column headings: Disease; Fibrosis stage; Correct diagnoses using TE (per 1000), n; Incremental correct diagnoses using liver biopsy (per 1000), n; Incremental cost per correct diagnosis using liver biopsy compared with TE, $/correct diagnosis gained; $/correct diagnosis gained (95% CI)
Search terms used to search other electronic databases and grey literature web sites will be derived and adapted from the MEDLINE search outlined above.

APPENDIX II
TABLE 1 Quality assessment tool for diagnostic accuracy studies (QUADAS)
Column headings: First author (reference), year; then the 14 QUADAS items:
1. Was the spectrum of patients representative of the patients who will receive the test in practice?
2. Were selection criteria clearly described?
3. Is the reference standard likely to correctly classify the target condition?
4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests?
5. Did the whole sample, or a random selection of the sample, receive verification using a reference standard of diagnosis?
6. Did patients receive the same reference standard regardless of the index test result?
7. Was the reference standard independent of the index test?
8. Was the execution of the index test described in sufficient detail to permit replication of the test?
9. Was the execution of the reference standard described in sufficient detail to permit its replication?
10. Were the index test results interpreted without knowledge of the results of the reference standard?
11. Were the reference standard results interpreted without knowledge of the results of the index test?
12. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice?
13. Were uninterpretable/intermediate test results reported?
14. Were withdrawals from the study explained?
CADTH Canadian Agency for Drugs and Technologies in Health; NHS National Health Service; SOMB Schedule of Medical Benefits