Survival after locoregional treatments for hepatocellular carcinoma: a cohort study in real-world patients.

Evidence on the relative effectiveness of local treatments for hepatocellular carcinoma (HCC) is scanty. We investigated, in a retrospective cohort study, whether surgical resection, radiofrequency ablation (RFA), percutaneous ethanol injection (PEI), and transarterial embolization with (TACE) or without (TAE) chemotherapy resulted in different survival in clinical practice. All patients first diagnosed with HCC and treated with any locoregional therapy from 1998 to 2002 in twelve Italian hospitals were eligible. Overall survival (OS) was the sole endpoint. Three main comparisons were planned: RFA versus PEI, surgical resection versus RFA/PEI (combined), and TACE/TAE versus RFA/PEI (combined). The propensity score method was used to minimize bias related to nonrandom treatment assignment. Overall, 425 subjects were analyzed, with 385 (91%) deaths after a median followup of 7.7 years. OS did not significantly differ between RFA and PEI (HR 1.11, 95% CI 0.79-1.57), between surgery and RFA/PEI (HR 0.95, 95% CI 0.64-1.41), or between TACE/TAE and RFA/PEI (HR 0.88, 95% CI 0.66-1.17). Five-year OS probabilities were 0.14 for RFA, 0.18 for PEI, 0.27 for surgery, and 0.15 for TACE/TAE. No locoregional treatment for HCC was found to be more effective than its comparator. Adequately powered randomized clinical trials are still needed to definitively assess the relative effectiveness of locoregional HCC treatments.


Introduction
Locoregional treatments are the mainstay of treatment of early stage hepatocellular carcinoma (HCC) [1][2][3]. Surgical resection should be particularly considered for patients with solitary tumours and well-preserved liver function. Transarterial embolization with (TACE) or without (TAE) chemotherapy is recommended for intermediate stage HCC patients who are ineligible for surgery or percutaneous ablation [1][2][3]. However, evidence on the relative effectiveness of local treatments for HCC in terms of survival is scanty, owing to the paucity of clinical trials and the shortness of followup.
A meta-analysis [4] that compared surgery with ablative treatments in the subgroup of tumors >3 cm found a survival benefit of surgery, but concluded that the level of evidence was low and that further randomized controlled trials (RCTs) were needed.
Two meta-analyses [5,6] compared radiofrequency ablation (RFA) with percutaneous ethanol injection (PEI) and found a slight survival improvement with RFA over PEI, but relied upon few small trials with a short followup period and a limited number of events.
Three meta-analyses [7][8][9] assessed the efficacy of TACE/TAE versus supportive care, but ended up with contrasting results. Geschwind et al. failed to show a survival advantage versus supportive care alone and emphasised the poor quality of published trials [7]. Cammà et al. claimed that both TACE and TAE significantly reduced overall 2-year mortality, but the magnitude of benefit was relatively small [8]. Llovet and Bruix found that arterial embolization improved 2-year survival versus control, and this benefit was significant for TACE but not for TAE [9].
In general, these studies were conducted in specialized reference centers in well-selected patients. In this observational cohort study we assessed the relative effectiveness on long-term survival of locoregional treatments for HCC in real-world patients.

Study Subjects.
The study had a retrospective cohort design. All patients first diagnosed with HCC (ICD-9 155.0) and treated with any locoregional therapy from 1998 to 2002 in public hospitals of Campania, southern Italy, were potentially eligible. Candidate patients were retrieved from the Discharge Information System of the Regional Health Service; eligibility criteria were subsequently checked by reviewing clinical records. Child-Pugh class C, presence of portal vein thrombosis, massive tumour morphology, and liver involvement greater than 50% were exclusion criteria. The time interval was chosen a priori to allow adequate followup.
The study protocol was approved by the ethics committees of all the participating institutions.

Endpoint and Covariates.
Overall survival (OS) was the sole outcome measure and was defined as the time from the date of the first local intervention until death from any cause or until the date of last followup. Date of death was ascertained from the administrative registry offices of patients' towns of residence.
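The endpoint definition above can be sketched in a few lines of Python. This is only an illustration of the definition, not the study's data-management code: the function name, the conversion to years with 365.25 days, and the example dates are all assumptions made here.

```python
from datetime import date

def overall_survival(first_treatment, death=None, last_followup=None):
    """Return (time_in_years, event) for one patient.

    event is 1 if the patient died from any cause, 0 if the patient
    was censored at the date of last followup.
    """
    if death is not None:
        end, event = death, 1
    elif last_followup is not None:
        end, event = last_followup, 0
    else:
        raise ValueError("need a death date or a last-followup date")
    if end < first_treatment:
        raise ValueError("end date precedes first treatment")
    # 365.25 days per year is an illustrative convention, not taken from the paper
    return (end - first_treatment).days / 365.25, event

# Hypothetical patients (dates invented for illustration only)
died = overall_survival(date(2000, 3, 1), death=date(2003, 3, 1))
censored = overall_survival(date(2001, 6, 15), last_followup=date(2009, 6, 15))
```

Each patient thus contributes a follow-up time and an event indicator, which is the input format expected by the survival methods described in the statistical analysis.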
Baseline demographic, clinical, and tumour-related variables were derived from clinical records. The CLIP prognostic score [10,11], used for statistical adjustment, was calculated a posteriori from information reported in clinical records. Performance status was very rarely reported, so the Barcelona Clinic Liver Cancer [3] staging could not be assessed.
Three main comparisons were planned: RFA versus PEI, surgery versus RFA or PEI, and TACE/TAE versus RFA or PEI. To minimize biases related to nonrandom assignment we used the propensity score method [12][13][14], where the relationship between treatment and survival is adjusted for the patient's likelihood of receiving that therapy given his/her prognostic profile.

Statistical Analysis
For each comparison the primary multivariable analysis was performed by a Cox proportional hazards model with compared treatments and propensity score as covariates, stratified by the number of missing values in the CLIP score components. The propensity score was estimated for each comparison by a logistic regression model that included, as covariates, age, sex, CLIP prognostic score, and number of missing components of the CLIP score [15]. Only subjects with overlapping values of propensity score were analysed for each main comparison. The proportional hazards assumption was checked by graphical inspection [16].
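The propensity score and the restriction to overlapping values can be illustrated with a minimal Python sketch. It is not the authors' R code. The coefficients, the covariate coding, and the patients are hypothetical, and the overlap rule used here (keep scores lying between the larger of the two group minima and the smaller of the two group maxima) is one common choice assumed for illustration, since the text does not state the exact rule.

```python
import math

def propensity(covariates, coefficients, intercept):
    """Logistic-model probability of receiving the index treatment."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))

def common_support(scores, treated):
    """Indices of subjects whose score lies in the range shared by both groups."""
    t = [s for s, z in zip(scores, treated) if z == 1]
    c = [s for s, z in zip(scores, treated) if z == 0]
    low, high = max(min(t), min(c)), min(max(t), max(c))
    return [i for i, s in enumerate(scores) if low <= s <= high]

# Hypothetical coefficients for (age in decades, male sex, CLIP score,
# number of missing CLIP components); in practice they come from a fitted model
coef, intercept = [-0.2, 0.1, -0.5, -0.3], 1.0
patients = [(6.5, 1, 0, 0), (7.0, 0, 1, 0), (7.2, 1, 2, 1), (5.8, 1, 0, 0), (8.0, 0, 3, 2)]
treated = [1, 0, 1, 1, 0]

scores = [propensity(x, coef, intercept) for x in patients]
kept = common_support(scores, treated)   # subjects retained for the comparison
```

Subjects outside the shared range have no counterpart with a similar score in the other treatment group, which is why they are left out of the comparison.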
As a sensitivity analysis, further statistical models were fitted to assess the consistency of results [17][18][19][20]: (i) modelling the propensity score with cubic regression splines in order to obviate the need for assuming a linear effect [17], (ii) stratifying the model by subclasses defined by propensity score quintiles [18], (iii) weighting each subject by the inverse of the individual probability of receiving the treatment actually received, estimating variance via the empirical sandwich method [19], and (iv) substituting the CLIP score in the Cox model with its components [20].
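Two of the adjustment modalities listed above, quintile stratification (ii) and inverse probability weighting (iii), reduce to simple operations on the estimated propensity scores. The sketch below shows only these two steps, with hypothetical function names and made-up scores; the weighted or stratified Cox models and the sandwich variance are not reproduced here.

```python
def iptw_weights(scores, treated):
    """Inverse probability of treatment weights.

    A treated subject with score e gets weight 1/e; an untreated
    subject gets 1/(1 - e), i.e. the inverse of the probability of
    the treatment actually received.
    """
    return [1.0 / e if z == 1 else 1.0 / (1.0 - e) for e, z in zip(scores, treated)]

def quintile_strata(scores):
    """Assign each subject to one of five rank-based strata (0 to 4)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = min(4, rank * 5 // len(scores))
    return strata

# Hypothetical propensity scores and treatment indicators
scores = [0.10, 0.25, 0.40, 0.50, 0.60, 0.75, 0.80, 0.20, 0.55, 0.90]
treated = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1]

weights = iptw_weights(scores, treated)
strata = quintile_strata(scores)
```

Subjects who received a treatment that was unlikely given their profile get large weights, so extreme scores can dominate a weighted analysis; this is one reason for checking that the different modalities give consistent results.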
Unadjusted cumulative survival curves were drawn with the Kaplan-Meier (K-M) method and compared by the Mantel-Haenszel test (MH) and the Peto and Peto modification of the Wilcoxon rank sum test (WPP). The two tests give different weights to events, the second one giving more weight to earlier events.
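The difference between the two tests can be made concrete with a small, self-contained Python sketch of the Kaplan-Meier estimator and of a two-sample weighted log-rank statistic. It is an illustration under stated assumptions, not the R routine used in the study: the MH test is taken as the log-rank test with unit weights, the Peto and Peto weights are taken as a pooled survival estimate with n + 1 in the denominator (software implementations may use slightly different weights), and the data are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for x in times if x >= t)                        # at risk
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        surv *= 1.0 - d / n
        curve.append((t, surv))
    return curve

def weighted_logrank(times, events, groups, peto=False):
    """Two-sample weighted log-rank test; returns (chi-square, P value)."""
    surv = 1.0   # pooled survival estimate, used only as the Peto-Peto weight
    u = v = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        surv *= 1.0 - d / (n + 1.0)
        w = surv if peto else 1.0          # unit weights give the log-rank test
        u += w * (d1 - d * n1 / n)         # observed minus expected in group 1
        if n > 1:
            v += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = u * u / v
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))   # chi-square with 1 df

# Hypothetical data: group 1 has the earliest and latest deaths, group 0 the middle ones
times = [1, 2, 3, 10, 11, 12, 4, 5, 6, 7, 8, 9]
events = [1] * 12
groups = [1] * 6 + [0] * 6

curve = kaplan_meier(times, events)
logrank = weighted_logrank(times, events, groups)
peto = weighted_logrank(times, events, groups, peto=True)
```

Because the two calls differ only in the weights, comparing their results shows how much a conclusion depends on when the events occur during followup.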
Since guidelines [1,2] suggest that treatments could have different effects in particular subgroups of subjects, for each comparison we repeated analyses in predefined subgroups of subjects.
All analyses were performed with R software, version 2.9.1 (R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. 2009).

Results
Overall, 441 HCC patients discharged between January 1998 and December 2002 were eligible. Sixteen cases were excluded because of lack of any follow-up information; thus the final study sample comprised 425 subjects. Baseline characteristics of the 425 patients are reported in Table 1. PEI was the most common treatment (60%), followed by TACE/TAE (19%), while surgical resection and RFA were performed in fewer subjects. Three patients received both PEI and RFA at the same time and were excluded only from the comparison of PEI versus RFA; eight patients received other local treatments (laser therapy) and were excluded from all comparisons. On the whole, prognostic factors did not differ much among treatments, although seemingly surgery was performed in
Table 1: Baseline characteristics of the study patients by treatment. Data are reported as absolute numbers (percentages), except for age and AFP.
Table 2 reports the results of the three multivariable primary analyses. No significant difference in overall survival was found for any of the three planned comparisons.
The HR of surgery versus the two percutaneous ablation therapies combined was 0.95 (95% CI 0.64 to 1.41) (Figure 1). The HR of TACE/TAE versus the two percutaneous ablation therapies combined was 0.88 (95% CI 0.66 to 1.17, P = 0.38). Estimated probabilities of being alive at 5 years were 0.15 (95% CI 0.10 to 0.24) and 0.17 (95% CI 0.13 to 0.22) for TACE/TAE and RFA/PEI, respectively. In the univariate analysis, the MH test did not reveal differences between arms (P = 0.44), while the WPP test was statistically significant (P = 0.03) (Figure 1), highlighting an early prognostic advantage for RFA/PEI that later disappeared.
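As a rough cross-check of figures of this kind, a Wald P value can be recovered from a hazard ratio and its 95% confidence interval, assuming the interval is symmetric on the log scale. The sketch below is not part of the original analysis; the function name is hypothetical, and because the published values are rounded the result is only approximate.

```python
import math

def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Two-sided Wald P value implied by a hazard ratio and its 95% CI."""
    # Standard error of log(HR), from the width of the CI on the log scale
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2.0))

# Rounded estimates taken from the text
p_tace = p_from_hr_ci(0.88, 0.66, 1.17)      # TACE/TAE versus RFA/PEI
p_surgery = p_from_hr_ci(0.95, 0.64, 1.41)   # surgery versus RFA/PEI
```

With the rounded TACE/TAE estimates the recomputed P value is about 0.38, in line with the reported one.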
Superimposable results were found for all comparisons in sensitivity analyses, where other adjustment modalities were applied (Table 3).
Results of univariate analyses in predefined subgroups of subjects for the three comparisons are reported in Figure 2. For every comparison, results in the study subgroups were similar to the overall analyses without any evidence of heterogeneity.

Discussion
This observational study in a clinical practice setting did not find survival differences between local treatments in any of the study comparisons. Clearly, robust evidence of treatment efficacy may only result from adequately powered randomized clinical trials.
In this study we addressed the potential biases of observational studies in several ways. First, we pursued a population-based approach, identifying the reservoir of potentially eligible patients from an independent source (the Discharge Information System of the Campania Regional Health Service), thus reducing the risk of selection bias. In addition, we chose survival as the sole endpoint of the study, as recommended when the effectiveness of therapies is compared [1,2]. To remove ascertainment bias, the date of death was independently derived from the administrative death registries. We were unable to assess the outcome in only 16 subjects, because of migration or mistakes in residence information. Finally, we counteracted indication bias (i.e., patients' selection for different therapies) by adjusting comparisons for the propensity score [12][13][14] (i.e., the probability of receiving a given therapy conditional on the patient's individual prognostic profile).
A major strength of our study is the length of followup, with a large number of deaths observed (91% of the whole sample), which allowed a complete picture of the survival experience of the study cohort. To our knowledge, our cohort is the largest reported in the literature for this kind of study after that of Arii et al., who used a population-based approach starting from a nationwide survey in Japan [21]. Furthermore, from a methodological viewpoint, we assessed whether results persisted under possible violations of the statistical assumptions by repeating the analyses with several adjustment modalities. The consistency of results across different models reinforces their validity, although some residual confounding could still be present, due to unknown covariates not included in the models [22,23].
We adjusted for missing information in multivariable analyses, but we acknowledge that missing data might partially affect our findings. Furthermore we only assessed first-line local treatments, since information on successive treatments was largely unreliable.
The major and unexpected finding of our study was the lack of significant differences even in univariate analyses, where we expected survival differences at least as a consequence of indication bias. In fact, patients' baseline characteristics overlapped substantially among treatments, despite the careful selection recommended by the international guidelines [1,2]. Although this might be partly explained by the fact that our cohort predated the guidelines, an alternative explanation is that the choice of local treatment was driven by clinicians' preferences or availability of skills.
Although our results may appear surprising, they mirror some uncertainties in the literature. Two meta-analyses [5,6] analyzed the comparison of RFA versus PEI and found a significant survival improvement favouring the former, while a systematic review of the same trials concluded that the data do not provide enough evidence to support a survival benefit from RFA [24].
The five randomized trials that tested the two percutaneous treatments and were considered in the meta-analyses [5,6] were small and had a short follow-up. Interestingly, we did not find any difference between RFA and PEI even in the subgroups of patients (like those with larger tumor size) in which international guidelines claim that RFA should be more effective than PEI [2].
Surgery has been compared to percutaneous ablation in three small randomized trials [25][26][27]. Huang et al. [25] and Chen et al. [26] did not find significant benefits of surgery, while Huang et al. [27] found that surgical resection increased overall survival in patients who met the Milan criteria. A meta-analysis [4] that included only one randomized trial and several observational studies found a survival benefit of surgery versus ablative treatments in the subgroup of tumors >3 cm, but concluded that the level of evidence was low and that further RCTs were needed to define the relative value of surgery and RFA. Unfortunately, in our study the number of surgical resections, which were performed in only two large institutions, is small and the comparison is underpowered. However, we did not find any difference even in the subgroups of patients with single nodules or Child-Pugh A, that is, the best candidates for resection [2].
To our knowledge, TACE/TAE alone has never been compared with other locoregional treatments, since guidelines consider TACE/TAE as restricted to 'nonsurgical HCC that are also ineligible for percutaneous ablation' [2]. As expected, we found patients with a slightly worse profile in the TACE/TAE group, but in multivariable analysis we were unable to find any difference in long-term survival from RFA/PEI, either overall or in selected subgroups.
In conclusion, although our approach does not allow definitive statements, our results show that, in a real-world setting, uncertainties in the choice and in the outcome of local treatments of HCC are still present. Educational projects and population-based observational studies, supported by well-planned RCTs, are still needed to define the relative effectiveness of locoregional treatments.