Prospective study of biliary strictures to determine the predictors of malignancy

Can J Gastroenterol Vol 14 No 5 May 2000 397

Departments of Medicine, Public Health Services and Radiology, Walter Mackenzie Health Sciences Centre, University of Alberta, Edmonton, Alberta

Correspondence and reprints: Dr Vincent G Bain, University of Alberta, 2E1.14 Walter Mackenzie Health Sciences Centre, 8440 – 112 Street, Edmonton, Alberta T6G 2R7. Telephone 780-407-7238, fax 780-439-1922, e-mail vince.bain@ualberta.ca

Received for publication March 1, 1999. Accepted November 30, 1999

ORIGINAL ARTICLE

BACKGROUND: There have been few prospective studies regarding the investigation of biliary strictures, principally because of rapid technological change. The present study was designed to determine the sensitivity of various imaging studies for the detection of biliary strictures. Serum biochemistry and imaging studies were evaluated for their role in distinguishing benign from malignant strictures.

METHODS: Thirty-one patients with suspected noncalculus biliary obstruction were enrolled consecutively in the study. A complete biochemical profile, ultrasound, Disida scan and cholangiogram (endoscopic retrograde cholangiopancreatography [ERCP] or percutaneous cholangiogram) were obtained at study entry. Stricture etiology was determined based on cytology, biopsy and/or clinical follow-up at one year.

RESULTS: Twenty-nine of 31 patients had biliary strictures, of which 15 were malignant. The mean age of the malignant cohort was 73.9 years versus 53.9 years in the benign cohort (P<0.001). Statistically significant differences between the malignant and benign groups, respectively, were as follows: alanine transaminase 235.2 versus 66.9 U/L (P=0.004), aspartate transaminase 189.8 versus 84.5 U/L (P=0.011), alkaline phosphatase 840.2 versus 361.1 U/L (P=0.002), bilirubin 317.8 versus 22.1 µmol/L (P<0.001) and bile acids 242.5 versus 73.2 µmol/L (P=0.001). Threshold analysis using receiver operating characteristic (ROC) curves demonstrated that a bilirubin level of 75 µmol/L was most predictive of malignant strictures. Intrahepatic duct dilation was present in 93% of malignant strictures versus 36% of benign strictures (P=0.002). Common hepatic duct dilation was less discriminatory (malignant 13.5 versus benign 9.6 mm; P=0.11). Ultrasound was highly sensitive (93%) in the detection of the primary tumour in the bile duct or pancreas, or in the visualization of nodal or liver metastases. In benign disease, ultrasound failed to detect evidence of intrahepatic or extrahepatic biliary dilation in most cases. Disida scans were not able to distinguish between malignant and benign strictures and could not accurately localize the level of obstruction. The sensitivity of Disida scan for the diagnosis of obstruction was 50%. Cholangiographic characterization of strictures revealed a similar distribution of smooth (eight of 13) and irregular (five of 13) strictures in the malignant group. Ten of 13 benign strictures were characterized as smooth. Malignant strictures were significantly longer than benign ones (30.3 versus 9.2 mm; P=0.001). Threshold analysis using ROC curves showed that strictures greater than or equal to 14 mm were predictive of malignancy (sensitivity 78%, specificity 75%, log odds ratio 11.23).

CONCLUSIONS: A serum bilirubin level of 75 µmol/L or higher, or a stricture length of greater than 14 mm, was highly predictive of malignancy in patients with a biliary stricture. Ultrasound was useful in predicting malignant strictures either by detecting intrahepatic duct dilation or by visualizing the tumour (primary or metastases). Strictures with a 'benign' cholangiographic appearance are frequently malignant. Disida scan did not provide additional information. ERCP is necessary to diagnose benign strictures, which tend to be less extensive at presentation.

Key Words: Bile duct obstruction; Biliary tract neoplasms; Cholangiopancreatography; Cholestasis; Endoscopic retrograde cholangiopancreatography; Jaundice; Pancreatic neoplasms; Prospective studies

Biliary strictures are a challenging problem for the clinician. By the time that patients with biliary strictures are referred to a specialist, the diagnosis is usually already known or strongly suspected because clinical evaluation and noninvasive investigations alone have a high specificity and sensitivity (1)(2)(3)(4). The job of the medical or surgical specialist is not only to confirm the diagnosis of biliary stricture but also, importantly, to define the etiology and the exact anatomic location, which is vital to therapeutic planning. The differentiation between benign and malignant strictures can be difficult but is of obvious importance in regard to prognosis and optimal therapy.

Numerous imaging modalities are available for the investigation of biliary strictures, including abdominal ultrasound, computed tomographic (CT) scanning, nuclear imaging, percutaneous transhepatic cholangiography (PTC), endoscopic retrograde cholangiopancreatography (ERCP) and, most recently, magnetic resonance cholangiopancreatography (MRCP). Comparative and descriptive studies in this area are lacking, primarily because rapid technological improvements and developments outdate them. We, therefore, embarked on a prospective descriptive trial with the following aims:
• Determine the predictive value of liver enzymes, serum bilirubin, serum bile acids, ultrasound and diethyl-iminodiacetic acid (Disida) nuclear imaging for the presence of malignant biliary strictures.
• Measure the ability of ultrasound and nuclear imaging to localize the level of obstruction using direct cholangiography as the gold standard.
• Determine the sensitivity of abdominal ultrasound and Disida nuclear scanning for the detection of biliary strictures.
• Investigate the utility of various cholangiographic features to distinguish malignant from benign strictures.

PATIENTS AND METHODS
Patients: All patients with biliary strictures referred to the Division of Gastroenterology at the University of Alberta Hospitals for investigation between January 1, 1995 and December 31, 1995 were prospectively entered into the trial. The inclusion criteria were age 16 years or older and noncalculus biliary obstruction. Patients were excluded if subsequent evaluation did not show a stricture. Ethics committee approval was obtained.

Protocol: The following information was obtained:
• Clinical history and careful physical examination.
• Abdominal ultrasound examination with particular attention to intrahepatic biliary dilation, extrahepatic duct calibre, presence or absence of gallbladder and other relevant pathology such as tumour mass or ductal stones.
• Disida scan. Patients were examined after a 4 h fast. Opiates were withheld for the preceding 24 h. In addition to the standard scan, data were collected for deconvolutional analysis to determine hepatic extraction fraction and time activity curve so that the half-life of biliary excretion and time to peak activity could be analyzed.
• Cholangiography. ERCP was attempted first in all patients, with failures proceeding to PTC. Cefazolin 1 g was administered intravenously 30 to 60 min before cholangiography. The biliary system was filled as completely as possible using 50% Conray 60 (Mallinckrodt, St Louis, Missouri) contrast injected under low pressure. The information obtained from each cholangiogram included site of stricture, multiplicity, character (smoothly tapered versus irregular or shouldered), stricture length, minimal stricture width, maximal proximal biliary dilation and other information (ampullary mass, primary sclerosing cholangitis, cancer of the pancreas).
All data were to be collected within five working days so that the different imaging modalities tested would be comparable. All imaging studies were interpreted by radiologists blinded to the results of the patients' other diagnostic studies. The ERCP data were obtained last so that a biliary stent could be inserted if indicated. The cholangiographic measurements were confirmed by two independent observers. Stricture etiology was defined by cytology or biopsy histology or by clinical outcome after one year.

Statistical analysis: Statistical analysis was performed using SPSS. Between-groups differences in mean values of continuous variables were tested by independent samples t tests or by nonparametric Mann-Whitney rank sum tests when the distributions were not normal. The differences in frequencies of categorical variables were tested by the χ² test with Yates' correction for continuity or by Fisher's exact test when the expected number of observations per cell was less than five. Associations between continuous variables were assessed by the Pearson correlation coefficient. Logistic regression analysis was used to analyze the association of the dichotomous outcome variable (malignant versus benign) with continuous and categorical predictor variables. The statistical inferences were based on the level of significance P<0.05.

Receiver operating characteristic (ROC) curves were constructed for the biochemical variables. To determine optimal threshold levels for each diagnostic parameter, ROC plots were constructed using the observed true and false positive rates at each potential threshold level. A best fit ROC curve was constructed according to methods published elsewhere (5,6). The threshold value providing the best compromise between true and false positive rates was estimated from the ROC plot. Likelihood ratios were calculated from the fitted ROC curve.
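The threshold analysis described above can be illustrated with a minimal Python sketch. It is not the fitted-curve method of references 5 and 6: it uses the empirical (unfitted) ROC points, and it assumes Youden's index (true positive rate minus false positive rate) as the "best compromise" criterion, which the text does not specify. The bilirubin values are hypothetical and are not study data; the function names are illustrative only.

```python
# Hypothetical serum bilirubin values (µmol/L); NOT data from this study.
malignant = [310, 95, 420, 80, 250, 130, 88, 500]
benign = [18, 25, 40, 110, 70, 12, 30, 22]

values = malignant + benign
is_malignant = [True] * len(malignant) + [False] * len(benign)


def roc_points(values, is_malignant):
    """Return (threshold, TPR, FPR) for the rule 'value >= threshold',
    evaluated at every observed value."""
    positives = sum(is_malignant)
    negatives = len(is_malignant) - positives
    points = []
    for threshold in sorted(set(values)):
        tp = sum(1 for v, m in zip(values, is_malignant) if m and v >= threshold)
        fp = sum(1 for v, m in zip(values, is_malignant) if not m and v >= threshold)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def best_threshold(points):
    """Pick the point maximizing Youden's index (TPR - FPR).
    Assumed criterion; the study estimated the threshold from a fitted curve."""
    return max(points, key=lambda p: p[1] - p[2])


def positive_likelihood_ratio(tpr, fpr):
    """Likelihood ratio of a positive result: TPR / FPR."""
    return float("inf") if fpr == 0 else tpr / fpr


points = roc_points(values, is_malignant)
best = best_threshold(points)
print("threshold:", best[0], "TPR:", best[1], "FPR:", best[2])
print("positive likelihood ratio:", positive_likelihood_ratio(best[1], best[2]))
```

With real data, a fitted ROC curve would smooth the step-like empirical points, so the threshold and likelihood ratio obtained may differ from those of this empirical sketch.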

RESULTS
Thirty-one patients were enrolled in the study. Two patients were excluded: one because he did not have a stricture and one whose suspected stricture was unevaluable because of previous biliary bypass and contraindication for PTC as a result of coagulopathy. Of the remaining 29 patients, 15 were diagnosed with malignant strictures and 14 with benign strictures. Two patients had primary sclerosing cholangitis, both of whom had multiple strictures. Patient demographics and underlying diagnosis are shown in Table 1 (sex, age and diagnosis). The mean age of the malignant cohort was 73.9 years versus 53.9 years for the benign cohort (P<0.001). Fifty per cent of the strictures were malignant in females versus 53% in males (not significant).
Biochemistry: Mean serum values for alanine transaminase (ALT), aspartate transaminase (AST), alkaline phosphatase (AP), bilirubin and bile acids (BA) were significantly higher in the malignant stricture group than in the benign stricture group (Figure 1). The most striking difference between the two groups was in the serum bilirubin levels: 317.8±43 µmol/L (SEM) versus 22.1±5 µmol/L (P<0.001) for the malignant versus the benign group. To examine the clinical utility of these biochemical differences, ROC curves were constructed for each variable. Threshold values and likelihood ratios are provided in Table 2.

Ultrasound: Intrahepatic duct dilation, defined by a visible lumen within the intrahepatic ducts, was detected in 19 of the 29 strictures (66%). This finding was observed in 93% of malignant strictures versus 36% of benign strictures (P=0.002, Fisher's exact test). Common hepatic duct dilation tended to be greater in malignant than in benign strictures (13.5 versus 9.6 mm; P=0.11, two-tailed t test). Where duct dilation was found, the stricture location could be determined on the ultrasound. Additional helpful ultrasonographic findings were frequently found in the malignant subgroup and included a pancreatic mass in nine of 15, of whom three patients also showed evidence of metastases to the liver or regional lymph nodes. Four patients were shown to have a biliary ductal mass, of whom two had evidence of nodal metastases. In one patient, only liver metastases were demonstrated. Only one patient with a malignant stricture had none of the above abnormalities. Ultrasound was, therefore, highly sensitive (93%) in the detection of malignant obstruction by visualizing the actual mass or metastases. Conversely, in benign disease, ultrasound was insensitive in diagnosing biliary obstruction, with more than half of the cases lacking both intrahepatic and extrahepatic biliary dilation.

Disida scan: Iminodiacetic acid imaging of the hepatobiliary tree was obtained in 26 of the 29 patients. Three patients were not studied due to the need for urgent biliary decompression. All 12 Disida scans obtained in patients with malignant strictures were abnormal; two showed hepatocellular dysfunction alone (poor uptake of radionuclide from blood) and 10 showed evidence of cholestasis or obstruction. Of the 14 Disida scans from patients with benign strictures, seven were normal or minimally abnormal (not diagnostic), six showed evidence of cholestasis or obstruction and one showed hepatocellular dysfunction. The overall sensitivity of Disida for the diagnosis of obstruction was 50%.
The differentiation between intrahepatic cholestasis and extrahepatic obstruction was generally not possible, except in cases with partial filling of a dilated proximal duct or where regional delay of hepatic excretion suggested partial obstruction of the right or left hepatic duct. The specificity of Disida for the diagnosis of bile duct obstruction is, therefore, poor but could not be calculated from this study. Deconvolutional analysis for calculation of the hepatic extraction fraction (a measure of hepatocellular function) and biliary excretion half-life yielded inconsistent results and failed to improve the diagnostic accuracy of Disida scanning (data not shown).

Cholangiography: A cholangiogram was obtained in all 29 patients; 25 had successful ERCPs, and four had PTCs due to surgically altered anatomy or failed ERCP. Stricture location and subclassification according to benign or malignant cause are summarized in Table 3. Strictures involving the common bile duct were more likely to be malignant than those of the papilla or intrahepatic ducts. A classic double duct sign (abrupt cutoff of both the pancreatic and bile ducts) was seen in only three of 10 patients with cancer of the pancreas who had their cholangiogram by ERCP. Of the remaining patients, only a single patient had a double duct sign due to benign strictures involving both ducts.
There was a significant difference in stricture length between the malignant and benign groups (30.3 versus 9.2 mm; P=0.001). Threshold analysis using ROC curves showed that a stricture length 14 mm or greater was predictive of malignancy (sensitivity 98%, specificity 77%). Cholangiographic characterization of each stricture as smooth ('radiologically benign') or irregular ('radiologically malignant') was possible in 26 cases. There was a similar number of smooth (eight of 13) and irregular (five of 13) type strictures in the malignant group; 10 of 13 benign strictures were smooth. Stricture width was difficult to measure accurately and did not discriminate malignant from benign strictures. Furthermore, a ratio of stricture width to proximal dilated duct width (as a cholangiographic measure of obstruction) was not helpful (data not shown).
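As an illustration of the categorical comparison reported under Ultrasound (intrahepatic duct dilation in 93% of malignant versus 36% of benign strictures), the following Python sketch computes a two-sided Fisher's exact test from first principles. The cell counts (14 of 15 malignant, five of 14 benign) are inferred from the reported percentages and the stated total of 19 dilated cases; they are an assumption, not tabulated study data. The two-sided definition used here (summing all tables no more probable than the observed one) is one common convention and may differ from the one SPSS applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1 = a + b              # first row total
    col1 = a + c              # first column total
    n = a + b + c + d         # grand total
    total = comb(n, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / total

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating point ties.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: malignant, benign. Columns: dilation present, dilation absent.
# Counts inferred from the reported percentages (assumption).
p_value = fisher_exact_two_sided(14, 1, 5, 9)
print(f"two-sided P = {p_value:.4f}")
```

Under these assumed counts the result rounds to the reported P=0.002.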

DISCUSSION
We have reported a prospective study of all patients with newly diagnosed biliary strictures from a single institution over 12 months. Although statistically significant differences were found in most biochemical parameters, ALT and bilirubin best discriminated between malignant and benign strictures. In this series, a serum bilirubin level of 75 µmol/L or greater was highly predictive of a malignant etiology for the stricture (Table 2). This suggests that a more critical narrowing exists in the malignant strictures. Furthermore, the stricture length on cholangiography was significantly greater in malignant versus benign strictures (30.3 versus 9.2 mm, P=0.001), suggesting that malignant strictures are more extensive at presentation.
There were important differences in the ultrasound characteristics of patients with malignant versus benign strictures. Malignant strictures were more likely than benign ones to induce intrahepatic duct dilation (93% versus 36%, P=0.002). The degree of obstruction was not assessed in prior ultrasound series as a discriminator between malignant and benign etiologies (7,8). Furthermore, sonographic evidence suggestive of malignancy (visualization of primary tumour or metastases) was present in 14 of 15 patients. The abdominal ultrasound, therefore, plays an important role both in the estimation of stricture severity as judged by the presence or absence of intrahepatic duct dilation and in the assessment of malignant etiology because tumour can usually be directly visualized. It is relatively insensitive, however, for the diagnosis of benign strictures, many of which are associated with only low grade obstruction.

We did not assess the role of CT scanning in this study because ultrasound performs as well and is less expensive (3,4). Helical CT may have a future role, but further study is required (9).

Disida nuclear scans are frequently performed in the investigation of a wide variety of biliary tract diseases. In this series, they were insensitive for the diagnosis of benign strictures, especially those with low grade obstruction. High grade obstruction was readily detected but the anatomic site could not be localized in most cases. High grade extrahepatic biliary obstruction appears to induce a functional intrahepatic cholestasis, which was the most common finding in this series. This is rapidly reversed with endoscopic stenting of the stricture. Brown and colleagues (10) reported the use of deconvolutional analysis of Disida time-activity curves to distinguish between primarily hepatocellular and primarily biliary tract disease. The former group had a reduced hepatic extraction fraction, whereas it was preserved in the latter. Both groups had impaired biliary excretion. Further refinements in this technique may be of clinical use, but we could not define a useful clinical role.
The performance of a cholangiogram remains the gold standard for stricture diagnosis. This is particularly true for benign strictures because 64% lacked hepatic duct dilation and, therefore, required cholangiographic diagnosis of the stricture that had been suspected on the basis of clinical findings and serum biochemistry. These findings are in agreement with those of a study reported from Finland (11). Despite a high sensitivity for the diagnosis of strictures, the cholangiogram was a poor discriminator between malignant and benign strictures. Benign strictures were usually of a smooth character (77%); however, malignant ones were as well (62%), and three strictures in the series could not be characterized due to nonvisualization as a result of their very narrow calibre.
A cholangiogram is an essential component of the investigation of any stricture because it provides anatomic details such as stricture location and calibre. This will be challenged by the advent of MRCP, which promises to provide noninvasively acquired images of the biliary tree. Recent publications suggest that its diagnostic accuracy approaches that of ERCP (12,13). ERCP also provides the opportunity for therapeutic intervention for biliary strictures. MRCP was not formally evaluated in this protocol because our institution was not performing it at the time that this study was initiated. Endoscopic ultrasound is another promising technology that can provide information not only about the nature and location of the stricture, but also in regard to regional spread (14). It will be limited by the special expertise required for its application and is unlikely to be widely available in the near future.
This prospective study of biliary strictures shows that serum bilirubin is an important discriminator between malignant and benign strictures and that ultrasound is particularly useful in the diagnosis of malignant strictures, but often misses benign strictures. ERCP enables accurate anatomic diagnosis and provides the opportunity for etiological diagnosis with cytology brushings and for stenting to relieve obstruction in selected cases. Disida scanning added little additional information.
MRCP and endoscopic ultrasound will assume an increasingly important diagnostic role in the future. It is of critical importance that we continue to evaluate these new technologies prospectively. Only then will they be most efficiently applied for patients with biliary obstruction.

TABLE 2
Threshold values and likelihood ratios for malignancy
Columns: Parameter; Estimated threshold; Threshold TPR; Threshold FPR; Likelihood ratio
FPR False positive rate; TPR True positive rate
Figure 1) Comparison of serum biochemistry from patients with malignant versus benign strictures. Alanine transaminase (ALT), aspartate transaminase (AST) and alkaline phosphatase (AP) are expressed as U/L, and bilirubin (BR) and bile acids (BA) as µmol/L. Each of these values was significantly higher in the malignant than in the benign group: ALT 235.2 versus 66.9, P=0.004; AST 189.8 versus 84.5, P=0.011; AP 840.2 versus 361.1, P=0.002; BR 317.8 versus 22.1, P<0.001; BA 242.5 versus 73.2, P=0.001