Feasibility of Comparing the Results of Pancreatic Resections between Surgeons: A Systematic Review and Meta-Analysis of Pancreatic Resections

Background. Indicators of operative outcomes could be used to identify underperforming surgeons for support and training. The feasibility of identifying HPB surgeons with poor operative performance (“outliers”) based on the results of pancreatic resections is not known. Methods. A systematic review of Medline, Embase, and the Cochrane library was performed to identify studies on pancreatic resection including at least 100 patients and published between 2004 and 2014. Proportions that lay outside the upper 95% and 99.8% confidence limits based on the results of the systematic review were considered as “outliers.” Results. In total, 30 studies reporting on 10712 patients were eligible for inclusion in this review. The average short-term mortality after pancreatic resections was 3.1% and the proportion of patients with procedure-related complications was 47.0%. None of the classification systems assessed the long-term impact of the complications on patients. The surgeon-specific mortality should be 5 times the average mortality before he or she can be identified as an outlier with 0.1% false positive rate if he or she performs 50 surgeries a year. Conclusions. A valid risk prognostic model and a classification system of surgical complications are necessary before meaningful comparisons of the operative performance between pancreatic surgeons can be made.


Background
Indicators of operative outcomes could be used to identify underperforming surgeons for support and training. Pancreatic resection is one of the most common major operative procedures performed by Hepato-Pancreato-Biliary (HPB) surgeons. As the procedure is complex with a high associated morbidity and mortality, it may be suitable for comparing the operative performances of HPB surgeons. The major indication for pancreatic resection is pancreatic cancer, the seventh most common cause of cancer-related mortality in the world, resulting in approximately 330 000 deaths worldwide annually [1]. Pancreatic cancer is a biologically aggressive cancer, which is relatively resistant to chemotherapy and radiotherapy and has a high rate of local and systemic recurrence [2][3][4]. In early pancreatic cancer (with no invasion of adjacent structures such as the superior mesenteric vein, portal vein, or superior mesenteric artery and no distant metastases), surgical resection is generally considered the only treatment with the potential for long-term survival and possibility of cure in people likely to withstand major surgery. The overall five-year survival after radical resection ranges from 7% to 25% [4][5][6][7][8][9], with a median survival of 11 to 15 months [10]. With adjuvant chemotherapy, median survival after radical resection is increased and ranges between 14 and 24 months [11]. However, it should be noted that about half of the patients presenting with pancreatic cancer will have metastatic disease and one-third have locally advanced unresectable disease, leaving about 10% to 20% suitable for resection [12].

HPB Surgery
Another major indication for pancreatic resection is chronic pancreatitis, a condition associated with long-standing and progressive inflammation of the pancreas resulting in destruction and replacement of pancreatic tissue with fibrous tissue leading to exocrine pancreatic insufficiency and endocrine pancreatic insufficiency (diabetes) [13]. The annual incidence of chronic pancreatitis ranges from 1.5 to 7.9 per population of 100,000 [14][15][16][17][18]. The prevalence of chronic pancreatitis ranges from 17 to 49 per population of 100,000 [15,16,18]. The annual mortality rate attributable to chronic pancreatitis is around 1 to 4 per million people [15,17]. There is no consensus among experts for selecting patients for surgical management but pancreatic pain and other local complications are the major indications for surgical treatment [19]. Other indications for pancreatic resection include ampullary cancers, distal common bile duct cancers, duodenal cancers, intraductal papillary mucinous neoplasm, and neuroendocrine tumours [20][21][22][23].
Pancreatic resection is in the form of pancreaticoduodenectomy for cancers of the head of the pancreas, ampullary cancers, distal common bile duct cancers, and duodenal cancers and distal pancreatectomy for cancers of the body and tail of the pancreas [24]. Pancreaticoduodenectomy involves excision of the head of the pancreas and duodenum. The two major types are the classical Whipple operation and the pylorus preserving pancreatoduodenectomy [25]. Surgical excision for chronic pancreatitis can be performed by pancreaticoduodenectomy (standard Whipple's operation or pylorus preserving pancreaticoduodenectomy) or by duodenum preserving pancreatic head resection [19,26]. Duodenum preserving pancreatic head resection involves resection of the pancreatic head without excision of duodenum. The two major types are Beger's operation and Frey's procedure [26]. The latter involves a drainage procedure anastomosing the duct in the pancreatic remnant to the jejunum by longitudinal pancreatojejunostomy in addition to pancreatic head excision leaving behind a cuff of pancreas on the duodenal wall [26].
In general, pancreaticoduodenectomy is performed by open surgery, although laparoscopic pancreaticoduodenectomy has been reported [27]. Laparoscopic pancreatic resection is more common for distal pancreatectomy [28]. After resection of the body and tail of the pancreas, the cut surface of the pancreatic remnant (pancreatic stump) is closed using staples or sutures [29].
In England, individual surgeons' results for surgery-related complications are being published as part of the drive by NHS England to improve transparency [30,31] and to allow patients to make informed decisions. This allows patients to identify outliers (a consultant whose clinical outcomes data lies outside the expected range given the national average) [31] and make an informed decision as to whether they would like to be treated by that particular surgeon. The feasibility of identifying HPB surgeons with poor operative performance based on the results of pancreatic resections is not known, but pancreatic resection may be the procedure best suited for comparison as it is common and complex and is generally associated with a high morbidity and mortality. The main objectives of this research are to conduct a systematic review of the recent results of pancreatic resections so that it is possible to establish a benchmark for surgeons' performance based on international standards and to assess the feasibility of comparing the results of pancreatic resections between surgeons based on the results of the systematic review.

Selection of Studies.
All studies that reported on pancreatic resections were included, irrespective of the type of resection (pancreaticoduodenectomy or distal pancreatectomy), the reason for the pancreatic resection (cancer or benign disease), the type of access (open or laparoscopic), the type of anastomosis (pancreaticogastrostomy or pancreatojejunostomy), and the postoperative care provided to the patients. To ensure that only recent results on a reasonable number of patients were analysed, studies were included only if they included at least 100 patients, were published as full texts or conference abstracts in the 10 years before the search date (February 2014), and reported one or more of the primary outcomes (30-day or in-hospital mortality) or secondary outcomes (12-month mortality, proportion of people with complications, number of complications, the classification system used to report the complications, operating time, and length of hospital stay). Studies were identified by searching Medline, Embase, and the Cochrane library using the Medical Subject Headings (MeSH) search terms "pancreatectomy," "pancreaticoduodenectomy," and "pancreaticojejunostomy." Equivalent free text search terms were used and equivalent search strategies were used in other databases. The search strategies are available in the Appendix. No language restrictions were applied.
Two authors (Clare Toon and Bhavisha Virendrakumar) independently screened titles and abstracts. Full texts were obtained for references that at least one author identified as potentially meeting the inclusion criteria. Further selection was made independently by two authors (Clare Toon and Bhavisha Virendrakumar) by reviewing the full texts. All differences were resolved by discussion and arbitration by another author (Kurinchi Gurusamy).

Data Collection.
Data on patient characteristics including the demographic details, case-mix (risk prognostic models or score to take into account the different anaesthetic and surgical risks in patients), and outcomes were extracted by two authors (Clare Toon and Bhavisha Virendrakumar) independently. Foreign language articles were translated to English before data extraction. When significant overlap of patients between two or more reports was identified based on the authors, centres, and the time period, the report that contained maximum information with regard to the outcomes was included for the analysis. All differences in data extraction were resolved by discussion and arbitration by another author (Kurinchi Gurusamy).

Meta-Analysis.
Meta-analysis was performed using StatsDirect statistical software using a random-effects model. The summary estimates with 95% confidence intervals (CI) have been reported. Heterogeneity was assessed by Higgins' I-square (I²) [32] and the chi-square test for heterogeneity. Although heterogeneity was to be explored by various subgroup analyses, including the reason for pancreatic resection (cancer versus other causes), type of resection (pancreaticoduodenectomy versus distal pancreatic resection), and method of access (laparoscopic versus open access), the data available from the studies were insufficient to allow meaningful subgroup analyses. Publication bias was assessed by funnel plot and Egger's regression test [33].
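The pooling and heterogeneity calculations described above can be illustrated with a minimal sketch. It assumes a DerSimonian-Laird random-effects estimator applied to arcsine square-root transformed proportions; the paper states only that StatsDirect was used with a random-effects model, so the transformation, the function name `pool_proportions`, and the example counts are illustrative assumptions and not the authors' actual computation.

```python
import math


def pool_proportions(studies):
    """Pool (events, total) pairs with a DerSimonian-Laird random-effects model.

    Assumption: proportions are stabilised with the arcsine square-root
    transform (variance 1 / (4n)); this is an illustrative choice only.
    Returns (pooled_proportion, ci_low, ci_high, i_squared).
    """
    y = [math.asin(math.sqrt(e / n)) for e, n in studies]
    v = [1.0 / (4.0 * n) for _, n in studies]
    w = [1.0 / vi for vi in v]

    # Fixed-effect mean and Cochran's Q (the chi-square heterogeneity statistic).
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # I-square: share of total variation attributed to between-study heterogeneity.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Between-study variance (tau^2) by the method of moments.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate, and 95% CI on the transformed scale.
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    low, high = pooled - 1.96 * se, pooled + 1.96 * se

    def back(t):
        # Clamp to the valid range of the transform, then back-transform.
        t = min(max(t, 0.0), math.pi / 2)
        return math.sin(t) ** 2

    return back(pooled), back(low), back(high), i_squared


# Hypothetical example counts (not data from the review).
example = [(3, 120), (10, 250), (2, 100), (12, 400)]
pooled, low, high, i2 = pool_proportions(example)
print(f"pooled = {pooled:.3f} (95% CI {low:.3f} to {high:.3f}), I^2 = {i2:.1%}")
```

A different transformation (for example, logit or double arcsine) would give somewhat different summary estimates, which is why the sketch should not be expected to reproduce the figures reported in this review.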

Assessment of Feasibility of Comparing the Operative Performance.
The short-term mortality and complications which would have been attributable to an individual surgeon for a hypothetical cohort of people undergoing pancreatic resection were calculated based on the summary estimate of the meta-analysis, the lower quartile, and the upper quartile of the proportions observed for these outcomes in the systematic review, thus extrapolating the results of the meta-analysis to an average surgeon. The 95% and 99.8% confidence intervals of these outcomes were calculated using the Wilson score method with continuity correction [34] for sample sizes of 50, 100, and 200 (approximately 1 pancreatic resection a week, 2 pancreatic resections a week, and 4 pancreatic resections a week). Proportions that lay outside the upper 95% and 99.8% confidence limits were considered as outliers with a one-sided false positive rate of 2.5% and 0.1%, respectively. The 95% and 99.8% confidence limits are approximately equivalent to the surgeon having results which are different by two standard deviations and three standard deviations from the average results expected based on the data ("the benchmark"). A one-sided false positive rate was calculated since the upper limit of the confidence interval was the main interest of the study; that is, if surgeon-specific mortality and complications were lower than the lower confidence limits, it was not of any interest since these surgeons are better than other surgeons and there are no concerns about their operative performance.
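The outlier thresholds described above can be sketched as follows. The sketch assumes the standard closed form of the upper Wilson score limit with continuity correction and treats the benchmark proportion as the observed proportion for a cohort of the given size; the function name `wilson_upper_cc` and the way the benchmark is plugged in are assumptions, so the output is not guaranteed to match Table 3 exactly.

```python
import math
from statistics import NormalDist


def wilson_upper_cc(p, n, level):
    """Upper limit of the two-sided Wilson score interval with continuity
    correction for a proportion p in a cohort of n operations.

    `level` is the two-sided confidence level (0.95 or 0.998 here).
    """
    if not 0.0 <= p <= 1.0 or n <= 0:
        raise ValueError("p must lie in [0, 1] and n must be positive")
    if p == 1.0:
        return 1.0
    # Two-sided level -> standard normal quantile (about 1.96 and 3.09).
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    q = 1.0 - p
    root = math.sqrt(z * z + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0))
    upper = (2.0 * n * p + z * z + 1.0 + z * root) / (2.0 * (n + z * z))
    return min(1.0, upper)


# Benchmarks taken from the pooled estimates reported in this review.
benchmarks = {"short-term mortality": 0.031, "complications": 0.47}

for name, p in benchmarks.items():
    for n in (50, 100, 200):
        for level in (0.95, 0.998):
            limit = wilson_upper_cc(p, n, level)
            print(f"{name}, n={n}, {level:.1%} limit: "
                  f"{limit:.3f} ({limit / p:.1f} x benchmark)")
```

The printed multiple of the benchmark shows why a rare outcome such as mortality needs a large relative excess before a surgeon crosses the limit, whereas a common outcome such as complications needs only a modest one.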

Search Results.
A total of 7193 references were identified by database search. After removing duplicate citations, a total of 6268 unique references were identified. Full text was sought for 41 references [20][21][22][23]. A total of 6 full texts were excluded (4 studies had fewer than 100 patients in total [66][67][68][69]; one reference was a comment on an excluded article [70]; and one study did not contain any outcomes included in this review [71]). Five references were duplicate reports of other studies or contained a significant proportion of patients included in other reports [61][62][63][64][65]. Data from these reports were not included in the analysis to avoid the same patients being counted multiple times. In total, 30 studies reporting on 10712 patients were eligible for inclusion in this review [20][21][22][23]. The reference flow is shown in Figure 1.

Characteristics of Included Studies.
The number of patients included in each study, the number and proportion of patients with malignancy, the mean age of patients, the number and proportion of female patients, and different groups within the cohort as reported by the study authors have been tabulated in Table 1. The number of patients included in each study varied from 100 to 2610 patients. Three studies included only patients with chronic pancreatitis [36,38,53]. One study included only patients with malignancy [37]. The remaining studies included various proportions of patients with malignancy. The mean age of patients reported in the studies ranged from 42 years to 68 years. Case-mix was assessed using surgical Apgar score (SAS) in one study [39]. None of the remaining studies reported any adjustment for case-mix. The surgical details of patients in terms of the surgeries and the surgical access included in the studies have been tabulated in Table 2. Only three studies included patients undergoing laparoscopic pancreatic resection [20,22,41].
Depending upon the proportion of short-term mortality used (the meta-analysis summary estimate, lower quartile, or upper quartile), sample size (50, 100, or 200), and the false positive rate (2.5% versus 0.1% for a surgeon to be wrongly identified as an outlier), a surgeon will be called an outlier only when the surgeon-specific mortality is several times the average mortality (Table 3). For example, the surgeon-specific mortality should be more than 5 times the average mortality before he or she can be identified as an outlier with 0.1% false positive rate (i.e., results lie outside three standard deviations of the average results expected from a surgeon) if he or she performs 50 surgeries a year.

Complications.
Complications were reported variably in different studies. Five studies reported complications using the Clavien-Dindo method [72,73] of classification of complications [20,37,39,48,60]. One study used the Accordion severity grading system [74] of classification of complications [22]. Two studies used "common terminology criteria for adverse events" system [75] of classification of complications [50,57]. The remaining studies did not use any specific system of classification of complications. The proportion of people with complications was reported in 23 studies including 6712 patients [20, 21, 23, 35-41, 43, 45-50, 52, 54-57, 60]. The proportion of people with complications ranged between 3.3% and 100.0% (lower quartile = 38.3%; upper quartile = 53.4%). The proportions of people with complications in individual studies are shown in Figure 4. The average proportion of people with complications was 47.0% (95% CI 36.0% to 59.0%; I² = 98.9%). There was significant publication bias as denoted by Egger's regression test (P = 0.0037) with the funnel plot suggesting that studies with lower complication proportions were more likely to be published.
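Egger's regression test, used above to assess publication bias, regresses each study's standardised effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The following is a minimal sketch under stated assumptions: the function name `egger_intercept` and the example values are hypothetical, ordinary least squares is used, and the P value (which requires a t distribution with k - 2 degrees of freedom) is deliberately left out because the Python standard library does not provide it.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, t_statistic) for Egger's regression.

    Regresses effect / SE on 1 / SE by ordinary least squares. The t statistic
    would be compared with a t distribution on k - 2 degrees of freedom.
    """
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardised effect

    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the standard error of the intercept.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (k - 2)
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_stat


# Hypothetical transformed effects and standard errors (not data from the review).
effects = [0.62, 0.71, 0.80, 0.55, 0.93]
std_errors = [0.03, 0.05, 0.08, 0.02, 0.10]
intercept, t_stat = egger_intercept(effects, std_errors)
print(f"Egger intercept = {intercept:.2f}, t = {t_stat:.2f}")
```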
With regard to comparing the performance of surgeons, a surgeon will be identified as an outlier with 0.1% false positive rate when the proportion of patients who develop complications following surgery by him or her is 1.4 times that of the average even if he or she performs 50 surgeries a year as shown in Table 3.

Discussion
In this systematic review and meta-analysis, the recent results of pancreatic resections have been reviewed. Despite significant advances in anaesthetic and surgical techniques in recent years, pancreatic resection remains a major surgery with significant risk of complications and mortality. The average 30-day or in-hospital mortality after pancreatic resection was approximately 3% (Figure 2) and approximately 47% of patients undergoing pancreatic resection develop one or more complications (Figure 4). However, there was significant variation in the mortality and the complications as evidenced by the I² values which demonstrated substantial statistical heterogeneity. The average 12-month mortality was 2.2% (Figure 3) which was less than the average 30-day mortality of 3.1%. This is not clinically possible but was observed in this systematic review because of different studies being included for the different time points. This is further evidence of heterogeneity in mortality between the studies. One possible reason for this observed heterogeneity in the results is the inclusion of different types of surgeries in different studies. Another possible reason is that the patients included in the different studies had different comorbidities and there was variation in the technical difficulty of surgery ("case-mix"). Use of prognostic models is one of the commonly used methods to adjust for case-mix. A number of prognostic models are available for risk adjustment in pancreatic resections [76], although the accuracy of these models has not been assessed systematically. Only one of the studies included in this review considered a risk prognostic model to adjust for case-mix [39]. While the authors used surgical Apgar score as the risk prognostic model, it was not reliable in this study [39].
Prognostic models to adjust for case-mix are essential for comparative audit between specialists to ensure that surgeons are not penalised for agreeing to operate on high-risk patients where there is evidence of potential patient benefit. In addition, adequate adjustment for the case-mix is necessary to allow indirect comparison of results obtained in different studies. Thus, a reliable method of adjustment for case-mix (risk prognostic model) is necessary for pancreatic resections.
With regard to complications, in addition to the types of pancreatic resections and case-mix contributing to the heterogeneity in the estimates obtained in the different studies, another reason for heterogeneity is the different methods of classifying complications.
While the mortality rate of 3% is a high perioperative mortality rate, it does not allow comparison of the surgical performance as the surgeon-specific mortality has to be more than 5 times the average mortality before the surgeon is identified as an outlier with 0.1% false positive rate if he or she performs 50 pancreatic resections a year (Table 3). Fifty pancreatic resections per year equates to an average of one resection a week and few surgeons are likely to perform more than this. Thus, using short-term mortality does not appear to be a sensitive way of comparing the performance of HPB or pancreatic surgeons. If, on the other hand, complication rates were used to compare surgeons, an outlier will be identified if the proportion of patients who develop complications following surgery by him or her is only 1.4 times the average complication rates (with 0.1% false positive rate if he or she performs 50 pancreatic resections a year). An evaluation of complication rates following pancreatic resection would therefore allow the comparison of operative performance of surgeons with a reasonable sensitivity. However, the major problem with using complications as the benchmark for assessing surgeons is that they will also depend on the case-mix of the patient cohort. In addition, none of the current classification systems for complications adequately distinguish between complications that result in permanent disability as opposed to those that do not result in permanent disability. While some of these systems include reinterventions and requirement for organ support while classifying complications [72][73][74], the cost implications of these individual complications to the healthcare funder are not clear. Thus, the existing systems of classification of complications which have been applied to major pancreatic surgery do not appear to be patient outcome oriented or funder oriented and cannot therefore be used as benchmark for assessing surgical performance.
Health-related quality of life (HRQoL) measured using a validated quality of life scale may be a suitable way of comparing the long-term outcomes of surgeons but is not sensitive enough to capture the severity of the early postoperative complications. This is because the HRQoL is usually impaired immediately after major surgery and hence those developing major complications shortly after surgery may not have a significant change from the baseline (observed in people without complications) because of the low baseline values. In addition, measurement of long-term HRQoL may necessitate additional follow-up for patients resulting in additional resource utilisation and costs. The likelihood of missing data will increase if long-term follow-up is necessary to assess the outcomes. Current methods which have been suggested for identifying surgeons with poor operative performance are likely to miss a significant proportion of underperforming pancreatic surgeons. The results of this review are applicable to other surgeries that have similar or lower mortality such as liver resections and colorectal surgeries. A valid risk prognostic model and a classification system of surgical complications (which captures long-term disability to patients and the cost implications to the funder) are necessary before meaningful comparisons of the operative performance between pancreatic surgeons can be made.