Outcomes of pediatric laparoscopic fundoplication: A critical review of the literature

Division of Pediatric General and Thoracic Surgery, The Montreal Children’s Hospital; McGill University Health Centre, Montreal, Quebec
Correspondence: Dr Sherif Emil, Division of Pediatric General and Thoracic Surgery, Montreal Children’s Hospital, 2300 Tupper Street, Room C-818, Montreal, Quebec H3H 1P3. Telephone 514-412-4497, fax 514-412-4289, e-mail sherif.emil@mcgill.ca
Received for publication July 16, 2013. Accepted September 19, 2013

The treatment of gastroesophageal reflux disease (GERD) in infants and children presents an ongoing challenge to physicians and surgeons. A step-up approach is recommended, beginning with conservative therapies and progressing to acid suppression if symptoms persist (1,2). Laparoscopic fundoplication (LF) is considered when medical treatments have failed (1,2). However, the definition of failure is largely subjective and clinician dependent. Currently, there are no studies comparing medical therapies with LF for the treatment of pediatric GERD. While complications of LF are well documented, the indications are often vague and objective documentation of refractory reflux is often missing (2-4). Despite this, LF remains one of the most common operations performed by pediatric surgeons in the United States.
Published reviews of LF have simply analyzed the data from individual studies without assessment of their quality (5-7). Position papers and treatment guidelines have been formulated by national and international surgical organizations based on the same data, typically without comment on the level of the evidence (6,8). To fill this gap, we conducted a critical review of all articles published from 1996 to 2010 that reported the medium- to long-term outcomes of LF. Our primary goal was to assess the quality of the literature. Our secondary goal was to evaluate reported outcomes.

A single investigator reviewed the titles and abstracts of all retrieved citations to eliminate those that clearly did not meet inclusion and exclusion criteria. Two independent reviewers reviewed the remaining articles in their entirety for final selection based on the standardized eligibility criteria. Any disagreements in the selection process were resolved through discussion between reviewers. The references of the selected articles were manually searched to identify any additional relevant articles.
The articles were reviewed in detail for quality assessment and data extraction. Data were then pooled and discrepancies dealt with through re-review until consensus was reached. Article quality was determined by study design, a standardized quality assessment form based on the Newcastle-Ottawa quality assessment scale and the Cochrane risk-of-bias assessment tool. Data were extracted using a standardized data extraction form. Extracted data included baseline demographics, diagnostic criteria, study intervention, follow-up details and outcomes, including postoperative mortality, GERD recurrence and need for reoperation.
Quality assessments are reported qualitatively with special attention devoted to study design, confounders, biases and length of follow-up reporting. Outcomes are reported as proportions with 95% CIs. Pooled estimates are presented where appropriate, based on χ² testing for heterogeneity.
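The reporting approach described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the analysis actually performed in this review: the normal-approximation (Wald) interval, the Pearson χ² test of homogeneity across studies, simple pooling of raw counts, and all function names and study counts are assumptions introduced for illustration only.

```python
import math

def wald_ci(events, n, z=1.96):
    """Return a proportion and the half-width of its Wald 95% CI (assumed method)."""
    p = events / n
    return p, z * math.sqrt(p * (1 - p) / n)

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square statistic for integer df.

    Uses the recurrence Q(a+1, x) = Q(a, x) + x**a * exp(-x) / gamma(a+1),
    starting from Q(1, x) = exp(-x) or Q(1/2, x) = erfc(sqrt(x)).
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < df / 2.0 - 1e-9:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q

def heterogeneity(studies):
    """Pearson chi-square test of homogeneity for a list of (events, n) pairs."""
    total_events = sum(e for e, _ in studies)
    total_n = sum(n for _, n in studies)
    pooled = total_events / total_n
    df = len(studies) - 1
    if pooled in (0.0, 1.0):
        # No variation in outcome at all: nothing to test.
        return 0.0, df, 1.0
    stat = 0.0
    for events, n in studies:
        expected_events = n * pooled
        expected_non_events = n * (1 - pooled)
        stat += (events - expected_events) ** 2 / expected_events
        stat += ((n - events) - expected_non_events) ** 2 / expected_non_events
    return stat, df, chi2_sf(stat, df)

# Hypothetical (events, n) counts; NOT data from the reviewed studies.
studies = [(12, 25), (3, 40), (5, 60)]
for events, n in studies:
    p, half = wald_ci(events, n)
    print(f"{events}/{n}: {100 * p:.1f} ± {100 * half:.1f}%")

stat, df, p_value = heterogeneity(studies)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p_value:.3f}")

# Pool only if heterogeneity is not significant (threshold is an assumption).
if p_value >= 0.05:
    pooled_p, pooled_half = wald_ci(sum(e for e, _ in studies),
                                    sum(n for _, n in studies))
    print(f"pooled: {100 * pooled_p:.1f} ± {100 * pooled_half:.1f}%")
```

Under this sketch, a small heterogeneity P value signals that the study proportions differ more than chance would explain, in which case a pooled estimate is withheld.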

RESULTS
The literature search retrieved a total of 5302 articles, 36 of which met inclusion and exclusion criteria (Figure 1). A summary of the selected articles is presented in Table 1.

Quality assessment
The retrieved studies all constituted low-level evidence: five prospective comparative studies (level 2b) (9-13), four retrospective comparative studies (level 3b) (14-17) and 27 case series (level 4). Three of the prospective studies compared LF in neurologically impaired versus neurologically intact patients. There were no randomized controlled trials of LF versus medical therapy. All nonrandomized comparative studies compared LF with open fundoplication or minor technical modifications in the laparoscopic technique; none compared LF with medical therapy. There were three before-and-after studies (11-13). However, none of these studies clearly described the treatments, if any, that patients received before LF. Thus, the preoperative disease status was largely unknown, making the outcomes difficult to interpret (11-13).
Table 2 outlines the quality features of the selected articles. Only five studies attempted to control for confounders in their analysis. The factors controlled for varied among studies and included age, sex, comorbidities, respiratory status and surgical technique. The percentage of neurological impairment was reported in 78% of studies; however, the majority did not control for this factor in their analysis. Similarly, the rates of esophageal atresia were reported in 22% of studies, none of which controlled for its presence during analysis. No studies controlled for the confounding effects of the learning curve associated with LF.
Bias was present throughout, with only two studies clearly attempting to reduce bias (10,33). The first was a prospective study involving a cohort of institutionalized patients (10). To reduce detection bias, this study used universal pH testing at 12 months postoperatively to detect recurrence and had no patients lost to follow-up. However, the authors did not describe how patients were assigned to LF, raising the possibility of a significant selection bias (10). The second study (33) was a case series that used clear selection criteria for LF and routine pH testing at 12 months to detect recurrence, and had no loss to follow-up.
The remaining studies were prone to selection, detection, reporting and attrition bias. With regard to selection bias, the studies failed to adequately describe why patients were selected for LF over other surgical or medical treatments. They also failed to describe the background population from which the reported samples were derived. Detection bias was prevalent due to the lack of a definition of recurrence, which was only present in 17% of studies, as well as the reliance on patient- and caregiver-reported symptoms to trigger further investigations. Only two studies used a standardized follow-up interview with blinded assessors (12,13). Reporting bias was prevalent because studies failed to indicate which outcomes were being investigated a priori. Furthermore, studies failed to report the absence of many common complications, making it unclear whether these did not occur, were not investigated or were omitted. Finally, attrition bias was a common problem because patients were lost to follow-up over time, particularly in studies reporting extended follow-up periods. All studies failed to indicate the features of participants lost to follow-up compared with those remaining in the sample. Follow-up protocols were rarely reported.

Figure 1) Article selection outcomes
Mortality was reported by 58% of the studies. All mortalities were attributed to progression of the patients' underlying cardiac, neurological or respiratory disorders. Figure 2 illustrates the reported mortality rates with 95% CIs. The mean (± SD) pooled mortality rate in neurologically impaired children was found to be 17.9±4.9% (χ² for heterogeneity P=0.549).
Recurrence rates following LF were reported by 83% of studies. However, only six studies used an explicit definition of recurrence. The reported recurrence rates ranged between 0% and 48±19.6% (Figure 3). Significant heterogeneity existed among the studies, preventing pooling of data (χ² for heterogeneity P<0.001).
The need for reoperation following LF was reported in 50% of the studies. A consistent definition of indications for reoperation was absent. Some studies reported the percentage of re-do fundoplications due to recurrent GERD, while others included reoperation for wrap stenosis, pyloroplasty and recurrent hiatal hernia. The rate of reoperation was reported to be between 0.69±0.95% and 17.7±8.4% (Figure 4). Studies reporting neurologically impaired and neurologically normal populations were sufficiently homogeneous to allow pooling (χ² for heterogeneity P=0.16 and P=0.528, respectively). The pooled estimate for reoperation in neurologically impaired patients was 15.4±4.2%. The pooled estimate for reoperation in neurologically normal patients was 7.0±3.3%. The neurologically impaired group underwent significantly more reoperations than the neurologically normal group (χ² P=0.003).
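A comparison of two pooled proportions, such as the reoperation rates in the two groups above, can be sketched as a 2×2 Pearson χ² test. This is a hedged illustration only: the test variant (one degree of freedom, no continuity correction), the function name and the counts are assumptions. The counts are illustrative values chosen merely to give proportions near the reported pooled estimates; they are not the review's data.

```python
import math

def compare_two_proportions(events_a, n_a, events_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) for two proportions."""
    pooled = (events_a + events_b) / (n_a + n_b)
    stat = 0.0
    for events, n in ((events_a, n_a), (events_b, n_b)):
        expected_events = n * pooled
        expected_non_events = n * (1 - pooled)
        stat += (events - expected_events) ** 2 / expected_events
        stat += ((n - events) - expected_non_events) ** 2 / expected_non_events
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts (NOT taken from the reviewed studies):
# group A ~15.5% reoperation, group B ~7.0% reoperation.
stat, p_value = compare_two_proportions(44, 284, 16, 230)
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}")
```

A version with a continuity correction or Fisher's exact test would give somewhat larger P values for the same counts, so the choice of test should be stated whenever such a comparison is reported.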

DISCUSSION
LF is the current standard surgical treatment for refractory pediatric GERD. However, little is known about the long-term outcomes of this procedure and its true effectiveness. The available literature is of extremely poor quality according to evidence-based standards. Presently, there are no randomized trials or prospective cohort studies comparing LF with medical therapy. Thus, there is no conclusive evidence that surgery is superior to medical therapy (4).
In our review, 75% of studies were case series, which are known to favour the described intervention. They lead to false inferences up to 50% of the time (45). Multiple innovative treatments, once believed to be effective based on case series, were subsequently found to be no better than standard treatments when rigorously studied (45). Although nine of the included studies were comparative, none were randomized and three used historical controls. The use of historical controls is often confounded by changes in management, separate from the intervention in question, leading to false inferences in 40% to 60% of cases (45).
Studies frequently failed to adequately describe their study populations, diagnostic criteria, follow-up protocols and outcome measures. The majority of studies indicated the proportion of the study population that was neurologically impaired. However, most failed to analyze this population separately, despite its association with worse outcomes (46). We found a statistically significant difference in reoperation rates between these two groups in our study. Similar concerns exist over studies including patients with esophageal atresia (3). Mixed populations were reported by 57% of reviewed studies. The failure to report and control for underlying comorbidities makes it difficult to apply results to a given patient population (3).
There was also a consistent failure to explicitly outline the diagnostic criteria of GERD, the selection criteria for intervention and the outcome measures used. The lack of criteria for diagnosis and treatment likely relates to the lack of a standard definition of GERD in the pediatric population. An international panel of experts created guidelines in 2009 (1); however, most studies were published before 2009. Furthermore, these guidelines have yet to be universally adopted; therefore, recent studies fail to apply them (1,47). Comparing studies was further hindered by the lack of clear, a priori outcomes and standardized follow-up protocols. Many have called for the use of standardized outcome measures when reporting the efficacy of GERD treatments (3,47-49). However, as noted by Gold et al (49), validated outcome measures do not currently exist. Finally, patient- and parent-reported outcomes were heavily used in this body of literature, increasing the risk of a placebo effect, especially given the subjective nature of many of the reported outcomes. This is compounded by a potential second placebo response due to the natural history of GERD in children, the efficacy of nonpharmacological methods and expectation bias (50). All of these factors likely overestimate the true effectiveness of LF.
The biases present in the analyzed studies further limit the conclusions that can be drawn from them. Selection bias was present because clear inclusion and exclusion criteria were lacking from the majority of the studies. Thus, surgeons were likely to include patients they believed would benefit most from LF, and exclude those considered to be at increased risk of complications. The lack of clear a priori follow-up protocols means that many complications were likely not detected, not reported or both. Significant attrition bias existed due to the failure to account for patients lost to follow-up. The proportion lost to follow-up, as well as the reasons for that loss, was not reported by any of the reviewed studies.
Recurrence rates varied widely. The largest study of open fundoplication, a multicentre retrospective series of 7467 cases (51), reported a recurrence rate of 7%. However, this was probably a gross underestimation because the study was poorly designed to detect recurrence. Objective testing for GERD using esophagogastroduodenoscopy or pH probe was only performed in 54% and 26%, respectively, and the results of this objective testing were not reported (51). The study presented no standard follow-up, no standard assessment of outcomes and no quality of life measures (51). The wide range of recurrence rates in the current review is likely due to variations in the definition of recurrence and generally poor follow-up data. Studies with very high recurrence rates may have erroneously implicated GERD as the reason for the patient's symptoms (5,52). For example, a recurrence rate of 48±19.6% was found in a population with respiratory disease attributed to GERD (40). The link between respiratory disease and GERD is not firmly established, and several studies have shown no benefit to LF when performed for respiratory indications (48,53,54).
Reoperation was required in all 18 studies reporting this outcome. The indications for reoperation varied among studies and included recurrence, wrap failure, esophageal stenosis, recurrent hiatal hernia and pyloroplasty for postoperative delayed gastric emptying. The pooled estimate of reoperation rate in neurologically impaired patients was 15.4±4.2% versus 7.0±3.3% in neurologically normal patients. Neurologically impaired patients also had the highest mortality rates, along with patients undergoing fundoplication before their first birthday (19,36). These differences support the notion that neurologically impaired patients experience worse outcomes after LF (55). LF should be considered a palliative procedure for many of these patients, and the surgeon should openly share the outcome information to provide the family with realistic expectations.
Prospective studies comparing LF with medical therapies are needed. Although a randomized controlled trial would be optimal, it is unlikely to occur due to the wide adoption of LF by surgeons and parents alike as a common treatment option for pediatric GERD. A well-designed multicentre prospective cohort study using matched controls is a more realistic option. Such a study should aim to provide long-term follow-up and subgroup analysis to determine the long-term effects of LF in different populations. In the meantime, adoption of a universal definition of GERD may allow more objective selection of patients for LF (47). The use of standard outcome measures, as suggested by Gold et al (49), can create uniformity in the literature, allowing meaningful comparisons among studies. The enthusiasm for LF in children should be tempered until higher-quality evidence is available to support its long-term efficacy.

SUMMARY
Pediatric LF is one of the most common operative procedures performed in children. Multiple case series and several reviews of pediatric LF have been published over the past 15 years. However, none of the reviews critically evaluated the quality of the literature from an evidence-based perspective. We performed a critical review of the literature on pediatric LF outcomes over a 15-year period. Our results indicate that the level and quality of the evidence supporting LF are extremely poor. Higher-quality data are required before the procedure can be considered to be an effective intervention for the treatment of pediatric GERD.