Prediction scores do not correlate with clinically adjudicated categories of pulmonary embolism in critically ill patients

1Department of Medicine, McMaster University, Hamilton, Ontario; 2Department of Clinical and Experimental Medicine, University of Insubria, Varese, Italy; 3Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton; 4Interdepartmental Division of Critical Care, University of Toronto, Toronto, Ontario; 5Department of Medicine, Dalhousie University, Halifax, Nova Scotia; 6Department of Medicine, University of Alberta, Edmonton, Alberta; 7Department of Medicine, University of British Columbia, Vancouver, British Columbia; 8Department of Medicine, University of Ottawa, Ottawa, Ontario; 9Mayo Clinic, Rochester, Minnesota, USA; 10Department of Medicine, Queen's University, Kingston, Ontario; 11Department of Medicine, Universite de Montreal, Montreal, Quebec; 12Department of Critical Care, St Louis University, St Louis, Missouri; 13Department of Medicine, Brown University, Providence, Rhode Island; 14Department of Critical Care, Division of Anesthesiology and Critical Care, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA; 15Department of Critical Care, Flinders University, Camperdown, New South Wales, Australia; 16Department of Medicine, University of Calgary, Calgary, Alberta

Correspondence: Dr DJ Cook, Departments of Medicine, Clinical Epidemiology & Biostatistics, McMaster University, 1280 Main Street West, Room 2C11, Hamilton, Ontario L8S 4K1. Telephone 905-525-9140 ext 22900, e-mail debcook@mcmaster.ca

Pulmonary embolism (PE) is a common complication in critical illness (1), with a mortality rate of up to 25% (2). Although PE has potentially serious consequences, it is often unrecognized in critically ill patients. Left undiagnosed, PE may have catastrophic consequences in critically ill patients who have impaired cardiopulmonary reserve (3). In a 25-year longitudinal study, 9% of hospital patients had PE at autopsy and, in 84% of these, the diagnosis was missed before death (4,5).
Even in critically ill patients, PE remains one of the most common unsuspected autopsy findings (6). PE is particularly difficult to diagnose in critically ill patients. Diagnosis requires a high index of clinical suspicion (7,8) because critically ill patients are usually unable to communicate their symptoms due to their underlying condition, pharmacotherapy and mechanical ventilation. In addition, signs and symptoms, such as dyspnea, tachycardia, hypoxemia and hypotension, which are suggestive of PE in non-intensive care unit (ICU) settings, are considerably more common in the ICU setting and attributable to many other factors. Tests that may be suggestive of physiological alterations compatible with PE (eg, decreased oxygen saturation, increased plasma troponin concentration) are often nonspecifically abnormal in critically ill patients.
Clinical decision rules (9-13) are used in medicine to provide pretest probabilities and guide decision making. Due to the silent nature of some PEs (14), simple, objective diagnostic scoring systems could be helpful in diagnosing PE. These prediction scores are often detailed in the chart or used in conversation with the ICU team because they are the only scores developed for PE. These scores have utility in patient populations in which they were developed and validated, in addition to other patient groups. Although these PE scores have been developed and tested in the emergency department, they have not been validated in the ICU setting. We aimed to establish whether these PE scores have discriminative power in the critically ill population. The objective of the present study was to evaluate whether two diagnostic PE scores (the Geneva and Wells scores) were useful in distinguishing critically ill patients who had possible, probable or definite PE according to clinical adjudication.

Methods
The present preplanned study was conducted using the database from a recent international trial (Prophylaxis of Thromboembolism in Critical Care [PROTECT]; ClinicalTrials.gov number: NCT00182143) that compared the low-molecular-weight heparin (LMWH) dalteparin with unfractionated heparin (UFH) for thromboprophylaxis in 3746 medical-surgical ICU patients (15). The study was conceived as a project under the 'PE-METRICS' program, which was designed to use the infrastructure of the PROTECT trial to understand the methodology, epidemiology and treatment of PE in critically ill patients. The PE-METRICS grant was submitted while PROTECT was enrolling patients to conduct work related to PE after the main publication. This was a peer-reviewed, funded research program. Ethics approval was obtained as part of the PROTECT publication.
In PROTECT, patients were routinely screened with twice-weekly compression ultrasound for proximal leg deep vein thrombosis (DVT). However, PE detection did not involve screening. Patients who developed suspected PE were investigated and managed by the local ICU team using a predetermined diagnostic algorithm. First, these patients underwent bilateral leg ultrasound. Then, chest computed tomography pulmonary angiogram was performed in the 70 patients who did not have contraindications to this procedure. Thereafter, four members of a central adjudication committee (DC, MM, SM and RH) each independently adjudicated, using trial forms and the patient's chart, all cases of suspected PE. Adjudicators resolved disagreements by consensus. PE events were adjudicated as possible, probable or definite, and are defined in Table 1. Definite PE was defined by a clearly positive test (such as characteristic intraluminal filling defect on chest computed tomography or high-probability ventilation-perfusion scan). Probable PE was defined by a high clinical suspicion (moderate or high pretest probability) and either a nondiagnostic test for PE or no test for PE. Possible PE was defined as low clinical suspicion (low pretest probability) and a nondiagnostic test for PE. 'No test for PE' is not part of the definition of a possible PE because the clinical concern had to be sufficient to order a test unless the patient was moribund or preterminal. 'No PE' was defined as either no test for PE or a clearly negative test for PE. A nondiagnostic test was defined as an inconclusive test for PE and did not include negative tests (15). In the present study, patients who had clinically suspected prevalent PE (diagnosed within 72 h of ICU admission) or incident PE (diagnosed >72 h following ICU admission) were included.

Geneva and Wells scores
Appendix Tables 1 to 4 summarize the four scoring systems (9-13) used in the present study. The maximum Geneva score of 16 corresponds to a pretest probability of 81%; a score of ≤4 corresponds to a pretest probability of 10.3% (9,11). The Wells PE Diagnostic score has a maximum score of 12.5, which corresponds to a pretest probability of 40.6%; a score of ≤4 corresponds to a pretest probability of <7.8% (10). The Modified Wells PE Diagnostic score (13) has a maximum score of 9; a score of ≤4 corresponds to a pretest probability of 6% and a score of >4 corresponds to a pretest probability of 78%. The Simplified Wells PE Diagnostic score (12) has a maximum score of 7, which corresponds to a pretest probability of 62%; the minimum score of 1 corresponds to a pretest probability of 12%.

Pilot exercise
A pilot exercise was conducted to examine and optimize interobserver agreement in preparation for the full study. In duplicate and independently, two research personnel (CK and MD) retrospectively abstracted data relevant to the scoring systems from medical records in a computerized ICU clinical information system (CareVue, Philips Inc, USA), written clinical notes, laboratory and other test results from the day that PE was suspected (within 12 h). Using pretested forms and an implementation manual, the two blinded raters abstracted data regarding signs and symptoms of PE, and data to calculate diagnostic scores, in a reliability and calibration exercise on one trial patient; subsequently, the two blinded raters abstracted 18 items (six symptoms, eight signs, two tests and two scores) from the medical records of four trial patients. These individual variables were defined as per previous studies (16): six symptoms (dyspnea, pleuritic chest pain, substernal chest pain, cough, hemoptysis and syncope), eight signs (fever >38°C, tachypnea with respiratory rate >30 breaths/min, tachycardia [heart rate >100 beats/min], hypotension [systolic blood pressure <100 mmHg], central cyanosis, oxygen saturation <90%, physical signs of DVT such as calf pain, unilateral calf swelling or pain on flexion, and cardiopulmonary arrest) and two test results (arterial partial pressure of oxygen and echocardiographic findings of right heart strain). Data were also abstracted to calculate the Geneva Diagnostic score and the original Wells score. Chance-corrected agreement (using the original interpretation by Fleiss [17]) was calculated between the two raters' measures of each dichotomous variable and each score.
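The chance-corrected agreement described above can be illustrated with a minimal sketch. It assumes the unweighted Cohen's kappa for two raters, a variant the text does not name explicitly; the function name `cohen_kappa` and the ratings shown are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters rating the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: proportion of items on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical dichotomous ratings (1 = finding present, 0 = absent).
# These values are illustrative only and do not come from the study.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
rater_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")  # kappa = 0.80
```

In this illustration the raters agree on nine of 10 items (observed agreement 0.90) while chance alone would produce agreement of 0.50, so kappa is 0.80. Kappa is lower than raw agreement whenever some of the agreement could have arisen by chance.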
Chance-corrected agreement values on the initial pilot exercise for symptoms were: dyspnea κ=0.82, pleuritic chest pain κ=0.97, substernal chest pain κ=0.90, cough κ=0.85, syncope κ=0.85; for signs: fever κ=0.71, tachypnea κ=0.78, tachycardia κ=0.60, hypotension κ=0.95, hemoptysis κ=0.85, cyanosis κ=0.59, desaturation κ=0.91, physical signs of DVT κ=0.87, cardiopulmonary arrest κ=0.88; and for test results: arterial blood gas PO2 κ=0.70, echocardiographic signs of right heart strain κ=0.90. For PE diagnostic scores, agreement was: Geneva diagnostic score κ=0.53, original Wells score κ=0.71 (Table 2).
Case report forms were distributed to research personnel in 67 centres that participated in the PROTECT trial (15). Blinded to study drug and PE adjudication status, research personnel (physicians or research coordinators) at each centre retrospectively abstracted data to calculate four scores: the Geneva score, original Wells score, Modified Wells score and Simplified Wells score. The research personnel who abstracted the data were asked to tabulate the points for each score based on interpreting the physicians' notes, nurses' notes, laboratory or other test values. The attempt was made to make assessments and assign numerical points as per clinical practice.

Analysis
To examine the relationship among the three adjudicated categories of possible, probable and definite PE, and each of the low, intermediate and high pretest probability categories of PE on the Geneva and Wells scores, respectively, chance-corrected agreement was calculated using kappa and its original interpretation by Fleiss (17). ANOVA was used to examine the association between clinically adjudicated categories of PE and values for each of the four diagnostic scores. P<0.05 indicated that a diagnostic score was significantly different across three adjudicated categories of possible, probable and definite PE.
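The ANOVA comparison described above can be sketched as follows, assuming a standard one-way (single-factor) ANOVA, which the text does not spell out. The function name `one_way_anova_f` and the score values are hypothetical and are not study data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their group mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    if ss_within == 0.0:
        raise ValueError("F is undefined when within-group variance is zero")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical diagnostic scores by adjudicated category.
# These values are illustrative only and do not come from the study.
possible = [2.0, 4.0, 6.0]
probable = [3.0, 5.0, 7.0]
definite = [4.0, 6.0, 8.0]

f_stat, df_b, df_w = one_way_anova_f([possible, probable, definite])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 0.75
```

The P value is then obtained by comparing F with the F distribution on the two degrees of freedom; a statistics package (for example, `scipy.stats.f_oneway`) would return it directly. With three adjudicated categories and 70 patients, the corresponding degrees of freedom would be 2 and 67.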

Results
There were 3746 patients included in the PROTECT trial. There were 70 patients in the final study, including the five pilot patients, reflecting all 70 patients who were adjudicated for PE in the PROTECT trial; these patients were cared for in 30 of the 67 participating centres (Table 3). This yields an incidence of 1.9% (70 of 3746) of patients who were adjudicated for PE. Of the 70 PEs in the present study, 10 were prevalent and 60 were incident. Of 70 patients, four were adjudicated as 'possible PE', 16 as 'probable PE' and 50 as 'definite PE'. Agreement was poor between adjudicated categories of PE and both Geneva score pretest probabilities (κ=0.01 [95% CI −0.0643 to 0.0941]) and Wells score pretest probabilities (κ=−0.03 [95% CI −0.1462 to 0.0914]). Across four patients who had possible, 16 patients who had probable and 50 patients who had definite PE, there were no significant differences in total Geneva scores (possible = 4.0, probable = 4.7, definite = 4.5; P=0.90), total original Wells scores (possible = 2.8, probable = 4.9, definite = 4.1; P=0.37), Modified Wells scores (possible = 2.0, probable = 3.4, definite = 2.9; P=0.34) or Simplified Wells scores (possible = 1.8, probable = 2.8, definite = 2.4; P=0.30) (Table 4).
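The counts reported above can be cross-checked with simple arithmetic. The following sketch uses only the numbers stated in this section; the variable and function names are illustrative.

```python
# Counts as reported in the Results section.
TRIAL_PATIENTS = 3746
ADJUDICATED = 70
by_timing = {"prevalent": 10, "incident": 60}
by_category = {"possible": 4, "probable": 16, "definite": 50}


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Both breakdowns should account for all 70 adjudicated patients.
assert sum(by_timing.values()) == ADJUDICATED
assert sum(by_category.values()) == ADJUDICATED

# Proportion of trial patients who were adjudicated for PE.
incidence_pct = percent(ADJUDICATED, TRIAL_PATIENTS)
print(f"Adjudicated for PE: {incidence_pct:.1f}%")  # 1.9%

# Share of each adjudicated category among the 70 patients.
category_share = {
    name: percent(count, ADJUDICATED) for name, count in by_category.items()
}
for name, share in category_share.items():
    print(f"{name}: {share:.1f}%")
```

The category shares make the imbalance explicit: most of the 70 patients were adjudicated as definite PE, and only four as possible PE, which limits the precision of any comparison across the three categories.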

Discussion
Among 70 patients with a clinical suspicion of PE adjudicated in the PROTECT trial (15), agreement was poor between adjudicated categories of PE and each of the Geneva score and three Wells scores. Across the three adjudicated categories of PE, there were no significant differences in total Geneva scores, Wells scores, Modified Wells scores or Simplified Wells scores. We conclude that many physiological variables used in these models, shown to be valuable in the ambulatory setting, are of questionable utility when applied to intubated, critically ill patients. Two possible explanations for the poor agreement between adjudicated categories and prediction scores relate to the population we studied and the quality of information in the medical charts. Regarding the first explanation, some of the variables required to calculate these scores cannot be discerned in critically ill patients (eg, dyspnea) and are, therefore, nondiscriminatory. All patients in PROTECT were considered to be 'at risk for' DVT and PE. It is possible that the incidence of venous thromboembolism in this trial was decreased as a consequence of universal prophylaxis with either UFH or LMWH. Furthermore, it is possible that patients receiving thromboprophylaxis may have different signs and symptoms of PE than patients not receiving prophylaxis, although, to our knowledge, this has not been studied. Bahia and Albert (13) demonstrated that these clinical scores accurately predict PE in prophylaxed hospitalized patients. There have been no studies in the literature that document the utility of these scores in thromboprophylaxed critically ill patients.
Regarding the issue of information quality, the original data for this substudy, and the original trial data collected by research coordinators on which the adjudication was based, were from patient medical records, which are known, in both paper and electronic formats, to contain errors of over- (18) and under- (19) documentation on the part of nurses (20), trainees (21,22) and staff physicians (23). In other words, some components of the scoring systems may be too challenging to detect in ICU patients, whereas other predictor variables may be present or absent, but the information may not be recorded in the medical charts.
Strengths of the present study include the development, testing and refinement of data abstraction tools in a pilot exercise that documented excellent inter-rater reliability before starting the full study. To avoid ascertainment bias, data for the scoring systems were abstracted blinded to study drug, participating centre and adjudicated outcome. Similarly, the original PE events were adjudicated in quadruplicate, blinded to study drug, participating centre and other adjudicator assessments. Given the lack of previous evidence to evaluate well-known scoring systems for the diagnosis of PE in critically ill patients, we designed the present study a priori to examine the utility of these diagnostic scores in the ICU setting.
Limitations to the present study include the relatively small number of patients with PE. Because this thromboprophylaxis trial did not protocolize screening for PE, as per usual practice, not all patients underwent the same testing to diagnose PE (15). We did not compare scores in patients with suspected PE who were subsequently proven not to have PE based on objective testing, or in patients with no suspicion whatsoever of PE, thereby replicating practice. However, silent PE may be relatively common, as suggested in a recent study involving 176 medical-surgical ventilated ICU patients requiring thoracic computed tomography. In this cohort, 33 (18.7%) had PE, including 20 (61%) with no clinical suspicion (24). Unlike some patients studied in the original prediction score studies, ICU patients in the present study were all receiving either UFH or LMWH thromboprophylaxis and had poor cardiopulmonary reserve, which may have influenced the threshold of concern when various signs or symptoms were found. The original prediction score studies were also not necessarily completed in thromboprophylaxed patients; however, the scores have been found to be discriminatory in this population (13).
Our findings may not be generalizable to all types of critically ill patients (eg, trauma, neurosurgery or cardiac surgery patients, who were not enrolled in the present study).
We did not apply the prediction scores to all patients, including those in whom PE was never considered as a clinical problem, as per the original score development process.We chose our methods to best approximate the use of the prediction scores in some critical care practices, abstracting data for these scores in patients in whom the clinical suspicion of PE existed.
Future PE scoring systems would ideally incorporate other tests such as troponin, B-type natriuretic peptide values or echocardiographic findings. Specifically, elevated troponin levels predict short-term mortality, as shown in a meta-analysis of 20 studies in general hospitalized patients with acute PE and normal blood pressure (OR 5.24 [95% CI 3.28 to 8.38]) (25). Another meta-analysis demonstrated a higher risk of death associated with specific echocardiographic findings (OR 2.4 [95% CI 1.3 to 4.3]) and elevated B-type natriuretic peptide (OR 7.7 [95% CI 2.9 to 20]) in patients with hemodynamically stable PE (26).

Conclusion
Pretest probability models developed and validated outside the ICU setting do not correlate with clinically suspected PE in the ICU. Further clinical research is needed to identify features that help to reliably identify patients with PE in this setting, and to develop a practical clinical prediction rule for critically ill patients with suspected PE. Ideally, the latter would incorporate a complete spectrum of risk using readily available clinical, physiological and laboratory tests. Such new prediction models could be of great value to supplement clinical judgment, and aid in more timely identification and appropriate treatment of PE and, possibly, improved patient outcomes.
Author contributions: DJC is the guarantor of the content of the manuscript, including the data and analysis. CK and MD constructed and piloted the case report forms. CK and DJC conceived of the study and drafted the manuscript. DJC, MM, SM and RH adjudicated the patients with PE. NZ coordinated global data collection. CY contributed to manuscript editing regarding statistical analysis. DHA conducted the statistical analysis. All authors contributed to the PROTECT trial and this substudy, and read and approved the final manuscript.

The authors thank the patients and families who agreed to participate in this study, and the Research Coordinators and Site Investigators in all participating centres (see list). The authors thank Chenglin Ye for helpful comments on earlier drafts of this manuscript and the anonymous peer reviewers for their assistance and suggestions reviewing earlier iterations.


Table 1
Definitions of pulmonary embolism (PE) adjudication categories

PE category: Possible
Definition: Possible PE was defined as low clinical suspicion (low pretest probability) and a nondiagnostic test for PE. 'No test for PE' is not part of the definition of a possible PE because the clinical concern had to be sufficient to order a test unless the patient was moribund or preterminal

This study was funded by the Canadian Institutes of Health Research, the Heart and Stroke Foundation of Canada, the Australian and New Zealand College of Anesthetists Research Foundation and Physicians Services Incorporated. R Fowler is a Clinician Scientist of the Heart and Stroke Foundation. M Crowther holds a Career Investigator Award from the Heart and Stroke Foundation of Ontario and the Leo Pharma Chair in Thromboembolism Research at McMaster University and St Joseph's Healthcare, Hamilton. Dr Cook is a Research Chair of the Canadian Institutes of Health Research. Dr Crowther sat on advisory boards for Leo Pharma, Pfizer, Bayer, Boehringer Ingelheim, Alexion, CSL Behring and Artisan Pharma, and received funding for a presentation from Leo Pharma. Dr Crowther's institution has received funding for research projects from Boehringer Ingelheim, Octapharm, Pfizer and Leo Pharma. There are no financial/nonfinancial disclosures or conflicts of interest for any of the other authors.

Alberta
• Dr Kosar Khwaja, Laura Banici, Carole Sirois, Lena Havell; Pharmacists Gilbert Matte and Kathleen Normandin; Montreal General Hospital, Montreal, Quebec
• Dr Gordon Wood, Fiona Auld, Leslie Atkins; Pharmacist John Foster-Coull; Vancouver Island Health Authority, Vancouver, British Columbia
• Drs Olivier Lesur and Francois Lamontagne, Sandra Proulx; Pharmacist Sylvie Cloutier, Brigitte Bolduc, Marie-Pierre Rousseau, Julie Leblond; Sherbrooke University Hospital and Centre de Recherche Clinique Étienne-Le Bel, Sherbrooke, Quebec
• Dr Kosar Khwaja, Laura Banici, Carole Sirois, Lena Havell; Pharmacists Gilbert Matte and Kathleen Normandin; Royal Victoria Hospital, Montreal, Quebec
• Drs Gerald Hollinger and Vasanti Shende, Vanessa Belcastro; Pharmacist Jane Martin; Guelph General Hospital, Guelph, Ontario
• Dr Bill Plaxton, Anders Foss; Pharmacy Technicians Heather McDougall, Sharon Morris and Goran Petrovic; Grand River Hospital, Kitchener, Ontario
• Dr Bojan Paunovic, Kym Wiebe, Nicole Marten; Pharmacist Denise Sawatzky; St Boniface Hospital, Winnipeg, Manitoba
• Dr Jonathan Eisenstat, Tammy Doerle; Pharmacist Linda Skinner; Lakeridge Health, Oshawa, Ontario
• Drs Steven Reynolds and Sean Keenan, Sheilagh Mans; Pharmacist Ray Jang; Surrey Memorial Hospital, Surrey, British Columbia
• Dr Michael Sharpe, Mona Madady; Pharmacist Chandika Mankanjee; London Health Sciences Center, London, Ontario

Australian investigators
• Drs Jamie Cooper (Lead) and Andrew Davies, Shirley Vallance, Cindy Weatherburn, Jasmin Board, Victoria Bennett; Pharmacists Anne Mak and Sook Wern Chua; Alfred Hospital, Melbourne
• Drs Simon Finfer and Naresh Ramakrishnan (deceased), Simon Bird, Julie Potter, Anne O'Connor, Susan Ankers; Pharmacist Maggie Gibson; Royal North Shore Hospital, Sydney
• Dr Jack Cade, Deborah Barge, Tania Caf, Belinda Howe; Pharmacist Emma Michael; Royal Melbourne Hospital, Melbourne
• Dr Rinaldo Bellomo, Glenn Eastwood, Leah Peck, Donna Goldsmith, Kim O'Sullivan; Lead Pharmacists Dr Michael Ching, Jean Schmidt, Mei Ho and Bailey Lim; Austin Hospital, Melbourne
• Drs David Ernest, Sam Radford, Ann Whitfield and Anthony Cross, Suzanne Eliott, Jaspreet Sidhu, Belinda Howe, Inga Mercer, Angela Hamilton (deceased); Pharmacist Paula Lee; Box Hill Hospital, Melbourne
• Dr John Botha, Jodi Vuat, Sharon Allsop, Nina Fowler; Pharmacist Chui Yap; Frankston Hospital, Frankston
• Drs Tim Crozier, Jonathan Barrett and Chris Wright, Pauline Galt, Carly Culhane, Rebecca Ioannidis, Sue Burton, Marnie Reily,
• Dr Chip Doig, Linda Knox, Crystal Wilson, Kevin Champagne; Pharmacist Angela Kayall Peters; Calgary University Foothills Hospital, Calgary, Alberta
• Dr Niall Ferguson, Andrea Matte, James Stevenson, Joel Elman, Madison Dennis; Pharmacist Jenn Tung, Robert Solek, Kim De Freitas, Nga Pham; University Health Network, Toronto Western Hospital, Toronto, Ontario