Community-acquired pneumonia (CAP) is common and associated with significant mortality [
The most widely known, well-validated, and commonly used risk prediction models are CURB-65 [
In this systematic review, we provide a comprehensive and up-to-date overview of the existing published risk prediction models for mortality in community-acquired pneumonia. We did not include scores which were designed to predict ventilatory and vasopressor support because of the inconsistency in decisions to provide these therapies depending on treatment site. We also aim to summarize the key features of each model such as variables used, risk stratification, and the comparative performance in terms of sensitivity, specificity, balanced accuracy, and area under the curve (AUC) values so that practitioners can make an informed choice.
We selected studies that were the first to report the derivation or validation of each risk prediction model for predicting mortality in CAP. There was no restriction on the type of study (prospective or retrospective) or country of origin. For pragmatic reasons, we excluded studies that aimed to carry out further testing of risk models that had already been validated once and reported, as there are several validation studies for commonly used scores such as PSI and CURB-65. In such instances, we have used pooled data from published meta-analyses where available [
We searched MEDLINE, EMBASE, and Cochrane Central Register of Controlled Trials with no date limitations in November 2011 using the search terms listed in Supplementary Material 1 available online at
Two reviewers (Chun Shing Kwok, Kenneth Woo) scanned all titles and abstracts to select studies that met the inclusion criteria. Full reports (where available) of potentially relevant studies were retrieved and independently checked by the other two reviewers (Yoon K. Loke, Phyo Kyaw Myint). Where there was any uncertainty or discrepancy, the article was discussed among the reviewers to determine whether the study should be included. We also contacted authors if there were any areas that required clarification. Data were collected using a standardized form by two authors independently (Chun Shing Kwok, Kenneth Woo), and this was checked by Yoon K. Loke. Data were collected on score name, setting for score application, year of study, country of origin, participant selection criteria, methodology for diagnosis of pneumonia, outcomes assessed, definition of severe pneumonia, participant characteristics, loss to followup in the study, and the results. Data relating to study methodology, such as risk of confounding and statistical methods, were also collected for the quality assessment. The primary measure of interest was the area under the receiver operating characteristic curve (AUROC), as this reflects the overall discriminant ability of the risk prediction model; where this was not reported, we calculated balanced accuracy as (sensitivity + specificity)/2.
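The balanced accuracy calculation can be illustrated with a minimal Python sketch. The function name `balanced_accuracy` is hypothetical and introduced only for illustration; the input figures are the 62% sensitivity and 92% specificity reported in this review for the mortality risk index derivation cohort.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Return (sensitivity + specificity) / 2, with both inputs as proportions in [0, 1]."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return (sensitivity + specificity) / 2


# Figures reported for the mortality risk index derivation cohort:
# 62% sensitivity and 92% specificity.
mri_balanced_accuracy = balanced_accuracy(0.62, 0.92)
print(f"{mri_balanced_accuracy:.2f}")  # prints 0.77
```

Because the two components are averaged with equal weight, a score with high specificity but modest sensitivity can reach the same balanced accuracy as one with the opposite profile.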
We also extracted results of existing meta-analyses on pneumonia risk prediction models [
Quality assessment was performed by Chun Shing Kwok using a methodological checklist for prognostic studies from the National Institute for Health and Clinical Excellence [
Due to the nature of this systematic review, we did not intend to conduct meta-analysis but planned to summarize the main findings descriptively in tables and figures. In particular, we evaluated key performance parameters (AUROC, balanced accuracy, sensitivity and specificity) for each scoring system and depicted this graphically according to the frequency of variables required for the calculation of the score. For these plots, we used validation study or meta-analysis results where available. We conducted additional subgroup analysis restricted to studies that used prospectively collected datasets, which may potentially be of greater validity than retrospective evaluations.
From the 1,947 titles and abstracts, 93 articles were selected for detailed review (Figure
Characteristics of derivation and validation studies which predict pneumonia mortality.
Paper | Score | Design | Setting | Year | Country | Inclusion | CAP diagnosis | Mortality outcome |
---|---|---|---|---|---|---|---|---|
BTS 1987 [ | British Thoracic Society Score 1, 2, 3 | Prospective | Hospital | November 1982 to December 1983 | UK | Adults aged 15–74 years with pneumonia | Acute illness with radiological pulmonary shadowing which was neither preexisting nor of another known cause. | Mortality |
Farr et al. 1991 [ | British Thoracic Society Score 1, 2, 3 | Retrospective | Hospital | January 1984 to 1986 | United States | Adults aged from 15 to 80 years with the diagnosis of pneumonia | Acute respiratory illness contracted in the community and accompanied by a new radiographic infiltrate | Mortality |
Leroy et al. 1996 [ | Mortality risk index | Combined retrospective and prospective | ICU | Derivation January 1987–December 1992. Validation January 1993–December 1994 | France | Adult patients aged >16 admitted to the intensive care and infectious disease unit with the diagnosis of CAP | Admission from home or a nursing home with the presence of pulmonary infiltrate on CXR and acute onset of clinical features of pneumonia | Mortality in ICU |
Neill et al. 1996 [ | CURB | Prospective | Hospital | July 1992 to 1993 | New Zealand | Adults with pneumonia without severe immunosuppression | Acute illness radiographic pulmonary shadowing with neither preexisting nor another known cause | Mortality |
Fine et al. 1997 [ | Pneumonia severity index | Prospective | Hospital (inpatients and outpatients) | 1989, 1991–1993 | United States and Canada | Adults aged >18 years with diagnosis of pneumonia | ICD-9-CM diagnosis of pneumonia | 30-day mortality |
Lim et al. 2003 [ | CURB-65, CRB-65 | Retrospective analysis of prospectively collected data | Hospital | 1998–2000 | UK, New Zealand, and The Netherlands | Adults with CAP | Acute respiratory tract illness associated with radiographic shadowing on an admission chest radiograph | 30-day mortality |
Ewig et al. 2004 [ | Modified American Thoracic Society Rule | Prospective | Hospital | June 1998–May 2001 | Spain | All patients presenting with CAP in a university hospital between June 1998 and May 2001 | New pulmonary infiltrate with symptoms and signs of a lower respiratory tract infection | 30-day mortality |
Myint et al. 2006 [ | SOAR | Prospective | Hospital | NA | UK | Clinical features of pneumonia and new CXR shadow | Clinical features of pneumonia and new CXR shadow | 42-day mortality |
Myint et al. 2007 [ | CURB age | Prospective | Hospital | NA | UK | Clinical features of pneumonia and new CXR shadow | Clinical features of pneumonia and new CXR shadow | 42-day mortality |
Escobar et al. 2008 [ | Abbreviated Fine Score | Retrospective | Hospital | 2000–2002, 2004-2005 | United States | All nonobstetric, nonpsychiatric patients aged >18 years with pneumonia | ICD codes defined by Fine et al | 30-day mortality |
Shindo et al. 2008 [ | A-DROP | Retrospective | Hospital | November 2005–January 2007 | Japan | Patients with CAP | Pneumonia in a patient who was not hospitalized and who was carrying on with activities of daily living | 30-day mortality |
Myint et al. 2009 [ | CURB age | Prospective | Hospital | 2006–2008 | UK | Patients with CAP | Acute illness with clinical features of lower respiratory tract infection characterized by new radiographic shadowing | 30-day mortality |
Myint et al. 2009 [ | CURSI | Retrospective | Hospital | September 2004 to July 2005 | UK | Patients with CAP | ICD-10 codes diagnosis of pneumonia | Inpatient mortality |
Rello et al. 2009 [ | PIRO score | Prospective | ICU | NA | Spain | Patients aged >18 years with pneumonia | Pneumonia confirmed by CXR and clinical findings | 28-day mortality |
Liapikou et al. 2009 [ | IDSA/ATS 2007 | Prospective | Hospital | January 2000–2007 | Spain | Patients aged >15 years who were admitted to the emergency department for CAP in a university hospital from January 2000 through 2007 | New pulmonary infiltrate on admission chest radiograph and symptoms and signs of lower respiratory tract infection | 30-day mortality |
Uchiyama et al. 2010 [ | PARB | Retrospective | Hospital | March 2006 to November 2008 | Japan | Adult patients with CAP | Unclear | 30-day mortality or needing >2 weeks of oxygen therapy |
Myint et al. 2010 [ | CURSI, CURASI | Prospective | Hospital | 2006–2008 | UK | Clinical features of pneumonia and new CXR shadow | Clinical features of pneumonia and new CXR shadow | 42-day mortality |
Musonda et al. 2011 [ | CARSI, CARASI | Prospective | Hospital | 2008 | UK | Patients with clinical and radiological features of CAP from 3 hospitals in the UK | Clinical features of pneumonia (cough, sputum, and shortness of breath, with or without fever) and new CXR shadow | 30-day mortality |
ICU: intensive care unit; CXR: chest X-ray; CAP: community-acquired pneumonia.
Search results and study selection.
Study validity is summarized in Supplementary Material 4. One major limitation is that only 14 of the risk prediction models had validation data, whereas 6 reported findings from derivation studies (SOAR, AFSS, PARB, PIRO, CARSI, and CARASI) without further validation [
The frequency of variables which were used more than once in the models and their occurrence in individual scores is shown in Table
Frequency of variables used in prognostic or severity scores in community-acquired pneumonia.
Score | Patient characteristics | Clinical variables | Laboratory measures | Radiological findings | Management | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | Gender | Immuno | Renal disease | Pulse | BP | RR | Temp | Shock | Confusion | Urea/BUN | WCC | PaO2/SaO2 | Haematocrit | Glucose | Sodium | pH | Pleural effusion | Multilobar pneumonia | Mechanical ventilation |
BTS 1 | + | + | + | |||||||||||||||||
BTS 2 | + | + | + | |||||||||||||||||
BTS 3 | + | + | + | + | ||||||||||||||||
MRI | + | + | + | + | ||||||||||||||||
CURB | + | + | + | + | ||||||||||||||||
PSI | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | |||||
CURB65 | + | + | + | + | + | |||||||||||||||
CRB65 | + | + | + | + | ||||||||||||||||
mATS | + | + | + | + | + | |||||||||||||||
SOAR | + | + | + | + | ||||||||||||||||
AFSS | + | + | + | + | + | + | + | + | + | + | + | + | ||||||||
A-DROP | + | + | + | + | + | + | ||||||||||||||
CURB-age | + | + | + | + | + | |||||||||||||||
PIRO score | + | + | + | + | + | |||||||||||||||
IDSA/ATS 2007 | + | + | + | + | + | + | + | + | + | + | ||||||||||
PARB | + | + | + | |||||||||||||||||
CURSI | + | + | + | + | + | |||||||||||||||
CURASI | + | + | + | + | + | + | ||||||||||||||
CARSI | + | + | + | + | + | |||||||||||||||
CARASI | + | + | + | + | + | + |
BP: blood pressure; RR: respiratory rate; BUN: blood urea nitrogen; WCC: white cell count.
Some of the risk prediction models also required more complex concepts involving clinical interpretation and decision-making or even the results of other severity prediction tools. The MRI score included the Glasgow coma score, judgment on underlying ultimately or rapidly fatal illness, simplified acute physiology score, acute organ system failure, and ineffective initial antimicrobial treatment. The modified ATS score had major criteria of requirement for mechanical ventilation or septic shock, and the IDSA/ATS 2007 score included receipt of invasive mechanical ventilation and septic shock and the need for vasopressors. These models were therefore considered separately.
The results from the included derivation and validation studies are shown in Table
Results of derivation and validation studies for pneumonia severity scores.
Paper | Score | Patients | Age | % male | Lost to followup | Results |
---|---|---|---|---|---|---|
BTS 1987 [ | British Thoracic Society Score 1, 2, 3 | 511 patients | 48.4 | 60.5 | 28 lost to followup | Derivation: |
Farr et al. 1991 [ | British Thoracic Society Score 1, 2, 3 | 245 patients | 58.9 | 55 | None | Validation: |
Leroy et al. 1996 [ | Mortality risk index | 460 patients, 335 derivation, 125 validation | 62.5 | 64.3 | None | Derivation: 62% sensitivity, 92% specificity, 74% PPV |
Neill et al. 1996 [ | CURB | 255 patients | 58 | 55 | 6 patients, no consent was obtained | Derivation: |
Fine et al. 1997 [ | Pneumonia severity index | 14199 derivation, 38039 validation | NA | 51 | None | Derivation: PSI area ROC 0.84 |
Lim et al. 2003 [ | CURB-65, CRB-65 | 1068 patients | 64 | 51.5 | None | Derivation: |
Ewig et al. 2004 [ | Modified American Thoracic Society Rule | 696 patients | 67.8 | 66 | 21 patients had treatment setting not documented and were excluded | Validation |
Myint et al. 2006 [ | SOAR | 195 patients | 77 (median) | 57 | None | Derivation: |
Myint et al. 2007 [ | CURB age | 189 patients | 75 (median) | 56.1 | None | Derivation: |
Escobar et al. 2008 [ | Abbreviated Fine Score | 11030 and 6147 patients | 71.3 | 51.2 | None | Derivation: |
Shindo et al. 2008 [ | A-DROP | 371 patients | 75 | 59.9 | 42 (lack data) | Validation: |
Myint et al. | CURB-age | 190 patients | 76 (median) | 53 | None | Validation full cohort: |
Myint et al. | CURSI, CURASI | 118 | 75 (median) | 51.7 | None | Only 1 patient died during hospital stay and the patient was scored severe by CURSI, CURASI, and CURB-65 |
Rello et al. 2009 [ | PIRO score | 529 patients | NA | NA | None | Derivation: |
Liapikou et al. 2009 [ | IDSA/ATS 2007 | 2391 patients | 66.7 | 61.4 | 289 missing data | Validation: |
Uchiyama et al. 2010 [ | PARB | 243 patients | NA | NA | None | Derivation: |
Myint et al. 2010 [ | CURSI, CURASI | 190 patients | 76 (median) | 53 | None | Validation full cohort: |
Musonda et al. 2011 [ | CARSI, CARASI | 190 patients | 76 (median) | 53 | None | Derivation: |
URB: urea, respiratory rate, blood pressure; CRB: confusion, respiratory rate, blood pressure; COUW: confusion, oxygen, urea, white cell count; PPV: positive predictive value; NPV: negative predictive value.
Four scores (BTS 1, CRB-65, CARSI, and CARASI) [
Nine prognostic models (BTS2, BTS3, CURB, CURB-65, A-DROP, CURB-age, SOAR, CURSI, CURASI) [
Four models (PSI, AFSS, PIRO, and PARB) [
Three models (MRI, mATS, and IDSA/ATS 2007) [
The comparative performance of the risk prediction models according to number of prognostic variables is summarized graphically in Figure
Balanced accuracy and area under ROC of pneumonia severity scores versus number of variables.
Sensitivity and specificity of pneumonia severity scores versus number of variables.
Our review systematically evaluates and summarizes 20 risk prediction models for mortality in patients with pneumonia, including the variables required to calculate each score, so that clinicians and policy makers (such as guideline committees and health services researchers) can make informed choices about ease of use and comparative predictive ability. In these times of uncertainty in the health economy, the number and type of variables required for calculation need to be weighed against the outright performance. Here, the ease of implementation, efficient resource utilization, and availability/simplicity of testing within the healthcare setting (e.g., community centre, emergency department, or intensive care unit) may represent influential factors in determining the suitability of a particular model.
We found that most of the published models (irrespective of complexity) yielded fairly similar performance with regard to balanced accuracy and AUC. While there may be some statistical differences in AUC, these may have only limited consequences when clinicians are making treatment decisions in individual patients. For instance, in Chalmers' meta-analysis, the respective AUCs indicate that the probability of PSI correctly discriminating between patients of differing severity was 0.82, whilst the corresponding figure for CURB-65 was 0.79. We have deliberately chosen to emphasize overall performance here with balanced accuracy or AUROC because, while certain models may have demonstrably superior sensitivity, others had better specificity, illustrating the inevitable trade-off between sensitivity and specificity. The choice of an appropriate model will therefore depend on whether healthcare teams place greater weight on sensitivity or specificity. Given the small differences between certain scoring systems, clinicians may equally prefer either to adopt pragmatically the simplest model (appropriate to their healthcare setting) or to opt for the best established and most widely validated systems.
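The reading of AUC as a discrimination probability can be made concrete with a small Python sketch. It computes the proportion of (non-survivor, survivor) pairs in which the non-survivor has the higher score, counting ties as half. The function name `concordance_auc` and the two lists of scores are hypothetical and invented purely for illustration; they are not data from any included study.

```python
from itertools import product


def concordance_auc(scores_died, scores_survived):
    """AUC as the probability that a randomly chosen non-survivor has a
    higher severity score than a randomly chosen survivor (ties count 0.5)."""
    pairs = list(product(scores_died, scores_survived))
    if not pairs:
        raise ValueError("both groups must contain at least one patient")
    concordant = sum(
        1.0 if died > survived else 0.5 if died == survived else 0.0
        for died, survived in pairs
    )
    return concordant / len(pairs)


# Hypothetical severity scores on a 0-5 scale (illustrative only).
died = [3, 4, 2, 5]
survived = [0, 1, 2, 1, 3, 0]
auc = concordance_auc(died, survived)
print(round(auc, 3))
```

On this reading, an AUC of 0.82 versus 0.79 means the two scores would rank a random non-survivor above a random survivor in 82 versus 79 of every 100 such pairs.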
We presented results for both balanced accuracy and AUROC to allow comparison of the performance of each score. Balanced accuracy takes both sensitivity and specificity into account. While the AUROC is a better measure of predictive performance than balanced accuracy, several studies reported sensitivity and specificity rather than the AUROC.
The majority of the studies were conducted in hospital settings, but one study included both inpatients and outpatients and two studies were conducted in intensive care settings. The PSI was studied in both inpatient and outpatient settings, which is an advantage because its findings can be generalised to both of these settings [
Our systematic review also identified some key gaps in the existing research. One particular issue is the lack of validation data for several models. Given the diversity of patient populations and the heterogeneity seen in the meta-analyses of PSI and CURB-65, there is no guarantee that a model that performs well in one setting will do equally well in a different setting. It would be very helpful if the profusion of recently proposed models (often based only on data from a single centre) could be compared directly against older versions in a large multicentre international cohort.
The existing studies do not report on the acceptability, uptake, and clinical impact of risk prediction tools in the routine clinical management of patients with pneumonia. Perry et al. conducted a survey of emergency physicians’ requirements for clinical decision rules for acute respiratory illnesses [
While the performance of a prediction rule is a major criterion for comparative superiority, simplicity is a very important determinant of potential clinical application. A survey conducted in Australia found that only 12% of respiratory physicians and 35% of emergency physicians reported using the PSI always or frequently even though it is recommended by the Australasian Therapeutic Guidelines [
Our review has a number of strengths. We conducted a systematic search to cover all scores including those that are established as well as those that have yet to be validated. Also, there was no restriction of the country of score origin and we were able to capture the scores from around the world. Our review also has a number of limitations, including difficulty in finding exact search terms to pick up this type of study. We only included initial derivation and first validation studies for the scores identified. Some of the scoring systems do not appear to have been validated yet. Here, there is a definite possibility of publication bias where studies showing the most favorable predictive ability were likely to be accepted for publication sooner than equivocal or less impressive data. In order to reduce the possibility of such bias, we were able to include two systematic reviews [
Since there already exist established models (CRB-65, CURB-65, and PSI) with reasonable to good discriminative ability across a wide range of settings and only small incremental differences between these and newer scores, further research should mainly focus on why patients get misclassified and whether we can identify important variables within them to improve the sensitivity of current models. Equally, the uptake of risk prediction models in routine clinical practice and any relationship with improved patient outcomes need to be rigorously assessed, perhaps through cluster-randomized controlled trials of different care pathways. These future trials should test whether clinical decisions based on pneumonia scores are associated with better patient outcomes compared with clinical decisions based on clinical judgment. Scores should also be tested in developing countries, as pneumonia mortality is high in these regions. Eventually, the goal should be to clarify the entire pathway for community-acquired pneumonia management and the role of risk prediction models at each stage: in the community, at the emergency department, on hospital wards, and in intensive care.
Although there are a multitude of proposed risk prediction models, few have undergone proper validation, and no convincing evidence exists that the overall discriminative ability improves upon the well-established CURB-65 and PSI models. Future research should thus focus on randomized trials to test if clinical decision rules using existing risk prediction models and guided treatment pathways can significantly improve pneumonia outcomes.
The authors declare there is no conflict of interests.
Chun Shing Kwok, Yoon K. Loke, and Phyo Kyaw Myint conceptualized the review and developed the protocol. Chun Shing Kwok, Yoon K. Loke, Kenneth Woo, and Phyo Kyaw Myint selected studies and abstracted the data. Chun Shing Kwok and Yoon K. Loke carried out the synthesis of the data and wrote the paper with critical input from Phyo Kyaw Myint. Yoon K. Loke acts as guarantor for the paper.