Validation of a new questionnaire with generic and disease-specific qualities: The McGill COPD Quality of Life Questionnaire

Smita Pakhale MD MSc 1, Sharon Wood-Dauphinee PhD 2,3, Adriana Spahija PhD 3, Jean-Paul Collet MD PhD 4, François Maltais MD 5, Sarah Bernard MSc 5, Marc Baltzan MD MSc 6, Michel Rouleau MD 7, Jean Bourbeau MD MSc 2,8; for the Respiratory Health Network of the FRSQ

1 University of Ottawa, The Ottawa Hospital, The Ottawa Hospital Research Institute, Ottawa, Ontario; 2 Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine; 3 School of Physiotherapy and Occupational Therapy, McGill University, Montreal, Quebec; 4 Centre for Applied Health Research and Evaluation, University of British Columbia, Vancouver, British Columbia; 5 Institut universitaire de cardiologie et de pneumologie de Québec, Université Laval, Québec; 6 Mount-Sinai Hospital, McGill University affiliated teaching hospital, McGill University, Montréal; 7 Centre hospitalier universitaire associé, Université Laval; 8 Respiratory Epidemiology and Clinical Research Unit, Montreal Chest Institute, McGill University Health Centre, McGill University, Montreal, Quebec

Correspondence: Dr Jean Bourbeau, Respiratory Epidemiology and Clinical Research Unit (RECRU), Montreal Chest Institute, 3650 St Urbain Street, Montréal, Québec H2X 2P4. Telephone 514-934-1934 ext 32185, fax 514-843-2083, e-mail jean.bourbeau@mcgill.ca

Chronic obstructive pulmonary disease (COPD), although preventable and treatable, is not curable. The mean (± SE) worldwide prevalence of COPD is 10.1±4.8% (11.8±7.9% for men and 8.5±5.8% for women) (1). It is the fourth leading cause of death following heart disease, cancer and stroke. Moreover, COPD mortality is steadily increasing (2). The severity of COPD is graded into four stages according to the Global initiative for chronic Obstructive Lung Disease (GOLD) (3) criteria. However, severity of airway obstruction, as described by the GOLD categories, does not necessarily translate into subjects’ perceptions of severity of their disease (4-6). An ideal instrument would reflect not only physiological function, but also be closely related to the biopsychosocial consequences of the disease and patients' ability to cope with the demands of daily living. Thus, measuring health-related quality of life (HRQL) is important in both clinical practice and research. Clinicians and policy makers have recognized the importance of measuring this construct to inform patient management and policy decisions (7,8). A cure for COPD is not in sight; hence, improvement in HRQL is the major treatment goal.

Currently, there are two types of HRQL questionnaires: disease specific and generic. Both types have strengths and weaknesses (5). For COPD, they are currently often administered together (9) because no instrument incorporating both generic and disease-specific constructs for use with COPD patients is available. However, the administration of two separate questionnaires is time consuming, expensive, may be irritating to patients and can increase the likelihood of nonparticipation.

Disease-specific quality of life questionnaires commonly used in COPD (10-13) are the Chronic Respiratory Questionnaire (14) and the St George's Respiratory Questionnaire (SGRQ) (15), although many others are available (16). The Chronic Respiratory Questionnaire is an individualized, labour-intensive instrument requiring 20 min to 30 min of staff time for each administration. A self-administered version, with standardized dyspnea items, is now available and takes 5 min to 10 min to complete (17). However, the standardized version has reduced responsiveness compared with the individualized version (18). The SGRQ is the most commonly used instrument and has 76 items (19). However, the majority (80%) of questions in the SGRQ have dichotomous (yes/no) responses and, hence, the instrument may be less sensitive to change. Importantly, the two-point scale is less reliable, less interesting and more ambiguous to respondents (20-22) than scales with more categories. To supplement disease-specific questionnaires and to assess constructs related to generic quality of life, the Medical Outcomes Study Short Form-36 (SF-36) is most commonly used in COPD. It has been translated into more than 80 languages, takes 10 min to complete and the results obtained can be easily compared with existing population norms (23,24) (Table 1).
The McGill COPD Questionnaire (McGill COPD Q) is a newly developed, hybrid and unique instrument that was designed by combining items from a generic questionnaire (the SF-36) and a respiratory disease-specific module (25). This questionnaire was developed as a bilingual (English and French), self-administered instrument using subjects with moderate to severe COPD undergoing pulmonary rehabilitation. The decision to use the SF-36 was due to its brevity, relevance, and availability in many languages and cultures.
The objective of the present study was to evaluate the reliability, validity, responsiveness and clinical interpretation of the new hybrid questionnaire (disease-specific items supplemented with generic items from the SF-36), the McGill COPD Q, in subjects with moderate to severe COPD. It was hypothesized that the McGill COPD Q would behave similarly to the SGRQ, a disease-specific questionnaire.

Psychometric evaluation of the McGill COPD Q
Overview: The present study was embedded in a Quebec cohort of COPD patients participating in a pulmonary rehabilitation program; patients were followed for three years. The pulmonary rehabilitation program included six to eight weeks of a supervised exercise program at an academic hospital. The new hybrid McGill COPD Q combined 12 preselected items from the SF-36 with 17 items from the previously developed COPD-specific module (25). Currently, a continuation of this project uses an instrument to measure HRQL that incorporates both generic and disease-specific constructs for use with COPD patients (25).

Subject selection
Subjects were selected from four participating centres across Quebec. The inclusion criteria were: a clinical diagnosis of COPD; older than 40 years of age; currently or previously smoking, with a smoking history of at least 10 pack-years; forced expiratory volume in 1 s (FEV1) after the use of a bronchodilator <80% of the predicted normal value and FEV1 to forced vital capacity (FVC) ratio <70%; French or English speaking; and willing to consent to participate in the study. Patients were excluded if they had a primary diagnosis of asthma, heart failure requiring treatment, dementia or an unstable psychological condition, or an acute medical condition that was a contraindication to an exercise program. Patients were also required to have completed the baseline evaluation (before rehabilitation) and one evaluation following the completion of the rehabilitation program (within one to two months). The Pulmonary Rehabilitation Program at the Montreal Chest Institute (Montreal, Quebec) is considered to be a state-of-the-art, multidisciplinary program at a leading academic institution (26).

Measurements
Patients' assessments included a complete medical history, pulmonary function tests at rest, cycle endurance testing (CET) on a stationary ergocycle, 6 min walk test (6MWT) and HRQL measured by the generic SF-36, the disease-specific SGRQ and the new McGill COPD Q. Data collected at each respective centre were centralized in one place.

Pulmonary function test
Spirometry and lung volumes were measured at rest according to American Thoracic Society guidelines (27,28). Results were compared with predicted normal values from the European Community for Coal and Steel/European Respiratory Society (29).

CET
For the prerehabilitation evaluation, the CET was performed on an electromagnetically braked cycle ergometer and the workload was set at 80% of peak work capacity achieved during incremental cycle ergometry. Patients were asked to cycle for as long as possible, and no encouragement was provided during the tests to avoid any potential confounding effect on exercise performance (30).

6MWT
The 6MWT was administered in a standardized manner (31) using an elliptical walking course at each participating centre. Two tests were performed with sufficient rest periods between tests (at least 20 min). Results were reported in metres as the best of the two trials.

HRQL
Health status was evaluated using version 2 of the self-administered SF-36, the SGRQ (15) and the new McGill COPD Q. Raw scores from the SF-36 were converted to standardized scores as per the user's manual (32). The final scores from the SF-36 were reported as eight domain scores (0 to 100) and two summary scores: Physical Health and Mental Health. The final scores ranged from 0 to 100, with a mean of 50 and an SD of 10, with higher scores indicating a better quality of life. SGRQ responses were scored using weights, and scores were converted to a percentage ranging from 0 to 100, with higher scores indicating a lower quality of life. For the new McGill COPD Q, a higher score indicated better quality of life.
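As a rough illustration of the two scoring steps described above, the sketch below rescales a raw domain score to a 0 to 100 range and then expresses it on a metric with a mean of 50 and an SD of 10. The function names, the raw score range and the reference mean and SD are hypothetical placeholders chosen for the example; they are not values from the SF-36 user's manual (32).

```python
def rescale_0_100(raw, lowest, highest):
    """Linearly map a raw domain score onto 0-100 (higher = better)."""
    return (raw - lowest) / (highest - lowest) * 100.0


def norm_based_score(score_0_100, norm_mean, norm_sd):
    """Express a 0-100 score relative to a reference population so that
    the reference mean maps to 50 and one reference SD maps to 10 points."""
    z = (score_0_100 - norm_mean) / norm_sd
    return 50.0 + 10.0 * z


# Hypothetical numbers for illustration only (NOT SF-36 reference values).
scaled = rescale_0_100(raw=21, lowest=10, highest=30)
t_score = norm_based_score(scaled, norm_mean=65.0, norm_sd=20.0)
print(round(scaled, 1), round(t_score, 1))
```

A score below the reference mean falls below 50 on the second metric, which is what makes comparison with population norms straightforward.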

Statistical analysis
Results are reported as means and SD. Floor and ceiling values of the items and nonresponse rates were evaluated as percentages. The values of the missing data were imputed based on the mean scores in the given subscale if more than 50% of the items were answered in that subscale; the same method is advised for the SF-36 (32,33). Total percentages of missing values per question and per subject were calculated. Two types of reliability were estimated on the total score and subscales: internal consistency using Cronbach's alpha coefficients (34) and test-retest reliability. The latter was calculated by comparing the consistency of scoring of the new McGill COPD Q administered on two occasions (one to two weeks apart) using one-way ANOVA, with subjects as a random factor to obtain variance estimates and an estimator of the intraclass correlation coefficient (ICC) (35).
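The two reliability estimates named above can be sketched as follows. This is a minimal illustration, not the authors' R code: it assumes complete data, uses the textbook formula for Cronbach's alpha and the one-way random-effects ICC computed from between-subject and within-subject mean squares. Function names and the sample data are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per subject (no missing values)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_one_way(rows):
    """rows: one list of repeated scores per subject (e.g. [test, retest]).
    One-way ANOVA with subjects as the random factor."""
    n, k = len(rows), len(rows[0])
    grand = mean(x for r in rows for x in r)
    ms_between = k * sum((mean(r) - grand) ** 2 for r in rows) / (n - 1)
    ms_within = sum((x - mean(r)) ** 2 for r in rows for x in r) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: five subjects x three items, and four test-retest pairs.
items = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 4, 4]]
retest = [[62, 64], [48, 47], [75, 73], [55, 58]]
print(cronbach_alpha(items), icc_one_way(retest))
```

Both statistics approach 1 as items (or repeated administrations) agree more closely relative to the spread between subjects, which is why a homogeneous sample tends to lower them.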
Convergent and divergent validation processes were used because they are concepts well accepted by experts in the field (8). Responsiveness was quantified using Cohen's effect size (36), which is given as $\text{effect size} = \dfrac{\text{mean}(\text{postrehabilitation score} - \text{baseline score})_{\text{total group}}}{\text{SD}(\text{baseline score})_{\text{total group}}}$. In addition, responsiveness was assessed by comparing the magnitude and the direction of the change in the total McGill COPD Q score with that of the well-established SGRQ (37) after pulmonary rehabilitation. The minimal clinically important difference (MCID) was calculated from the health transition question of the SF-36 (ie, response option 2, "Somewhat better now than one year ago", and option 4, "Somewhat worse now than one year ago"). For statistical analysis, version 2.7.1 of the freeware R (38) was used.
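The effect size defined above (mean change in the total group divided by the SD of the baseline scores in the total group) can be computed as in the following sketch. The function name and the scores are hypothetical, and the sample SD (n − 1 denominator) is an assumption because the text does not say which SD estimator was used.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, post):
    """Mean (post - baseline) change divided by the baseline SD.
    Assumes paired lists in the same subject order and the sample SD."""
    changes = [after - before for before, after in zip(baseline, post)]
    return mean(changes) / stdev(baseline)


# Hypothetical total scores for four subjects before and after rehabilitation.
baseline = [40, 50, 60, 70]
post = [46, 56, 66, 76]
print(round(standardized_effect_size(baseline, post), 2))
```

Because the denominator is the baseline spread, the same mean change yields a larger effect size in a more homogeneous sample.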

Subject characteristics
A total of 246 subjects participated in the pulmonary rehabilitation cohort; 142 completed all of the assessments and questionnaires for the present validation study. Baseline sociodemographic and clinical characteristics of subjects in the validation study are summarized in Table 2. Characteristics of the subjects in the validation study were similar to the characteristics of the subjects of the entire cohort (data not shown).

Missing data, floor and ceiling effect
Total percentages of missing values per question and per subject were 0% to 2%. For each item, nonmissing data were normally distributed; hence, a mean score imputation strategy was used. The percentage of subjects with maximum (ceiling effect) and minimum (floor effect) scores on the McGill COPD Q at baseline for the subscales and the total score are presented in Table 3.
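A minimal sketch of the two calculations reported here, under stated assumptions: missing items are coded as `None`, a subscale is imputed with the mean of its answered items only when more than 50% of the items were answered, and floor and ceiling effects are the percentages of subjects at the lowest and highest possible score. Function names and data are hypothetical.

```python
def impute_subscale(items):
    """Return the subscale items with missing values (None) replaced by the
    mean of the answered items, or None when 50% or fewer were answered."""
    answered = [x for x in items if x is not None]
    if len(answered) * 2 <= len(items):
        return None  # too few answers: the subscale is left unscored
    fill = sum(answered) / len(answered)
    return [fill if x is None else x for x in items]


def floor_ceiling_percent(scores, lowest, highest):
    """Percentage of subjects at the minimum (floor) and maximum (ceiling)."""
    n = len(scores)
    floor = 100.0 * sum(s == lowest for s in scores) / n
    ceiling = 100.0 * sum(s == highest for s in scores) / n
    return floor, ceiling


print(impute_subscale([4, None, 2, 3]))
print(floor_ceiling_percent([0, 50, 100, 100], lowest=0, highest=100))
```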

Reliability
Cronbach's alpha for the subscales of the new McGill COPD Q (symptoms, physical function and feelings) ranged from 0.68 to 0.82. Individual values are presented in Table 3. Forty-eight COPD subjects met the pre-established criteria of disease stability (ie, no COPD-related acute exacerbation between the two administrations). They completed the new McGill COPD Q twice, one to two weeks apart, for test-retest reliability (Table 3). The ICC consistency and ICC agreement yielded the same values for all the subscales and the total score.

Validation
For convergent construct validation, the correlation between the McGill COPD Q scores and the SGRQ total score at baseline is shown in Figure 1. Individual values for the subscale scores are presented in Table 4. The correlation with the physical function subscale of the SF-36 was 0.66 (95% CI 0.56 to 0.74) and with the social function subscale of the SF-36 was 0.61 (95% CI 0.50 to 0.70). For divergent construct validation, the correlation coefficient comparing the McGill COPD Q scores with the pain subscale of the SF-36 was 0.17 (95% CI 0.00 to 0.32).
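Correlations with 95% CIs of this kind are commonly obtained with the Pearson coefficient and the Fisher z transformation; the text does not state which CI method was used, so the following is a sketch under that assumption. It also assumes the correlations were computed on all 142 subjects of the validation study. Function names are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Confidence interval for r via the Fisher z transformation."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit / sqrt(n - 3)
    z = atanh(r)
    return tanh(z - half_width), tanh(z + half_width)


# Assumption: n = 142 (the validation sample size reported in this study).
low, high = fisher_ci(0.66, 142)
print(round(low, 2), round(high, 2))
```

Under these assumptions, r = 0.66 gives an interval of about 0.56 to 0.74 and r = 0.61 gives about 0.50 to 0.70, consistent with the values reported above.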

Responsiveness
After undergoing six to eight weeks of pulmonary rehabilitation, there was improvement in the total mean score of the McGill COPD Q and the SGRQ by six points and by seven points, respectively. Cohen's effect size for the McGill COPD Q and the SGRQ was 0.33 and 0.44, respectively. The MCID calculated from the health transition question of the SF-36 is shown in Table 5. Response option 2, "Somewhat better now than one year ago", yielded results similar to the SGRQ. However, option 4, "Somewhat worse now than one year ago", yielded wide CIs because of fewer subjects.
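One common anchor-based approach, shown here only as a sketch, takes the mean change in questionnaire score among subjects who chose a given transition option as the estimate of the MCID. The text does not spell out the exact calculation used, so this procedure, the function name and the data are assumptions for illustration.

```python
from statistics import mean


def mean_change_by_anchor(records):
    """records: (transition_option, post_score - baseline_score) pairs.
    Returns {option: (mean change, number of subjects)}."""
    groups = {}
    for option, change in records:
        groups.setdefault(option, []).append(change)
    return {opt: (mean(vals), len(vals)) for opt, vals in groups.items()}


# Hypothetical data: option 2 = "somewhat better", option 4 = "somewhat worse".
records = [(2, 8), (2, 6), (2, 10), (3, 0), (4, -3), (4, -5)]
summary = mean_change_by_anchor(records)
print(summary[2], summary[4])
```

Reporting the group size next to each mean makes it visible when an estimate, such as the one for "somewhat worse", rests on few subjects and will therefore have a wide CI.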

DISCUSSION
The new McGill COPD Q, a novel concept of combining questions from the SF-36 with a COPD-specific module (25), demonstrated high internal consistency, reliability and validity in COPD patients. Responsiveness of the questionnaire was similar to that of the SGRQ, although responsiveness has only been tested for pulmonary rehabilitation.
The amount of missing data in our study was very low (≤3%). This compares very favourably with missing data reported when using the SGRQ (up to 23%) (39). Brevity, clarity of language and the 5-point Likert scale for reducing ambiguity could be the reasons for minimal nonresponse.
In the literature, the floor and ceiling effects for individual SF-36 subscales are reported to be quite high in COPD subjects (40). The new McGill COPD Q did not demonstrate this problem; in fact, ceiling and floor effects of less than 5% were present in our study. This is important because if there is a high ceiling (highest possible score on the scale) or floor effect (lowest possible score on the scale), subjects cannot be distinguished from one another because many have the same score. Moreover, these effects reduce reliability because between-subject variability is decreased among subjects with the highest or lowest scores. In addition, responsiveness may be compromised because positive or negative changes cannot be measured in subjects who score at the floor or ceiling on both the pre- and postrehabilitation tests.
As hypothesized, Cronbach's alpha was between 0.7 and 0.9, except for the symptom subscale. Although a lower alpha reflects lower item-to-item correlations, we decided to include all of the items in the symptom subscale (eg, 'During the past four weeks, how often have you coughed?' and 'How often did you bring up phlegm?') to foster high face and content validity for the COPD subjects. Test-retest reliability was generally high (>0.9) except for the symptoms subscale. Interpretation of ICC scores is difficult and varies according to the use of the instrument. Generally speaking, the ICC should be >0.7 to make decisions at the group level and >0.9 at the individual level (41). The low ICC for the symptoms subscale in our study could be due to the homogeneous cohort (ie, the lack of variability among the subjects), although it may also reflect change between the test and retest times.
As hypothesized, convergent validity, which assesses the relationship with a similar construct, was demonstrated when total McGill COPD Q scores were correlated with the SGRQ total scores. Moreover, as anticipated, the correlation between scores on the new questionnaire and those of the pain subscale of the SF-36 was very low. This reflects divergent validity and was to be anticipated because the two scales assess different areas. We also demonstrated that the new questionnaire is a responsive measure. The differences between pre- and postrehabilitation scores of the McGill COPD Q and those of the SGRQ were similar. Another important issue that was addressed is the clinical interpretation of the new questionnaire score. The MCID was calculated using the health transition question of the SF-36 as the anchor. Apparently, chronically ill COPD subjects need fairly large positive changes in HRQL scores to perceive improvement. However, the perception of worsening is reached with smaller changes in such scores. Unlike the SGRQ, with 80% of items scored dichotomously as yes/no, the McGill COPD Q has items scored on a 5-point Likert scale, making it potentially more sensitive to small changes. This property needs further testing in a study specifically designed to assess it.
The effect size for the McGill COPD Q in the present study was small to moderate (42). However, effect size is known to vary from study to study (6). For example, the effect size of the physical component summary scores of the SF-36 was 0.18 for men and 0.07 for women after participation in 12 and 24 weeks of pulmonary rehabilitation in seven outpatient hospital programs from urban and rural settings across North Carolina (USA) (43). The population studied here is homogeneous and the effect size depends on the heterogeneity of the underlying population. Moreover, effect size does not take into account the variability of change. Experts in quality of life research advocate an effect size of 0.2 (minimal effect size) as an appropriate definition of MCID (44). Although the effect size in our study was larger than the MCID advocated by the experts, it requires further investigation and needs to be studied in different settings.
Strengths of the present study include its prospective design with statistical analysis using standard methods, as well as the use of the SGRQ, which has been extensively validated in COPD patients (16). The COPD population in the present study comprised a typical sample of patients commonly encountered in routine clinical practice in North America. Thus, the study results can probably be generalized, at least across North America. Pulmonary rehabilitation is a very effective treatment in patients with moderate to severe COPD and has been shown to improve HRQL (45). This allows us to carefully assess the sensitivity to improvement but less so to deterioration.
There were limitations to our study. The sample was relatively homogeneous, with all subjects having moderate to severe COPD, the majority being ex-smokers with a significant smoking history and a median age of 66 years. The concept of quality of life and its implications on daily life are different for men and women (46). Although we recognize this difference, we could not validate the new questionnaire separately for men and women due to the small sample size. The McGill COPD Q also needs to be studied in other ethnic populations and cultures for cross-cultural validity. In addition, the responsiveness of this new questionnaire to other interventions and comparisons with other quality of life questionnaires also need to be studied.
Internal consistency was somewhat low for the symptom subscale of the new questionnaire. For questionnaires, a Cronbach's alpha of between 0.7 and 0.9 is considered acceptable in the field of quality of life research (41,46). We believe that having a very high alpha may be more of a problem. A Cronbach's alpha >0.9 suggests item redundancy. Cronbach's alpha for two of the three subscales of the SGRQ was >0.95 in a study by Hajiro et al (6), but <0.9 for all three subscales when assessed by Barr et al (39). Finally, we have very little data to estimate the MCID. We compared the McGill COPD Q with the SGRQ, a well-recognized and widely used measure. Although there is no universally accepted approach and no 'gold standard' for determining the MCID, the evaluation should come from multiple perspectives and use different strategies to accumulate evidence.

CONCLUSION
The new McGill COPD Q is reliable, valid and responsive to change in subjects with moderate to severe COPD. It is available in English and French. The McGill COPD Q has only 29 items and, thus, is much shorter than the currently available COPD HRQL tools. Questionnaire brevity is important to reduce respondent burden in COPD patients. Furthermore, the new questionnaire could be used as a stand-alone tool to assess patient-reported outcomes in COPD subjects as opposed to the current practice of using generic and disease-specific questionnaires together. This may prove to be a definite advantage over the currently available COPD questionnaires. The response to the self-evaluated transition question of the SF-36, which measures the patient's point of view, has shown promising results. Further studies are required to refine the measure and complete the evaluation of the measurement properties in various COPD populations (mild disease, aging, different races and sex) and settings (language, interview or self-administered). Another important issue still to be addressed is the meaning of the McGill COPD Q scores (ie, clinical interpretation) or clinically important difference; this is important if the tool is to be used to judge therapy effectiveness.

DISCLAIMER:
The abstract of this article was presented at the 2008 American Thoracic Society Conference, May 16 to 21, Toronto, Ontario.

Figure 1) Correlations of the total McGill COPD Questionnaire score with the total St George's Respiratory Questionnaire (SGRQ) score

Convergent validation refers to the extent to which the new McGill COPD Q scores agree with the results of other instruments believed to be assessing the same attribute. Pearson product-moment correlations between baseline McGill COPD Q and SGRQ scores were calculated. It was hypothesized that the total McGill COPD Q scores would be highly correlated with the SGRQ total scores because they assess a similar construct. Divergent validation refers to the extent to which the new McGill COPD Q scores correlate with those of other instruments assessing a different attribute. Pearson correlations between baseline McGill COPD Q scores and SF-36 subscales were calculated. It was hypothesized that the new McGill COPD Q scores would correlate poorly with the SF-36 pain subscale because they assess different attributes. Responsiveness was assessed by measuring Cohen's effect size.

Table 3 Nonresponse rate, floor and ceiling effect, internal consistency and test-retest reliability of the McGill COPD Questionnaire
*Items from the previously developed chronic obstructive pulmonary disease (COPD)-specific module; † Items from the 36-item Short-Form Health Survey; ‡ Cronbach's alpha; § Intraclass correlation coefficient