Item Selection for a New Health-Related Quality of Life Measure for Parkinson's Disease: The Preference-Based Parkinson's Disease Index (PB-PDI)

Background Parkinson's disease (PD) is a neurodegenerative condition, predominantly affecting older adults. Preference-based measures (PBMs) can be used to make decisions about the cost-utility of different treatments. There are currently no PBMs for health-related quality of life (HRQoL) for PD. A previous study identified important health domains for individuals with PD and developed an item pool from existing measures per domain. The current study aims to contribute to the development of a new disease-specific PBM of HRQoL for PD by reducing the current pool of items according to the preferences of individuals with PD. Methods Fifty-three participants completed a visual analogue scale (VAS) of self-perceived health, the prototype PBM measure, and an item importance rating. To reduce the item pool, the following were calculated: (1) inter-item correlations; (2) impact of each item based on item performance and importance rating; (3) directionality of response options by comparing the VAS scores against each item. Results Participants (male = 54.7%, age = 60.0 ± 10.2) had a median Hoehn and Yahr score of 2.5 (interquartile range = 1). Items supported for inclusion by this analysis were sleep, fatigue, tremor, mood, walking, memory, and dexterity. Items demonstrating a logical decrease in VAS score with each increasing severity level were sleep, memory, tremor, fatigue, and mood. Conclusion This PBM will be critical for informing decisions about the cost-utility of PD treatments, guiding the resource allocation within our healthcare system. Future research will include cognitive debriefing with individuals with PD to refine item response options.


Background
Parkinson's disease (PD) is a progressive neurological condition that primarily impacts motor function [1]. An estimated 9.4 million people worldwide have PD [2], including 100,000 people in Canada [3]. As there is no cure for PD, individuals must undergo life-long treatment including pharmacological, therapeutic, or surgical interventions [4], placing a financial burden on the healthcare system and the individual [4,5]. As such, the need for standardized tools to assess the cost-effectiveness of different interventions is paramount [6].
A common method for the economic evaluation of interventions is cost-utility analysis, where the incremental cost of an intervention is compared to its incremental health improvement expressed in quality-adjusted life years (QALYs). Closely tied to QALYs are quality of life (QoL) and health-related quality of life (HRQoL). Quality of life refers to an "individual's perception of their position in life in the context of the culture in which they live and in relation to their goals, expectations, standards, and concerns" [7], while HRQoL refers to the "aspects of self-perceived well-being that are related to or affected by the presence of disease or treatment" [8].
Preference-based measures (PBMs) are used to calculate QALYs and assess the cost-utility of interventions [9]. They attach a value to each domain of health [10] and have the advantage of producing a single number (anchored from 0 to 1) as the final score, balancing gains in one domain against losses in another [10]. The EuroQOL 5-dimension 5-level (EQ-5D-5L) and Health Utilities Index Mark 3 (HUI III) are PBMs that can be used for cost-utility analyses; however, these are generic measures and may not optimally reflect the health concerns of people with PD [6]. Specifically, the importance assigned to the various QoL areas is based on the public's opinions and not on those of people with PD, who may have very different priorities. Together, this may make generic PBMs unresponsive to change among individuals with PD [6,11]. Currently, no PBM of HRQoL exists for PD [11].
Previous studies have identified the health domains that are important to individuals with PD, as well as twelve prototype items that are most reflective of these domains (i.e., sleep, fatigue, tremor, mood, walking, memory, dexterity, urine control, concentration, speech, freezing, and swallowing) [12,13]. The twelve prototype items were then refined in an item writing workshop consisting of a panel of experts [14]. The next step in the development of a PBM is to select the most relevant items for inclusion in the measure. Preference-based measures are unique in that they rarely include more than eight items, with one item per dimension [15]. This is because one item per dimension can generate many health states, calculated as the number of response options to the power of the number of items. Furthermore, the number of health states is limited by how many states participants can address during preference elicitation tasks [15]. As such, the purpose of this study is to assess and select items for inclusion in a PBM for PD, contributing to the development of a new multidimensional disease-specific PBM of HRQoL for people with PD: the Preference-Based Parkinson's Disease Index (PB-PDI).
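The health-state count described above is a simple power calculation. As a minimal sketch in Python (the function name is illustrative and does not come from the study), three response options per item, as used in the prototype PB-PDI, give the following counts:

```python
def count_health_states(n_levels: int, n_items: int) -> int:
    """Number of distinct health states when every item has n_levels response options."""
    return n_levels ** n_items


# Three response options per item, as in the prototype PB-PDI.
for n_items in (7, 8, 12):
    print(n_items, count_health_states(3, n_items))
# prints:
# 7 2187
# 8 6561
# 12 531441
```

Keeping all twelve prototype items would imply more than half a million states, which illustrates why the item pool has to be reduced before preference elicitation.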

Study Design.
This was a cross-sectional study with primary data collection.

Sample.
Participants were recruited from the Quebec Parkinson Network (QPN) registry (n = 1,041). The research team contacted potential participants by telephone. If individuals were interested, a link to an online survey, which included the consent form and the study survey, was provided via email. The online survey was available in both English and French. The sampling strategy was purposeful to ensure that the participants contacted were diverse in terms of age, disease severity, and sex. Bilingual individuals (n = 472) were extracted from the registry and split by sex (males n = 330, females n = 142). Within each sex category, individuals were split into age groups using 10-year increments. Within each age group, persons with varying disease severity were contacted. Of the seventy-four individuals who consented to be emailed the survey link, fifty-three completed it (i.e., a 71.6% response rate).

Outcome Measures.
The following outcome measures were administered through an online survey format using LimeSurvey software and were available in English and French.

Demographic Questionnaire.
The demographic questionnaire asked about participants' sex, age, language fluency, geographical location, education, marital status, living situation, employment status, household income, year of diagnosis, presence of chronic conditions, and whether the survey was completed independently or with help.

Self-Assessment Parkinson's Disease Disability Scale (SPDDS).
The SPDDS is a disease-specific questionnaire consisting of 24 items asking about the individual's performance of activities of daily living [16]. Participants are asked to rate the degree of difficulty they have performing each of the identified activities in general, without the use of a mobility aid. Each item can be answered on a 5-point Likert scale from 1 = "able to do alone and without difficulty" to 5 = "unable to do at all." The total score ranges from 24 to 120, with higher scores suggesting a higher severity of impairment.

Preference-Based Parkinson's Disease Index (PB-PDI) and Importance Rating.
The development of the PB-PDI is described in the following section. There were twelve items in the PB-PDI, which asked about participants' sleep, tremors, memory, urine control, mood, fatigue, swallowing, walking, concentration, speech, dexterity, and freezing due to PD during the last two-week period (Supplementary Table 1). Participants answered on an ordinal response option scale ranging from 1 to 3, with 3 being the worst health state. Following each PB-PDI item, participants were asked to rate how important that item was to their QoL on a 5-point Likert scale from 1 = unimportant to 5 = extremely important.

Visual Analogue Scale (VAS) of Self-Perceived Health.
Participants were asked to evaluate their general health state by using a visual analogue scale (VAS) from 0 = "worst imaginable health state" to 100 = "best imaginable health state."

The Hoehn and Yahr Scale (H&Y).
This scale is used to categorize the severity of PD from I (mild symptoms on one side of the body) to V (full helplessness). It has a moderate to significant inter-rater reliability, with nonweighted and weighted kappa scores ranging between 0.44 and 0.71 [17]. Further psychometric testing of the H&Y scale is limited [17]. Participants' H&Y stages were extracted from the QPN registry and only used during the purposeful sampling.

Development of the PB-PDI
Item Generation.
Item generation was not within the scope of this paper; the prototype items had already been developed before the start of this project. In previous work, people with PD were asked to identify the most important domains of their lives that were affected by PD [12]. Next, items relevant to each domain were collected from existing measures to form an item pool. One item per domain was selected using Rasch analysis [13], and an item writing workshop was conducted to further refine the items and translate them into French [14].

Item Selection.
Item selection for the PB-PDI was the purpose of the current study. Items were selected for inclusion in the PB-PDI based on the following statistical and clinical criteria: (1) a high cross-product of importance and prevalence (i.e., impact score); (2) inter-item correlations below 0.7 to reduce redundancy [18]; and (3) evidence from the literature that the health domain is responsive to treatment or is a side effect of it.

Ordering of Response Options.
Participants rated their level of functioning on the ordinal response options of the PB-PDI, which were scored from 1 to 3, with 3 being the worst health state. To assess the logical ordering of the response options with respect to the health rating, the score of each item was compared to the VAS. A logical ordering of the response options would be indicated by VAS scores that decrease as response option severity increases from 1 (least severe) to 3 (most severe).
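The ordering check described above can be sketched in a few lines of Python. This is an illustration only: the function names and the example responses are hypothetical, and the study itself assessed directionality descriptively from the mean (SD) VAS per response option rather than with code like this.

```python
from statistics import mean


def mean_vas_by_level(levels, vas_scores):
    """Group VAS scores by an item's response level (1-3) and return the mean per level."""
    groups = {}
    for level, vas in zip(levels, vas_scores):
        groups.setdefault(level, []).append(vas)
    return {level: mean(values) for level, values in sorted(groups.items())}


def is_logically_ordered(means_by_level):
    """True when the mean VAS strictly decreases as the response level gets more severe."""
    ordered = [means_by_level[level] for level in sorted(means_by_level)]
    return all(earlier > later for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical responses for one item (not study data).
levels = [1, 1, 2, 2, 3, 3]
vas_scores = [80, 70, 60, 50, 40, 30]

means = mean_vas_by_level(levels, vas_scores)
print(means, is_logically_ordered(means))
```

An item whose mean VAS rises between two adjacent levels would return False and would be a candidate for rewording of its response options.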

Statistical Analysis.
Descriptive summary statistics were presented as means and standard deviations (SD) for continuous variables. Medians and interquartile ranges (IQR), or percentages where applicable, were presented for categorical data. The distribution of the sample over each response level (i.e., levels 1–3) on the PB-PDI and the distribution of responses over importance rating levels (i.e., not important to extremely important) were calculated as percentages.
The mean impact score of each item on the PB-PDI was calculated as the product of the prevalence of participants who were affected by each health domain (i.e., those scoring a level 2 or higher on the PB-PDI) and the mean importance rating for that item [19]: impact = (% scoring level 2 or higher) × (mean importance rating among those scoring level 2 or higher) / 100. Similar exploratory analyses were conducted by disease duration (i.e., >5 years and ≤5 years since diagnosis) [20] to confirm the selection of items. Next, inter-item correlations were calculated using polychoric correlations. Lastly, the directionality of response options was evaluated by calculating, for each response option, the mean (SD) values on the VAS. All statistical tests were conducted using Stata 16.0 (StataCorp LLC, College Station, Texas).
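To make the impact formula concrete, the following is a minimal Python sketch. The function name and the ten example responses are hypothetical; the study's own calculations were done in Stata, and this is not the authors' code.

```python
from statistics import mean


def impact_score(levels, importance):
    """Impact = (% scoring level 2 or higher) x (mean importance among those respondents) / 100."""
    # Importance ratings (1-5) of respondents affected by the domain (level 2 or 3).
    affected = [imp for level, imp in zip(levels, importance) if level >= 2]
    if not affected:
        return 0.0  # nobody affected, so the item has no impact
    prevalence_pct = 100 * len(affected) / len(levels)
    return prevalence_pct * mean(affected) / 100


# Hypothetical data for one item: 8 of 10 respondents score level 2 or higher.
levels = [1, 2, 3, 2, 1, 2, 3, 2, 2, 2]
importance = [3, 5, 4, 4, 2, 5, 4, 4, 5, 5]

print(round(impact_score(levels, importance), 2))  # 80% prevalence x mean importance 4.5 / 100
```

Because the importance rating is bounded between 1 and 5, the score ranges from 0 to 5, and items that are both common and highly rated rank at the top.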

Sample Size.
To conduct correlation analyses (α = 0.05, β = 0.2, and an expected r = 0.7 between the items [i.e., the correlation cut-off to reduce redundancy]), the required sample size would be n = 13. To conduct impact score analyses, the parameter of interest is the proportion of people with PD who rate items as very important or higher. With 50 participants, the 95% confidence interval around proportions of 50% would be ±13%. As such, we aimed to recruit a minimum of 50 participants.

Results
Table 1 outlines the characteristics of the entire sample (n = 53). Participants were predominantly male (54.7%), and the mean age of the sample was 60.0 years (SD = 10.2), ranging from 36 to 79 years. The majority of the sample had a postsecondary education (92.5%) and a chronic condition in addition to PD (62.3%). The sample had a mean SPDDS score of 36.7 (SD = 11.8), a mean VAS score of 66.6 (SD = 21.6), a median H&Y score of 2.5 (IQR = 1), and a mean of 8.3 years (SD = 6.1) since their PD diagnosis.
Participants answered at all item response levels (Supplementary Table 1). In terms of the distribution of responses over the importance rating levels, most participants rated all items as very important (response level 4) or extremely important (response level 5) to their overall QoL (Supplementary Table 2). Eighty-seven percent of the sample identified walking as very or extremely important, followed by sleep (83.0%), memory (79.2%), and fatigue (79.2%). Table 2 presents the mean impact score for the whole sample. Items with the highest prevalence were sleep (81.1%), tremor (79.2%), fatigue (77.4%), dexterity (64.2%), mood (62.3%), and memory (60.4%). The impact score was calculated for each item and used to rank the items from highest to lowest. The six items with the highest impact scores were sleep (3.5), fatigue (3.2), tremor (3.2), mood (2.6), walking (2.4), and memory (2.4). Table 3 reports the exploratory subgroup analysis by disease duration. The exploratory analysis supported the same top six items of sleep, fatigue, tremor, mood, walking, and memory for males and for those with >5 years since their PD diagnosis. For those with ≤5 years since their PD diagnosis, dexterity was ranked slightly higher than walking. Items that consistently demonstrated the lowest impact scores for the whole sample and for most subgroups were swallowing, freezing, speech, and concentration. These items qualified for exclusion from the PB-PDI.
The polychoric correlation matrix for the PB-PDI items showed that concentration correlated highly with both memory and fatigue (r = 0.7), indicating redundancy between these items (Supplementary Table 3). As a result, the item on concentration qualified for exclusion from the PB-PDI because it was highly correlated with two other items and had the lower impact score.
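The redundancy rule applied above (when two items correlate at the cut-off, the one with the lower impact score becomes the candidate for exclusion) can be sketched as follows. This is a hedged illustration: the function name is hypothetical, the correlations are taken as given rather than estimated (polychoric estimation is not shown), treating r = 0.7 as meeting the cut-off is an assumption based on the criterion of correlations below 0.7, and the memory-fatigue correlation and the concentration impact score are placeholder values that are not reported in the text.

```python
def redundant_items(correlations, impact, threshold=0.7):
    """For each item pair at or above the threshold, flag the item with the lower impact score."""
    flagged = set()
    for (item_a, item_b), r in correlations.items():
        if abs(r) >= threshold:
            # On a tie in impact, the second item of the pair is flagged.
            flagged.add(item_a if impact[item_a] < impact[item_b] else item_b)
    return flagged


correlations = {
    ("concentration", "memory"): 0.7,   # reported in the text
    ("concentration", "fatigue"): 0.7,  # reported in the text
    ("memory", "fatigue"): 0.4,         # hypothetical placeholder
}
impact = {
    "memory": 2.4,         # reported in the text
    "fatigue": 3.2,        # reported in the text
    "concentration": 1.5,  # hypothetical placeholder
}

print(redundant_items(correlations, impact))  # {'concentration'}
```

Flagging only the lower-impact item of each pair preserves the domains that are both prevalent and important while still removing overlap.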
For the last item inclusion criterion, systematic reviews specific to the PD population were consulted when available. The criterion was fulfilled by all the items, as the literature suggested that all selected health domains were responsive to treatment, whether therapeutic, pharmacological, or surgical [21][22][23][24][25][26][27][28][29][30]. Table 4 presents the mean VAS rating for each item's response options. The items that demonstrated a logical decrease in VAS score with each increasing severity level were sleep, memory, tremors, freezing, fatigue, concentration, and mood. Items that did not demonstrate a logical decrease in VAS score with each increasing severity level were walking, dexterity, urine control, swallowing, and speech.

Discussion
The overall purpose of this study was to select and assess items for inclusion in a new multidimensional disease-specific PBM of HRQoL for people with PD: the PB-PDI. Previously, our research group identified the health domains that are important to individuals with PD [12], determined prototype items to capture these domains [13], and refined the prototype items based on the opinions of an expert panel [14]. The result was twelve items proposed for consideration in the new PBM. The current study identified seven items for inclusion in the final PB-PDI, which were sleep, fatigue, tremor, mood, walking, memory, and dexterity.
The first criterion for item selection was a high ranking of items based on impact scores, calculated as the product of item prevalence and importance rating. The impact method was chosen as it is an established way of item reduction [19]. Relative to psychometric methods such as factor analysis, the impact method ensures that health domains that are both important and prevalent among patients are represented by items that might otherwise have been excluded [19]. A majority of participants rated all items as very important or extremely important, and the prevalence of each item ranged from 22.6% to 81.1%. This resulted in the highest impact scores for sleep, fatigue, tremor, mood, walking, and memory. Items that demonstrated the lowest impact scores, both for the whole sample and by subgroup, were swallowing, freezing, speech, and concentration. The second criterion for item selection was removing items with inter-item correlations above 0.7. This criterion, commonly used together with the impact method [31], was chosen to ensure that there is no redundancy between the items, assisting with further item reduction. Concentration correlated highly with memory and fatigue; therefore, it was excluded given that it had a lower impact score.
Lastly, items had to have evidence that the health domain was responsive to treatment or was a side effect of it. All domains were evidenced to be responsive to pharmacological, therapeutic, or surgical treatment or a combination [21][22][23][24][25][26][27][28][29][30]. For example, sleep issues can be managed pharmacologically using modafinil [22] or therapeutically using exercise [23]. Similarly, walking can be managed pharmacologically using dopamine agonists [25] and therapeutically with exercise [24]. Health domains were included if they were also potential side effects of treatments, such as fatigue [32].
In descriptively assessing the directionality of response options for the seven selected items, the items demonstrating a logical decrease in VAS score with each increasing severity level were sleep, memory, tremor, fatigue, and mood. Of those items, tremor and mood demonstrated near equal spacing between the response options (i.e., potential linearity). The selected items that did not demonstrate appropriate directionality of response options, walking and dexterity, will undergo cognitive debriefing with individuals with PD.
The current study had several strengths and limitations. First, using the QPN participant registry provided access to a comprehensive list of participants. However, participants' information in this registry, such as the H&Y scores, could have been outdated. As such, disease duration was used instead of H&Y scores for the subgroup analysis, as these data were recently collected. Furthermore, information regarding the racial makeup of the sample was not collected. Second, the current study methodology employed an online survey. A strength of this survey is that it was offered in both English and French to accommodate a larger demographic. However, the online format could have discouraged participation and limited responses to participants from a specific socioeconomic status. Third, since participants rated all items very highly, more weight was given to the prevalence of health issues in the calculation of the impact scores. Although the sample was representative with regard to age and sex, it may not have been representative of the full disease spectrum of PD; the sample was highly functioning, as evidenced by the health descriptors in Table 1. As such, impact scores are representative of a primarily high-functioning group of individuals with PD. This could have led to the exclusion of items that are more relevant at later stages of the disease (e.g., swallowing), limiting the use of the PB-PDI to HRQoL assessments in community-dwelling, higher-functioning individuals with PD. Lastly, the subgroup analysis was not sufficiently powered; however, it allowed for the exploration of items that could have been missed in our overall analysis.
Overall, this study has important implications for addressing the critical need for a disease-specific HRQoL PBM for PD [11]. Generic PBMs of HRQoL have been criticized for not optimally reflecting the health concerns of people with PD [6]. For example, commonly used generic PBMs such as the EQ-5D-5L, HUI II, HUI III, and Short Form 6-Dimensions (SF-6D) are missing items that were nominated as important by people with PD [12]. Consequently, these generic measures may not be as responsive to changes in the PD population [6]. However, the proposed PB-PDI is advantageous as it includes PD-specific items that are absent from generic PBMs, such as sleep, fatigue, tremor, and dexterity. Preference-based measures are becoming increasingly popular given their advantages in assessing the cost-utility of interventions through the calculation of QALYs [9]. QALYs are advantageous in that they combine the quality and quantity of life into a single number which can be compared across interventions. Given that PD involves a variety of lifelong management strategies [4], it is important for researchers and policymakers to make decisions about how to allocate resources accordingly. As generic PBMs do not capture the health domains that are most important to individuals with PD, it is essential to develop a disease-specific PBM to be used in the cost-utility assessments of PD interventions. The current study contributed to the development of such a measure, and the proposed PB-PDI can be used by researchers and policymakers to make decisions about appropriate treatment plans, monitor their effectiveness [9], and evaluate new and emerging treatment options in the PD population.
In conclusion, the results of this study supported the inclusion of seven items in the PB-PDI. The next steps in the development of the PB-PDI include (1) cognitive debriefing with individuals with PD to further refine the selected items; (2) elicitation of preference weights and development of a scoring algorithm; and (3) evaluation of the PB-PDI's construct validity.

Data Availability
All relevant data are presented in the manuscript and supplementary materials.

Additional Points
(i) Preference-based measures (PBMs) can assess different treatments' cost-utility; however, currently no PBM exists for Parkinson's disease (PD). (ii) We assessed items for inclusion in a PBM using impact and correlation analyses, and items on sleep, fatigue, tremor, mood, walking, memory, and dexterity were included. (iii) This PBM includes items specific to PD that are missing from generic measures.

Ethical Approval
Hamilton Integrated Research Ethics Board (#12802) approval has been obtained for this project.

Consent
Informed consent was obtained from all participants prior to initiating the study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
AK conceptualized the study. SM, LZ, NM, AK, MB, and JR performed methodology. SM, NM, and AK performed analysis. SM wrote the original draft. SM, AK, MB, JR, and NM reviewed and edited the manuscript. AK provided funding acquisition and supervised the study. All authors have read and approved the final manuscript for submission.