Psychometric Evaluation of the Parkinson's Disease Activities of Daily Living Scale

Objective To evaluate a set of psychometric properties (i.e., data completeness, targeting, and external construct validity) of the Parkinson's disease Activities of Daily Living Scale (PADLS) in people with Parkinson's disease (PD). Specific attention was paid to the association between PADLS and PD severity, according to the Hoehn & Yahr (H&Y) staging. Methods The sample included 251 persons with PD (mean age 70 [SD 9] years). Data collection comprised a self-administered postal survey, structured interviews, and clinical assessments at home visits. Results Data completeness was 99.6% and the mean PADLS score was 2.1. Floor and ceiling effects were 22% and 2%, respectively. PADLS scores were more strongly associated (rs > 0.5) with perceived functional independence, ADL dependency, walking difficulties, and self-rated PD severity than with variables such as PD duration and cognitive function (rs < 0.5). PADLS scores differed across H&Y stages (Kruskal-Wallis test, p < 0.001). Those in H&Y stages IV-V had more ADL disability than those in stage III (Mann–Whitney U test, p < 0.001), whereas there were no significant differences between the other stages. Conclusion PADLS revealed excellent data completeness, acceptable targeting, and external construct validity. It seems to be well suited as a rough estimate of ADL disability in people with PD.


Introduction
The ability to perform activities of daily living (ADL) is essential for independent living. ADL includes activities such as feeding, dressing, bathing, cooking, cleaning, and shopping. People with Parkinson's disease (PD) often experience ADL limitations already early during the disease course [1]. Poor ADL performance may result in dependence and is negatively associated with health-related quality of life [2]. Thus, adequate assessments of ADL are important to be able to monitor ADL performance throughout the course of disease in order to provide optimal treatment, care, and rehabilitation for people with PD.
The Parkinson's disease Activities of Daily Living Scale (PADLS) is a single-item self-reported rating scale targeting ADL in people with PD [10]. Since its development in 2001, the PADLS has been used in various PD studies; see, for example, [11-15]. To the best of our knowledge, its psychometric properties have only been reported in the original publication [10] and in a conference abstract [16]. These studies [10,16] reported test-retest reliability (weighted Kappa 0.70; r 0.89) and external construct validity in terms of associations with, for example, motor symptoms (r 0.65; 0.55), complications of PD therapy (r 0.56), depressive symptoms (r 0.43), PD duration (r 0.39; 0.32), and frequency of social activities (r 0.02-0.44). A recent review listed PADLS as a "suggested" rating scale for assessment of ADL in people with PD, and it was argued that more psychometric studies are needed before PADLS can be classified as "recommended" [3]. The review also noted a lack of studies regarding the association between PADLS and the Hoehn & Yahr staging (H&Y, i.e., a classification of PD severity [17]) and the Unified Parkinson's Disease Rating Scale (UPDRS, which assesses PD signs and symptoms [18]).
Thus, this study aimed to evaluate a set of psychometric properties (i.e., data completeness, targeting, and external construct validity) of PADLS scores in people with PD. Specific attention was paid to the association between PADLS and PD severity according to H&Y.

Materials and Methods
We utilized cross-sectional baseline data of the project "Home and Health in People Ageing with PD." Details regarding the project design and methods have been published elsewhere [19]. The project was conducted in accordance with the Helsinki Declaration and was approved by the Regional Ethical Review Board in Lund, Sweden (number 2012/558). All participants gave their written informed consent.

Participants and Recruitment.
Participants were recruited from three hospitals in Skåne County, Sweden. A detailed flow chart of the recruitment procedure has been published [20]. A total of 653 persons met the inclusion criterion of having had a PD diagnosis (ICD-10: G20.9) for at least one year. Of those, 158 were excluded due to difficulties in understanding or speaking Swedish (n = 10), severe cognitive difficulties (n = 91), or other reasons that made them unable to give informed consent or take part in the majority of the data collection (e.g., hallucinations or a recent stroke, n = 57). Fifty-eight persons were excluded since they lived outside Skåne County. The remaining 437 persons were invited to participate, but 22 of those were unreachable and two had a revised diagnosis. Out of the remaining 413 persons, 157 (38%) declined. Another five persons were excluded since they had not responded to the PADLS by themselves, had not responded within two months of the home visit (part of the data collection), or had extensive missing data. This resulted in a final study sample of 251 participants (mean age 70 [SD 9] years; 39% women). Further participant characteristics are presented in Table 1.
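The recruitment flow described above can be checked with simple arithmetic. The following is a minimal sketch that only restates the counts reported in the text; the variable names are illustrative and not part of the original study materials.

```python
# Recruitment flow, using only the counts reported in the paragraph above.
# Variable names are illustrative assumptions.
eligible = 653                      # met the inclusion criterion
excluded_clinical = 10 + 91 + 57    # language, cognition, other reasons
excluded_residence = 58             # lived outside Skåne County

invited = eligible - excluded_clinical - excluded_residence
remaining = invited - 22 - 2        # unreachable, revised diagnosis
declined = 157
final_sample = remaining - declined - 5   # five further exclusions

# Share of the remaining persons who declined, in whole percent.
decline_pct = round(100 * declined / remaining)

print(invited, remaining, final_sample, decline_pct)
```

The computed values agree with the figures given in the text (437 invited, 413 remaining, 38% declining, and a final sample of 251).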

Data Collection.
Data collection comprised a self-administered postal survey and a subsequent home visit, which included interview-administered questionnaires and questions as well as clinical assessments. The home visits were conducted by two registered occupational therapists who had undergone project-specific training. More details regarding the procedure have been described elsewhere [19].
The self-administered postal survey included the PADLS, a single-item self-reported rating scale that addresses perceived ADL difficulties and dependence on others as well as on assistive devices during various ADL [10]. Respondents are instructed to rate how their PD has affected their day-to-day activities during the past month according to five response categories ranging from 1 (no difficulties with day-to-day activities) to 5 (extreme difficulties with day-to-day activities), but each response option also has a more detailed description. For example, "2: mild difficulties" includes the following description: "Slowness with some aspects of housework, gardening or shopping. Able to dress and manage personal hygiene completely independently but rate is slower." PADLS scores have also been dichotomized into "not needing help from others in daily activities" versus "needing help" (PADLS 1-2 versus 3-5) [21].
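The dichotomization mentioned above (PADLS 1-2 versus 3-5) can be expressed as a small helper. This is an illustrative sketch only; the function name and the returned labels are assumptions and do not come from the PADLS itself or from [21].

```python
def dichotomize_padls(score: int) -> str:
    """Map a PADLS score (1-5) to the two-level grouping described in the text.

    Scores 1-2 -> "not needing help"; scores 3-5 -> "needing help".
    The label strings are illustrative assumptions.
    """
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("PADLS scores range from 1 to 5")
    return "not needing help" if score <= 2 else "needing help"


# Show the grouping for every possible score.
for s in range(1, 6):
    print(s, dichotomize_padls(s))
```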
The structured interview during the home visit included a question on PD duration and three study-specific questions targeting social activities. The latter were self-rated by asking about the frequency of visiting/receiving visits from friends/family (almost never, once or twice a year, once or twice a month, once or twice a week, or every day; scored 1-5, resp.). Moreover, depressive symptoms were self-rated using the Geriatric Depression Scale (GDS-15; possible total score 0-15, higher = worse) [24]. Perceived functional independence was assessed according to an item from the Neuropsychological Aging Inventory (possible item score 0-10, higher = better) [25]. Finally, ADL dependency was assessed using the ADL Staircase [26]. Based on the internationally well-known and widely used Katz' ADL Index [27], the ADL Staircase is a conceptually and theoretically sound instrument supported by research demonstrating reliability and validity [28,29] as well as methodological considerations for use in different populations [30,31]. Dependence in nine ADL items is rated based on a combination of interview and observation (possible total score 0-9, higher = worse) [26].
In addition, the participants rated their overall PD severity as either mild, moderate, or severe (scored 1-3, resp.). Assessments at the home visits were conducted at a time point when the participant reported feeling at their best.

Data Analyses.
Statistical analyses were performed in IBM SPSS Statistics, version 24. Analyses included data completeness, targeting, and external construct validity of the PADLS. Two-tailed p values were used and the level of statistical significance was set to p < 0.05.

Data Completeness.
Data completeness refers to the degree to which a rating scale is completed [33,34] and was calculated as the percentage of participants who responded to the PADLS. A maximum of 10% missing data has been suggested as a limit for acceptable data completeness [35].
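As a minimal sketch of this calculation (not the SPSS procedure used in the study), data completeness is the share of non-missing responses. The function name and the use of `None` to mark a missing response are assumptions made for illustration; the example counts correspond to 250 responders in a sample of 251, and the score values themselves are placeholders.

```python
def data_completeness(responses):
    """Return the percentage of non-missing responses.

    `None` marks a missing response (an illustrative convention).
    """
    answered = sum(r is not None for r in responses)
    return 100 * answered / len(responses)


# 250 completed ratings and one missing rating; the value 1 is a placeholder.
responses = [1] * 250 + [None]
completeness = round(data_completeness(responses), 1)

# At most 10% missing data is the suggested limit for acceptability [35].
acceptable = (100 - completeness) <= 10
print(completeness, acceptable)
```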

Targeting.
Targeting refers to the scale's ability to mirror the levels of the targeted variable (e.g., ADL disability) in the study sample [33]. The mean score of a well-targeted rating scale should be close to the scale's midpoint and scores should span the full range of possible scale scores [34]. Skewness should be within ±1 [34] and floor and ceiling effects should not exceed 15-20% [33,34,36]. That is, no more than 15-20% of the study sample should score 1 (floor) and no more than 15-20% should score 5 (ceiling) on the PADLS.
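The targeting indicators listed above can be sketched as follows. This is an illustration under stated assumptions: the example scores are hypothetical and not study data, the function and key names are invented for the sketch, and skewness is computed as the simple moment coefficient, which may differ slightly from the bias-adjusted estimate reported by statistical packages such as SPSS.

```python
from statistics import mean


def targeting_summary(scores, lowest=1, highest=5):
    """Summarize targeting indicators for a list of rating scale scores."""
    n = len(scores)
    m = mean(scores)
    # Second and third central moments (population form).
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "mean": m,
        "midpoint": (lowest + highest) / 2,
        "full_range_used": min(scores) == lowest and max(scores) == highest,
        "skewness": skewness,
        "floor_pct": 100 * scores.count(lowest) / n,
        "ceiling_pct": 100 * scores.count(highest) / n,
    }


# Hypothetical PADLS scores, for illustration only.
example = [1, 1, 2, 2, 2, 3, 3, 4, 4, 5]
summary = targeting_summary(example)
print(summary)
```

The resulting values would then be compared with the criteria above (mean near the midpoint, skewness within ±1, floor and ceiling effects not exceeding 15-20%).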

External Construct Validity.
External construct validity of a rating scale is supported when scores are more strongly associated with related constructs and more weakly associated with nonrelated constructs [37]. In this study, associations between PADLS and other scores were explored by Spearman's correlation coefficients (rs). The hypotheses were based on clinical reasoning and previous studies regarding associations between ADL and other variables [4-10, 16]. The associations (rs) between PADLS and walking difficulties in daily life, perceived functional independence, self-rated PD severity, ADL dependency, motor symptoms, and complications of PD therapy were anticipated to be >0.5 [4-7, 9, 10, 16]. The associations between PADLS and depressive symptoms as well as general health were anticipated to be around 0.5 [5,7,8,10]. The associations between PADLS and PD duration, age, cognitive function, and frequency of social activities were anticipated to be <0.5 [5,7,8,10,16]. Kruskal-Wallis and Mann-Whitney tests were used to explore whether PADLS scores differed between H&Y stages. H&Y stages IV and V were merged due to few participants in H&Y stage V (n = 6).
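For readers unfamiliar with the rank-based coefficient used here, the following is a minimal pure-Python sketch of Spearman's correlation with average ranks for ties. It is not the SPSS routine used in the study, it covers only the correlation part of the analyses (not the Kruskal-Wallis or Mann-Whitney tests), and the function names and example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: the Pearson correlation of the (average) ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical PADLS scores and ADL dependency scores, for illustration only.
padls = [1, 2, 2, 3, 4, 5]
adl_dependency = [0, 0, 1, 2, 4, 7]
print(round(spearman_rs(padls, adl_dependency), 2))
```

Because single-item ordinal scales such as the PADLS produce many tied scores, the average-rank handling matters more here than it would for continuous measures.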

Results
All but one participant responded to the PADLS, resulting in 99.6% data completeness. The score distribution is presented in Table 2. The mean score was 2.1 and scale scores ranged the full span (i.e., 1-5). Fifty-four participants chose the lowest ("best") response option (22% floor effect) whereas six participants chose the highest ("worst") response option (2% ceiling effect).
PADLS scores correlated >0.5 with walking difficulties in daily life, perceived functional independence, self-rated PD severity, and ADL dependency. The associations between PADLS scores and other studied variables were weaker (Table 3). The Kruskal-Wallis test showed that PADLS scores differed across H&Y stages (p < 0.001). Specifically, those in H&Y stages IV and V had higher PADLS scores than those in H&Y stage III (Mann-Whitney test, p < 0.001), whereas there were no significant differences between the other H&Y stages (Table 4).

Discussion
This study confirmed that the PADLS has satisfactory psychometric properties for use in the PD population. The PADLS revealed excellent data completeness; only one participant left the form blank. This indicates that the scale is easy to understand and perceived as relevant [38] by people with PD. The finding probably reflects the single-item nature of the scale, which might favor data completeness. Targeting was generally acceptable, with small ceiling effects. However, floor effects were above the recommended level [33,34,36]. Small floor and ceiling effects are desirable in order to enable separation of people and detect changes [33]. More than one-fifth scored the lowest possible and reported no difficulties with day-to-day activities. This floor effect may purely mirror the PD severity of the present sample. That is, 85 participants self-rated their PD as mild and 50 were classified as H&Y stage I. With such a high prevalence of mild PD severity it is not surprising that ADL disabilities were also rated as low or nonexistent. On the other hand, a previous study [1] reported that restrictions in ADL are often seen early during the disease. Already at their first visit to a neurological centre, those later diagnosed with PD had more ADL disabilities than healthy age-matched controls [1]. Although the PADLS showed generally satisfactory psychometric properties, its single-item nature makes it a coarse indicator unsuitable as an outcome measure due to the uncertainty associated with such scores. Notably, floor effects indicate that it is especially important to complement this scale with a more detailed ADL assessment for people with mild PD symptoms. External construct validity of the PADLS was generally supported by largely expected associations with other variables. However, the associations with motor symptoms and complications of PD therapy were lower than expected.
In comparison to the present sample, the participant characteristics in previous studies [5,6,9,10,16] show no clear patterns that could explain our relatively weak association between, for example, ADL disability and motor symptoms. However, previous studies have shown large variations in association between motor symptoms and ADL disability, and our findings are within the range of previous studies [5-7, 9, 10, 16].
Indeed, varying results in previous studies imply a challenge when stating a priori hypotheses, which is essential for the exploration of a scale's external construct validity [37]. We do find our hypothesis of the association between ADL disabilities and motor symptoms reasonable, as several items in UPDRS part III (i.e., motor symptoms) do capture disabilities. It should be kept in mind that ADL is a complex phenomenon, affected by environmental characteristics and prerequisites as well as by the use of various assistive devices. Moreover, existing ADL assessments cover different activities, and the majority do not take individual or subgroup specific activity preferences and patterns into consideration. All considered, ADL rating scales take different aspects into account and include different activities, making the definition of a priori hypotheses a delicate matter.
Although PADLS scores increased with increasing H&Y stages, the differences across these stages were small. The finding that PADLS scores differed only between H&Y stage III versus stages IV and V is not surprising. That is, the definition of H&Y stage III states that "patients are still physically capable of leading independent lives" whereas stages IV-V define a "severely disabling" PD [17].

Limitations.
Data completeness might have been affected by the study design. That is, our data collectors were instructed to screen all self-administered ratings at the subsequent home visit. In case of missing values, the data collectors were instructed to ask the participants to add responses. However, another psychometric study based on the same data collection did report missing values that were close to those collected in another sample not using the same procedure [39].
One could argue that using polychoric and polyserial correlation coefficients would have been a theoretically better choice than rs when studying the external construct validity of the PADLS. This is because rs and r tend to attenuate estimated correlations between ordinal data [40,41], such as the PADLS. Since previous studies did not use polychoric or polyserial correlations [4-10, 16], we used methods similar to those used before in order to enhance comparability with previous studies. However, reanalyses using polychoric correlations yielded generally somewhat stronger coefficients, as expected (data available on request). In addition, this study does not cover all psychometric aspects. For example, test-retest reliability was not evaluated.
Our decision to use the ADL Staircase [26] and not the UPDRS part II [18] to study external construct validity in terms of ADL deserves a comment since the latter is commonly used in PD research. Given the complexity of the target phenomenon (i.e., ADL), our ambition was to use data collected with an ADL rating scale based on conceptual and theoretical underpinnings. As such, ADL is not disease specific but a generic human phenomenon. Although UPDRS part II is recommended as a disability instrument [3], it contains items that are not conceptually related to the ADL construct [42]. Recently the Movement Disorder Society task force documented this as a major drawback as "the 13 items of the UPDRS-ADL do not all assess disability, with 7 of 13 assessing impairments, not functional status (speech, salivation, swallowing, falling, freezing, tremor, and sensory complaints)" [3]. While we consider choosing the ADL Staircase a methodological strength, it should be kept in mind that the comparability with PD studies that used UPDRS part II is limited. It needs to be emphasized that the PADLS is a single-item rating scale and as such, it only gives a rough, global estimate of a person's ADL disabilities. Still, it should be noted that although the abbreviated response categories are phrased, for example, "mild difficulties in day-to-day activities," the PADLS captures perceived difficulties as well as dependence on others and on assistive devices during ADL performance. Moreover, it includes both personal and instrumental ADL. As such, the PADLS can be useful in clinical practice as well as in research for purposes of providing a crude categorization of levels of ADL disabilities. However, a more comprehensive rating scale is needed if a more thorough ADL assessment is warranted, and especially so for those with less severe ADL disabilities.
This is in agreement with recommendations from the developers of the PADLS, who stated that the scale is not suitable for use in isolation but should be considered a complement to existing scales [10].

Conclusions
We found the self-reported, single-item rating scale PADLS to yield scores with excellent data completeness, acceptable targeting, and external construct validity. The PADLS seems to be well suited for providing a rough indicator of ADL disability in people with PD. As psychometric testing is a continuous process, further studies focusing on additional aspects, such as responsiveness, minimal important difference, and response category functioning, are warranted.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.