Assessment of the Psychometric Properties of the Danish VISA-P

Purpose The objective of the current study was to conduct a rigorous assessment of the psychometric properties of the Victorian Institute of Sports Assessment-patellar tendinopathy (VISA-P). Methods Rasch analysis, confirmatory factor analysis (CFA), and multivariable linear regression were used to assess the psychometric properties of the VISA-P questionnaire in 184 Danish patients with patellar tendinopathy who had symptoms ranging from under 3 months to over 1 year. A group of 100 healthy Danish persons was included as a reference for known-group validation. Results The analyses revealed that the 8-item VISA-P did not fit a unidimensional model, yielded at best a 3-factor model, and exhibited differential item functioning (DIF) across healthy subjects versus people with patellar tendinopathy. Conclusion VISA-P in its present form does not satisfy a measurement model and is not a robust scale for measuring patellar tendinopathy. A new PROM for patellar tendinopathy should be developed and appropriately validated, and meanwhile, simple pain scoring (e.g., numeric rating scales) and functional tests are suggested as more appropriate outcome measures for studies of patellar tendinopathy.


Introduction
Patellar tendinopathy is a common injury that afflicts both elite and recreational athletes [1,2], with an estimated prevalence as high as 45% in some sports [1]. Therefore, ongoing research examines how to provide optimal treatment. The Victorian Institute of Sports Assessment-patellar tendinopathy questionnaire (VISA-P) [3] is the most widely used and the only condition-specific patient-reported outcome measure (PROM) for studies of patellar tendinopathy [4,5]. A recently published consensus statement from the International Scientific Tendinopathy Symposium (ICON 2019) recommends the use of condition-specific PROMs such as VISA [6]. Moreover, PROMs are increasingly used as confirmatory outcomes in clinical trials and even prioritized over clinical and functional measures in the statistical testing hierarchy [7,8]. Hence, to arrive at sound conclusions in clinical trials, a fundamental requirement is to confirm that the PROM exhibits adequate measurement properties for the targeted patient population and the specific trial conditions.
PROMs are important because a person's perception of a pathology, rather than the pathology itself, can be assessed [9]. Data derived from PROMs involve assigning numerical values to the response options to PROM questions (items). The (weighted or unweighted) scores from the individual items are then summed to an overall (total) score that, assuming unidimensionality, provides a proxy measure of the construct of interest [10]. Thus, in the case of the VISA-P, the lower the total score, the greater the perceived level of functional disability. Whether the PROM actually measures what it claims to measure depends on (1) the relevance and coverage of the items for the patient group being assessed (content validity) in relation to the construct of interest [11] and (2) whether the total response scores to those items satisfy the basic criteria of measurement (construct validity) [12][13][14][15]. When PROM data are congruent with (i.e., "fit") statistical measurement models such as item response theory (IRT) models or confirmatory factor analysis (CFA) models, it follows that the PROM possesses adequate psychometric properties [16][17][18].
The VISA-P, a legacy PROM, was developed over 20 years ago. It has been reported to be a valid and reliable outcome measure for patients with patellar tendinopathy [3,19]. However, these analyses were conducted only using classical test theory (CTT) methods, which today are considered insufficient to reach such a conclusion [20]. Since the original publication [3], VISA-P has been translated into several languages, including Danish, and has been reported to have acceptable cross-cultural validity [7,21]. However, a recent thorough analysis found the content validity to be very poor [4]. The VISA-P was developed without input from patients with respect to the item generation and item reduction process [3]. Nor was there a description of the methodological background in PROM development of the clinicians who developed the items, or of how and why they chose to include the eight items (and thus exclude other potential items). This process does not satisfy the general principles of establishing content validity, which requires face-to-face cognitive interviews with the targeted patients to confirm the relevance, coverage, and comprehensibility of the items and response options [7,14]. Furthermore, some items mix themes that address different domains, some response options are problematic, and a "yes" or "no" response is likely more appropriate for some questions than the 11 possible response options on the 0-10 rating scale. Moreover, the measurement properties of VISA-P have never been assessed using robust statistical analytic methods [7].
The VISA-P has been used as an outcome measure in numerous clinical trials. Therefore, the aim of this study was to conduct a rigorous analysis of the psychometric properties of the Danish version of VISA-P in a cohort of patients with patellar tendinopathy and healthy controls.

Materials and Methods
The original version of the VISA-P questionnaire was published in 1998 [3]. It consists of 8 items with a maximum score of 100 for an asymptomatic, fully performing individual, with lower scores indicating more symptoms and greater limitation of function and activity. Item 1 concerns the ability to sit without pain, and the patient is asked to register the number of minutes of pain-free sitting (from 0-100 minutes) on a 0-10 numeric rating scale. Items 2-6 use the same 0-10 numeric rating scale as Item 1, with 11 response options instead of adjective response scales. Item 7 has a 4-category response option structure, which is transformed to a 0-10 rating scale. The result is that items 1-7 can achieve up to 70 points, while item 8 has a maximum score of 30 points, giving the VISA-P a maximum total score of 100. The original validation of VISA-P included only 38 patients and the calculation of a Pearson correlation between the VISA-P scores and a pain rating on the Nirschl pain scale [22].
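The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes that each item has already been converted to its numeric score (items 1-7 on 0-10, item 8 on 0-30); the function name `visa_p_total` and the input layout are hypothetical and are not part of the questionnaire or of this study's analysis code.

```python
def visa_p_total(item_scores):
    """Sum eight VISA-P item scores into a 0-100 total.

    Assumption: item_scores holds the already-transformed numeric scores
    for items 1-8, in order. Items 1-7 range 0-10; item 8 ranges 0-30.
    """
    if len(item_scores) != 8:
        raise ValueError("VISA-P has exactly 8 items")
    maxima = [10] * 7 + [30]  # per-item maximum scores as stated in the text
    for number, (score, maximum) in enumerate(zip(item_scores, maxima), start=1):
        if not 0 <= score <= maximum:
            raise ValueError(f"item {number} score {score} outside 0-{maximum}")
    return sum(item_scores)
```

Note that item 8 alone contributes up to 30% of the total, which is one reason a simple sum presupposes that all items measure the same construct.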
In the present study, the psychometric properties of VISA-P were assessed by examining whether there was evidence of fit to an appropriate measurement model, with a specific focus on evidence of unidimensionality and differential item functioning (DIF). Unidimensionality implies that responses to items depend only on a single characteristic (e.g., pain), which means that all items contribute to a single score [20]. DIF is a bias due to different response patterns in specific items between subgroups, such as sex, age group, or injury chronicity [23][24][25][26]. Evidence of DIF is detrimental to scale properties since it can mask real differences or detect differences between subgroups that are not attributable to real differences [25]. DIF is investigated using models that assess whether the items are independent of a list of background variables, conditional on the summary VISA-P score. Unidimensionality is investigated by assessing data fit to measurement models, such as confirmatory factor analysis (CFA) or item response theory (IRT) models [27,28].
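As a purely descriptive illustration of what "conditional on the summary score" means, the Python sketch below compares mean item scores between two subgroups within bands of the total score. It is not the Rasch or regression procedure used in this study; the function name, the band width, and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def stratified_item_gap(records, band_width=10):
    """Mean item-score difference between two groups within total-score bands.

    records: iterable of (group_label, total_score, item_score) tuples
    (hypothetical layout). Bands containing only one group are skipped.
    Returns {band_index: mean(first group) - mean(second group)}, with
    groups ordered alphabetically by label.
    """
    strata = defaultdict(lambda: defaultdict(list))
    for group, total, item in records:
        band = int(total // band_width)  # e.g. totals 90-99 fall in band 9
        strata[band][group].append(item)

    gaps = {}
    for band, by_group in strata.items():
        if len(by_group) == 2:
            first, second = sorted(by_group)
            gaps[band] = mean(by_group[first]) - mean(by_group[second])
    return gaps
```

A gap that persists among people with similar total scores is the kind of pattern that formal DIF tests are designed to detect.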
Rasch analysis, multivariable linear regression, and CFA were used to analyze the psychometric properties of VISA-P. The sample was a cohort of patients with patellar tendinopathy (symptom duration ranging from less than 3 months to more than a year) included in intervention studies at Bispebjerg and Frederiksberg Hospital between 2006 and 2021, and a group of healthy persons. The study participants were males and females, 18 to 55 years of age, with clinical signs of patellar tendinopathy confirmed by ultrasound imaging. All patients were sports-active individuals (recreational or elite) recruited through the sports clinic at Bispebjerg and Frederiksberg Hospital, various local sports clubs, and online advertisement. Data were accessed from previous or ongoing trials at our facility: two studies included participants with symptoms <3 months: Tran et al. [29], ClinicalTrials.gov Identifier: NCT03642392 (ongoing); four studies included participants with symptoms >3 months: Kongsgaard et al. [30], Agergaard et al. [31], Olesen et al. [32], ClinicalTrials.gov Identifier: NCT04550013 (ongoing); and one study included both <3 months and >3 months: ClinicalTrials.gov Identifier: NCT04144946 (ongoing). All original studies were approved by the regional ethics committees, and all person-identifiable data were removed prior to analyses.
The healthy controls were sports-active individuals of both sexes, 18 to 57 years old, recruited through local sports clubs and the University of Copenhagen School of Sport Science, who self-reported no prior symptoms in their patellar tendons and no previous knee surgery. Data were collected with no person-identifiable data; therefore, the participants only gave oral consent, and permission from the regional ethics committees was not required, in accordance with the ethical and scientific standards in Denmark.

Analysis Strategy
Multiple techniques were employed to assess the psychometric properties of VISA-P. First, fit to a Rasch unidimensional measurement model was assessed using Andersen's conditional likelihood ratio test (CLR). Overall fit was investigated by obtaining item-trait interaction chi-square values (a nonsignificant chi-square indicates good fit) [33,34]. Individual item fit was assessed by standardized individual item-person fit residuals (i.e., the difference between observed and expected scores), which approximate a z-score; values between ±2.5 indicate adequate fit to the model [33,34]. As the item response structure of VISA-P consists of polytomous data, a partial credit polytomous model was applied. DIF was assessed using analysis of variance [35] for sex (male/female), age group (±30 yrs), BMI (±25), and symptom duration (≤3 months, 4-12 months, ≥12 months, and no symptoms at all) [25,36]. For the age group DIF analyses, the cut-off of ±30 years of age was chosen because the median age of the sample group was 28 years. This allowed for dichotomization of younger versus older persons for comparison of scoring patterns across the groups. For BMI, the value of ±25 was chosen because this generally corresponds to the BMI cut-off value for being "overweight" [37]. The symptom duration groups were chosen to allow for a comparison of scoring patterns across groups with acute (<3 months), chronic (>3 months), and protracted chronic tendinopathy (>12 months). The Rasch analysis for VISA-P was conducted analogously to a previous study of VISA for Achilles tendinopathy [38]. CFA was also used to assess four separate factor structures of the VISA-P: the original unidimensional structure, a reduced unidimensional scale including only items 2-6 (based on the results from the Rasch analysis), a 2-factor structure (items 1-5 and 6-8), and a 3-factor structure (items 1-3, 4-6, and 7-8). CFA model fit was assessed with the goodness of fit index (GFI) > 0.95; root mean square error of approximation (RMSEA) < 0.06; standardized root mean square residual (SRMR) < 0.06; and the comparative fit index (CFI) > 0.95 [39,40].
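The fit criteria stated above can be expressed as simple threshold checks. The following Python sketch is illustrative only: the cutoffs are those given in the text, but the function names and the dictionary layout are hypothetical, and the indices themselves would come from dedicated software (here RUMM 2030, SAS, and AMOS).

```python
# Cutoffs as stated in the text:
# GFI > 0.95, RMSEA < 0.06, SRMR < 0.06, CFI > 0.95.
CFA_CUTOFFS = {
    "GFI": ("above", 0.95),
    "RMSEA": ("below", 0.06),
    "SRMR": ("below", 0.06),
    "CFI": ("above", 0.95),
}


def check_cfa_fit(indices):
    """Return {index_name: True/False} for whether each cutoff is met."""
    results = {}
    for name, (direction, cutoff) in CFA_CUTOFFS.items():
        value = indices[name]
        if direction == "above":
            results[name] = value > cutoff
        else:
            results[name] = value < cutoff
    return results


def flag_item_misfit(fit_residuals, limit=2.5):
    """Return labels of items whose fit residual lies outside +/- limit."""
    return [item for item, z in fit_residuals.items() if abs(z) > limit]
```

A model would be regarded as well fitting under these criteria only if every entry returned by `check_cfa_fit` is true.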
Lastly, DIF was assessed for the same person characteristics as for the Rasch analyses (sex, age, BMI, and symptom duration) in multivariable regression analyses. These analyses were carried out for all subjects and repeated after removing the healthy subjects.
RUMM 2030 was used for the Rasch analysis [36]. CFA and regression analyses were carried out with SAS v9.4. SPSS AMOS v28 was also used for CFA and descriptive statistics.

Results
A total of 184 patients with patellar tendinopathy and 100 healthy participants were included in the present analysis. Table 1 shows the characteristics of people in the sample, the variables used for the DIF analyses, and the VISA-P total scores across the subgroups.

Rasch Analysis.
A ceiling effect was observed for certain items in VISA-P, notably for items 1 and 3. Figure 1 shows the frequency distribution of response scores for items 1 and 3, which reveals a ceiling effect at the item level for patients with patellar tendinopathy, with over half the patients in the cohort essentially unable to improve in status on both these items.

Confirmatory Factor Analysis (CFA).
Consistent with the Rasch results, the CFA rejected a unidimensional scale and indicated that a 3-factor structure with items 1-3, 4-6, and 7-8 in separate dimensions was more viable than the single-factor 8-item scale (albeit not entirely convincing). The most robust CFA model confirmed the results from the Rasch analysis, which showed that a reduced scale consisting of items 2-6 yielded the only plausible unidimensional scale for VISA-P. Table 3 shows the CFA fit indices and Cronbach's alpha (α) for the different suggested scales.
For CFA and multivariable analyses, a cohort of healthy controls was included for assessment of known-groups validity and scaling properties across the spectrum of disease duration. The healthy control participants exhibited even greater ceiling responses at the item level than the symptomatic participants, and were only included in the CFA and multivariable analyses.

Multivariable Regression.
Multivariable regression analyses showed considerable DIF in the original scale for all subjects (including both injured and healthy people), but the observed DIF was driven by the healthy subjects. Table 4 shows the full DIF results. When the healthy subjects were removed from the analysis, DIF only remained for body mass index (BMI) in Item 8 (the asterisks in Table 4), which reveals that in the original 8-item form, VISA-P is not an adequate scale for comparison of healthy people with people with tendinopathy, or, importantly, for monitoring people with tendinopathy trying to become healthy.

Discussion
The main finding of the present study was that stringent psychometric assessment of the Danish version revealed that VISA-P does not satisfy a measurement model, lacks unidimensionality, and exhibits considerable DIF, which was driven by the healthy subjects. Furthermore, a reduced 5-item unidimensional VISA-P scale was supported by the psychometric assessment. Consequently, the use of VISA-P results can be misleading, as responses to different treatments may be overlooked or, in the worst case, might lead to wrong conclusions in clinical trials.
The original validation of VISA-P included only 38 patients and the calculation of a Pearson correlation between the VISA-P scores and a pain rating on the Nirschl pain scale [22], but no assessment of the instrument's psychometric properties was undertaken. The present study included rigorous analyses of the psychometric properties in a broader sample of patients and healthy persons and revealed multiple problems with the VISA-P. Most importantly, it showed a lack of unidimensionality. Hence, when used as a single score in the original 8-item form, VISA-P is not an adequate scale for measuring self-reported impact of patellar tendinopathy.
Technical information regarding dimensionality was not provided in the original study [3]. Only one study has examined the factor structure using CFA, in a sample of Spanish athletes [41]. It reported a relatively acceptable fit of the one-factor solution, although it only assessed measurement invariance across sex, and not across symptom duration or in a healthy population. Furthermore, the authors suggested that items 7 and 8 may not measure the same construct as the remaining 6 items. The present Rasch analysis shows substantial misfit for items 1, 7, and 8, which supports the viability of a reduced one-dimensional scale using only items 2-6, rather than a two-factor model. However, this might come at the price of incomplete content coverage, since physical activity and sports participation would be excluded. On the other hand, it would remove the heavily weighted items 7 and 8 from the total score, which result in an irrelevant score for athletes who continue training with symptoms or for noncompetitive athletes without pain.
The VISA is not considered a diagnostic tool [3]; however, there is a consensus among experts that the VISA can distinguish between healthy persons and those with tendinopathy, and that it is a good measure of symptom severity [42]. Importantly, the present evaluation demonstrated a violation of measurement invariance (DIF) across injured and healthy people, which renders the VISA-P an inadequate scale for comparing healthy versus injured people, or for monitoring injured people trying to recover in intervention studies. It would be more trustworthy to use simple pain scores or clinical tests (until a new PROM has been developed), although we do acknowledge that such measures cannot make up for a rigorously developed PROM.
A recent COSMIN checklist review [43] found very low-quality evidence for the content validity of VISA-P. Furthermore, there is nothing to indicate that VISA-P was based on a theoretical reference model or patient involvement [3], which is necessary to confirm the relevance, coverage, and understandability of the items and response options of a PROM [7,14]. Indeed, content validity is considered to be the most important measurement property of a PROM [15], which already calls the suitability of the questionnaire into question. In addition, our psychometric evaluation of VISA-P using robust methods such as CFA or IRT also showed flawed construct validity. Therefore, the sufficient construct validity and responsiveness of the VISA-P found in the COSMIN review [44], based on the use of exploratory factor analysis and correlation with legacy instruments (criterion validity), cannot be confirmed. Hence, the results of the present study indicate inadequate measurement properties, and we suggest that the VISA-P should not be used or recommended for evaluation of patients with patellar tendinopathy. Moreover, it is paramount that researchers acknowledge the flaws of proposed measurement instruments, to avoid impeding further investigations to improve or develop better instruments. Until a new relevant condition-specific PROM for patellar tendinopathy has been developed, one can consider reanalyzing the existing data using a one-dimensional scale with only items 2-6, which was confirmed adequate in the present paper using CFA and Rasch analysis.

Conclusion
Rigorous psychometric assessment of the Danish version revealed that VISA-P does not satisfy a measurement model, lacks unidimensionality, and exhibits considerable DIF driven by the healthy subjects. A reduced 5-item unidimensional VISA-P scale was supported by the psychometric assessment, although content coverage remains unknown. A new relevant PROM for patellar tendinopathy should be developed and appropriately validated. Meanwhile, simple pain scoring (e.g., numeric rating scales) and functional tests are suggested as more appropriate outcome measures for studies of patellar tendinopathy.

Table 1: Demographic variables for the DIF analyses and the total VISA-P scores.

Table 2: Individual item fit and overall fit to the Rasch model.

Table 4: Multivariable regression analysis of differential item functioning (DIF) for all subjects on the covariates sex, duration of symptoms, body mass index (BMI), and age group. Asterisks (*) indicate that DIF remains (only for Item 8 for BMI) when healthy people are omitted from the analysis.