Reliability and Validity of the Czech Version of the Pittsburgh Sleep Quality Index in Patients with Sleep Disorders and Healthy Controls

Objectives Psychometric properties of the Czech version of the Pittsburgh Sleep Quality Index (PSQI-CZ) have been evaluated only in patients with chronic insomnia, and thus, it is unclear whether PSQI-CZ is suitable for use in other clinical and nonclinical populations. This study was aimed at examining the validity and reliability of the PSQI-CZ and at assessing whether the unidimensional or multidimensional scoring of the instrument would be recommended. Methods A total of 524 adult subjects from the Czech population participated in the study. The internal consistency of PSQI was evaluated using Cronbach's alpha. The known-group validity was tested using the Kruskal-Wallis H test to verify the difference between patients with sleep disorders and healthy control sample. For testing the structural validity, a cross-validation approach was used with both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). For EFA, the maximum likelihood method with direct oblimin rotation and parallel analysis was used. Results The internal consistency of PSQI-CZ items was moderate (α = 0.75). Receiver operating characteristic (ROC) curve analysis showed high specificity (0.79) and moderate sensitivity (0.64) using an optimal cut-off score of 10. The EFA revealed a 3-factor structure with factors labelled as “sleep duration and efficiency,” “sleep disturbances and quality,” and “sleep latency.” The CFA showed that the emerged 3-factor model had a partly acceptable fit, which was better than other previously supported models. Conclusions A high cut-off score of 10 is recommended to define poor sleep quality. Given the inconsistency of structural analyses, alternative scoring was not recommended. However, the individual components in addition to a total score should be interpreted when assessing sleep quality. We recommend editing and verifying the PSQI-CZ translation.


Introduction
Disturbed sleep represents one of the most frequent health issues. It has been shown that more than half of the adult population of economically developed countries experience unpleasant sleep disturbance [1]. The functioning of the sleep cycle can be verified by objective methods such as polysomnography or actigraphy. However, when assessing sleep, it is important to take into account the subjectively perceived quality of sleep as well as other variables such as comorbidities and environment. When assessed by objective methods, sleep quality can involve several different parameters including sleep onset latency, sleep duration, sleep efficiency, and the number of awakenings [2]. Disruption, abnormality, or irregularity of some of these measures leads to a decrease in sleep quality. The prevalence of symptoms of difficulty in initiating or maintaining sleep ranges from 10% to 48% in the general population [3]. Poor sleep quality can contribute to absence from work, accidents at the workplace, and increased risk of negative health consequences such as sleep and neuropsychiatric disorders [4].
Although polysomnography is considered the gold standard for measuring sleep quality, the Pittsburgh Sleep Quality Index (PSQI) is the most commonly used subjective measure that assesses important aspects of sleep quality and the presence of symptoms of frequent sleep disorders in both clinical and research settings (see more in Section 2.2). The PSQI has been translated into more than 46 languages. All language versions are managed by Mapi Research Trust and are available subject to compliance with the prescribed conditions of use (research, clinical practice). It is unknown whether the Czech version of the PSQI officially distributed by Mapi Research Trust is an appropriate translation of the instrument. As stated on the PSQI distributors' website (http://eprovide.mapi-trust.org), the listed translations may not have undergone a full linguistic validation process and may require further clarification. Nevertheless, studies of different language versions have demonstrated good internal consistency (Cronbach's alpha coefficient ranging from 0.71 to 0.85) and the appropriateness of using the PSQI in clinical and population studies [3,5-10].
Validity and reliability of the PSQI have been verified by comparisons of healthy control groups with clinical populations of patients with psychiatric disorders [11,12], sleep disorders [8,9,13], or somatic disorders [14,15]. Although studies have shown good validity and reliability of the questionnaire across a varied spectrum of research groups, there is no uniform concept of its structural validity. A recent review pointed out that most structural validation studies had some shortcomings (e.g., an inappropriate sample, omission of the Kaiser-Meyer-Olkin test or Bartlett's test of sphericity, or the absence of one of the factor analysis approaches or of its relevant details). Insufficient or incorrectly chosen statistical methods may then create doubts about the described factor structures in individual research samples [16]. Three model proposals are currently most common. The original single-factor model suggests that a single summed total score best captures the multidimensional nature of sleep disturbance as indexed by the PSQI [11,12]. This model was confirmed by several studies [17,18]. Other models question Buysse et al.'s combination of all seven PSQI components into one factor. Some suggest using 2-factor models (e.g., [5,6,14,19,20,21]). One of the more replicated models proceeds from a study by Magee et al. [19], who suggested the following factors: (1) sleep efficiency, based on the values of two components (sleep duration and habitual sleep efficiency), and (2) perceived sleep quality, based on subjective sleep quality, sleep latency, sleep disturbance, use of sleep medications, and daytime dysfunction [19,22]. Other studies adopt Magee et al.'s model but exclude the use of sleep medications component [20,21]. Others recommend a 3-factor structure, which is based on Cole et al.'s study [12,23,24]. Cole et al. proposed three factors: (1) sleep efficiency (based on sleep duration and habitual sleep efficiency), (2) perceived sleep quality (based on subjective sleep quality, sleep latency, and use of sleep medications), and (3) daily disturbances (based on sleep disturbances and daytime dysfunction) [12]. Although no consensus has been reached, the original unidimensional scoring system and further validation were more recently recommended [1,16].
Although the PSQI is widely used in research and clinical practice in the Czech Republic, psychometric characteristics of its Czech version (PSQI-CZ) have been evaluated only in patients with chronic insomnia [25]. Thus, the study was aimed at examining the known-group and construct validity and reliability (internal consistency) of the PSQI-CZ and at assessing whether the unidimensional or multidimensional scoring of the instrument would be recommended.

[26]. All subjects were examined with the Czech version of the PSQI, which was distributed by Mapi Research Trust. Basic sociodemographic information (age, sex, and diagnosis) was also obtained. Answers were filled out in a paper-and-pencil form among the general population and people with sleep disorders between 2015 and 2018. In the patient group, the diagnostic categories were determined according to ICD-10. The native language of all participants was Czech. We had the data available from a total of 583 adults. We then excluded individuals under 18 and above 80 years old. An incompletely or incorrectly filled out questionnaire was the second exclusion criterion. Finally, we excluded patients with unspecific or combined diagnoses. In total, 59 subjects were excluded. We did not perform any multiple imputations to address the missing values. Of the remaining 524 adult probands who were included in the study, 326 were sleep laboratory patients (patients with sleep disorders (SDis)); the remaining 198 subjects formed the control group (HC). The HC group consisted of volunteers from the Czech population who responded to the invitation to participate in the research and stated that they did not suffer from any sleep or psychiatric disorder, while other somatic disorders were not monitored.
2.2. PSQI. The PSQI was developed by Buysse et al. in 1989. It measures the quantitative and subjective aspects of sleep quality. The PSQI consists of 19 self-rated items and seven clinically derived domains of sleep difficulties in the past month: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medication, and daytime dysfunction. Each of these domains is weighted equally on a 0-3 scale. The seven component scores are summed to yield the total global PSQI score, which ranges between 0 and 21 points. A total PSQI score > 5 denotes worse sleep quality [11], although some studies recommend that a higher cut-off score of 6 [8,9], 7 [18,27], 8 [15], or even 8.5 [13] would increase the PSQI's specificity and lead to a very small decrease in its sensitivity. The questionnaire also includes 5 additional questions that are rated by a bed partner or a roommate. These five questions are used for clinical information only [11].
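The global scoring rule described for the PSQI (seven component scores, each 0-3, summed to a total of 0-21 and compared with a cut-off) can be illustrated with a minimal Python sketch. The function and variable names are hypothetical, and the sketch assumes the seven component scores have already been derived from the 19 items; that derivation is not covered here.

```python
# Minimal sketch of the global PSQI score; names are illustrative assumptions.
COMPONENTS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbances",
    "use_of_sleep_medication",
    "daytime_dysfunction",
)


def global_psqi_score(component_scores):
    """Sum the seven component scores (each 0-3) into a total of 0-21."""
    missing = [name for name in COMPONENTS if name not in component_scores]
    if missing:
        raise ValueError(f"missing components: {missing}")
    for name in COMPONENTS:
        if component_scores[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be scored 0, 1, 2, or 3")
    return sum(component_scores[name] for name in COMPONENTS)


def is_poor_sleeper(total_score, cutoff=5):
    """Apply the original rule: a total score above the cut-off is 'poor'."""
    return total_score > cutoff


example = dict.fromkeys(COMPONENTS, 1)  # hypothetical respondent
total = global_psqi_score(example)
print(total, is_poor_sleeper(total))
```

The `cutoff` argument is left adjustable because the text reports several alternative thresholds; the default reproduces only the original "> 5" rule.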
2.3. Statistical Analysis. PSQI scores were not normally distributed in either the control or the patient sample (Shapiro-Wilk p < 0.01). The known-group validity was tested using the Kruskal-Wallis H test to confirm the difference between the patient and control samples. The effect size was calculated using eta squared (ε²) and evaluated using the following criteria: 0.01 to <0.06 (small effect), 0.06 to <0.14 (moderate effect), and ≥0.14 (large effect). The test characteristics and an optimal cut-off score were calculated and tested using the receiver operating characteristic (ROC) curve [28]; the optimal cut-off value was estimated using two methods: by the position closest to the top-left corner of the curve and by the maximum value of the Youden index [29].
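The two cut-off criteria named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the scores are made-up toy data, the function names are hypothetical, and a subject is assumed to be classified as a patient when the total score is at or above the candidate cut-off.

```python
import math


def roc_points(patient_scores, control_scores):
    """Return (cutoff, sensitivity, specificity) for every observed score.

    Assumption: a score >= cutoff counts as a positive (patient) result.
    """
    cutoffs = sorted(set(patient_scores) | set(control_scores))
    points = []
    for cutoff in cutoffs:
        sens = sum(s >= cutoff for s in patient_scores) / len(patient_scores)
        spec = sum(s < cutoff for s in control_scores) / len(control_scores)
        points.append((cutoff, sens, spec))
    return points


def best_by_youden(points):
    """Cut-off maximizing the Youden index J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


def best_by_top_left(points):
    """Cut-off whose ROC point lies closest to the top-left corner (0, 1)."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))


# Toy, perfectly separable data (hypothetical, not from the study).
patients = [10, 12, 14]
controls = [3, 5, 7]
points = roc_points(patients, controls)
print(best_by_youden(points), best_by_top_left(points))
```

On real, overlapping score distributions the two criteria can select different cut-offs, because the Youden index weights sensitivity and specificity linearly while the top-left distance penalizes large deficits in either one more strongly.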
The internal consistency of the PSQI was tested with Cronbach's alpha [30]. A reliability statistic of at least 0.70 was considered acceptable, values between 0.60 and 0.70 were considered questionable, and values lower than 0.60 were considered inadequate for an internally consistent instrument [31,32]. Independence of the scores from age and sex was tested using basic linear models.
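For readers who want to reproduce the reliability statistic, the standard formula for Cronbach's alpha is α = k/(k − 1) · (1 − Σ var(component) / var(total)), where k is the number of components. The Python sketch below applies it to a respondents-by-components table; the function names and the toy data are hypothetical, and the interpretation bands follow the thresholds stated above.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two components and two respondents")
    # Sample variance of each component across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed total score (must be non-zero).
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Map alpha onto the bands used in the text."""
    if alpha >= 0.70:
        return "acceptable"
    if alpha >= 0.60:
        return "questionable"
    return "inadequate"


# Toy data: three perfectly consistent components (hypothetical).
toy = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(toy), 3), interpret_alpha(0.75))
```

Checking whether dropping a component raises alpha amounts to calling `cronbach_alpha` on the table with that column removed.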
For testing the factor structure, a cross-validation approach was used; i.e., the study sample was randomly divided into two adequately sized subsamples. The first subsample was used for factor identification using exploratory factor analysis (EFA). The Bartlett test of sphericity and the Kaiser-Meyer-Olkin test were used to verify the suitability of the data for the analysis. We used the following criteria for factor extraction: eigenvalues > 1, loadings of items ≥ 0.35 [33], and all selected factors from the real data had to perform better in eigenvalue than factors from the random data. The maximum likelihood method with direct oblimin rotation was used for factor extraction, as we assumed correlation between components. The number of factors retained was estimated using parallel analysis, i.e., a data-driven approach comparing the observed eigenvalues of a correlation matrix with those from the random data [34].
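Two steps of this procedure lend themselves to a short illustration: the random split into two subsamples and the retention rule that combines the eigenvalue > 1 criterion with the parallel-analysis comparison. The Python sketch below is a simplified illustration with hypothetical names and made-up eigenvalues; it takes the observed and random-data eigenvalues as given and does not perform the factor extraction itself.

```python
import random


def split_sample(subject_ids, seed=0):
    """Randomly divide subjects into two halves (EFA half, CFA half)."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    half = len(ids) // 2
    return ids[:half], ids[half:]


def factors_to_retain(observed, random_reference, min_eigenvalue=1.0):
    """Count leading factors whose observed eigenvalue exceeds both the
    fixed threshold and the matching random-data eigenvalue."""
    count = 0
    pairs = zip(sorted(observed, reverse=True),
                sorted(random_reference, reverse=True))
    for obs, rnd in pairs:
        if obs > min_eigenvalue and obs > rnd:
            count += 1
        else:
            break  # stop at the first factor that fails either criterion
    return count


efa_ids, cfa_ids = split_sample(range(524))
# Hypothetical eigenvalues for seven components, for illustration only.
observed = [2.6, 1.4, 1.1, 0.7, 0.5, 0.4, 0.3]
random_reference = [1.3, 1.2, 1.05, 1.0, 0.9, 0.8, 0.75]
print(len(efa_ids), len(cfa_ids), factors_to_retain(observed, random_reference))
```

In practice the random-data reference is usually the mean or an upper percentile of eigenvalues obtained from many simulated data sets of the same size; which variant was used here is not stated in the text.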
The second subsample was then used to test the emerged model and to compare its goodness of fit with that of other published models using confirmatory factor analysis (CFA). Our proposed model was compared with previously published and supported models: the original 1-factor model [11], the 3-factor model first published by Cole et al. [12], and the 2-factor model first published by Magee et al. [19]. To assess model fit, multiple fit indices were used, with the following values considered good: comparative fit index (CFI) at ≥0.95 (or ≥0.90 for acceptable fit), Tucker-Lewis index (TLI) at ≥0.95 (or ≥0.90 for acceptable fit), standardized root mean square residual (SRMR) at ≤0.08, and root mean square error of approximation (RMSEA) at ≤0.05 (or ≤0.08 for adequate fit) along with 90% confidence intervals (90% CI). Lower and statistically nonsignificant chi-squared values (χ²) were also considered to identify better models [35]. To determine the model that best fits our data, all models were compared with each other using the Bayesian information criterion (BIC), χ² difference tests (Δχ²), and RMSEA CI overlap. As in Cole et al. [12], a model was considered better fitted if at least two of the three criteria for significant differences were met; i.e., it had a lower BIC (by at least 10 points), a lower nonoverlapping RMSEA CI, and a significantly different Δχ² where a model with lower χ² was better.
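The "at least two of three criteria" decision rule can be written down compactly. The Python sketch below is an illustration only: the class and function names are hypothetical, the fit values are invented, and the p-value of the χ² difference test is assumed to have been computed beforehand by the CFA software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelFit:
    name: str
    chi2: float
    bic: float
    rmsea_ci: tuple  # (lower, upper) bounds of the 90% CI


def is_better_fit(candidate, reference, delta_chi2_p,
                  alpha=0.05, min_bic_gain=10.0):
    """True if the candidate meets at least two of the three criteria."""
    criteria = [
        # 1. BIC lower by at least 10 points.
        reference.bic - candidate.bic >= min_bic_gain,
        # 2. RMSEA CI lies entirely below the reference CI (no overlap).
        candidate.rmsea_ci[1] < reference.rmsea_ci[0],
        # 3. Significant chi-squared difference with the lower chi-squared.
        delta_chi2_p < alpha and candidate.chi2 < reference.chi2,
    ]
    return sum(criteria) >= 2


# Invented values for two hypothetical models.
model_a = ModelFit("A", chi2=20.0, bic=4000.0, rmsea_ci=(0.03, 0.09))
model_b = ModelFit("B", chi2=45.0, bic=4020.0, rmsea_ci=(0.08, 0.14))
print(is_better_fit(model_a, model_b, delta_chi2_p=0.001))
```

In this toy case model A wins on BIC and on the χ² difference but not on RMSEA, because the two intervals overlap; two criteria out of three are still enough under the stated rule.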

Results
3.1. Descriptive Statistics of the Studied Sample. The details of the subscale scores and total PSQI scores in our subsamples and whole sample are displayed in Table 1. The sleep disorder group included 196 women and 130 men (1.51 woman to man ratio, the significant difference observed, Kruskal-

Reliability: Internal Consistency. We tested the reliability of the PSQI-CZ by estimating the internal consistency of its items using Cronbach's alpha coefficient. The overall internal consistency of PSQI-CZ items was adequate (α = 0.75). Dropping any of the components did not result in a higher internal consistency (Table 2). The internal consistency of the PSQI was higher among patients (α = 0.71) than controls (α = 0.63).
ROC analysis showed high specificity (0.79) and low sensitivity (0.635) at a cut-off score of 10, identified as the point closest to the top-left corner of the curve. When the cut-off was instead identified by the maximum Youden index, the optimal value was 12, with very high specificity (0.94) and very low sensitivity (0.50). The originally recommended cut-off score of 5 was highly unspecific (Table 3, Figure 1). The total area under the curve (AUC) was 0.80.
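The disagreement between the two criteria can be checked directly from the sensitivity and specificity values reported above. The short Python sketch below uses only those two reported operating points (other cut-offs from Table 3 are not available here), so it illustrates why the criteria diverge rather than reproducing the full ROC analysis; the variable and function names are hypothetical.

```python
import math

# Reported operating points: cut-off -> (sensitivity, specificity).
reported = {10: (0.635, 0.79), 12: (0.50, 0.94)}


def youden_index(sens, spec):
    """Youden index J = sensitivity + specificity - 1."""
    return sens + spec - 1


def distance_to_top_left(sens, spec):
    """Euclidean distance from the ROC point to the ideal corner (0, 1)."""
    return math.hypot(1 - sens, 1 - spec)


for cutoff, (sens, spec) in reported.items():
    print(cutoff,
          round(youden_index(sens, spec), 3),
          round(distance_to_top_left(sens, spec), 3))

by_youden = max(reported, key=lambda c: youden_index(*reported[c]))
by_distance = min(reported, key=lambda c: distance_to_top_left(*reported[c]))
print(by_youden, by_distance)
```

The Youden index is slightly larger at 12 (0.44 versus 0.425), whereas the distance to the top-left corner is smaller at 10 (about 0.42 versus 0.50), because the distance criterion penalizes the low sensitivity at 12 more heavily.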

3.4. Exploratory Factor Analysis. We tested structural validity using a cross-validation approach with both exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Prior to analysis, PSQI components were tested for sphericity using Bartlett's test (χ² = 464.45, p < 0.01) and for sampling adequacy with the Kaiser-Meyer-Olkin test (KMO = 0.72). It was thus appropriate to proceed with EFA. Using EFA, a 3-factor model was identified (Table 4) by data-driven parallel analysis, with sums of squared loadings (eigenvalues) of 1.38, 1.38, and 1.10. The first factor, explaining 19.85% of variance, was termed sleep duration and efficiency, with the highest loading in sleep duration. The second factor was termed sleep disturbances and quality, with the highest loading in sleep disturbances followed by the subjective sleep quality and daytime dysfunction components. The third factor was labelled sleep latency and was loaded by the sleep latency and sleep medication use components. All components were included, as none of the loadings fell below the minimum critical value of 0.35. The whole model was able to describe 55.11% of the variability. The correlations between factors were moderate (r = 0.33-0.54) [42].

Confirmatory Factor Analysis. To cross-validate our 3-factor solution, the second half of our test sample was used for CFA. CFA was also performed on the original 1-factor model [11] and two established 3- and 2-factor models as in Cole et al. [12] and Magee et al. [19], respectively. The goodness-of-fit indices for all selected models were computed and are shown in Table 5. The goodness-of-fit statistics for our proposed 3-factor solution and for Cole et al. (Figure 2).

Discussion
To the best of the authors' knowledge, this is the first study to examine the psychometric characteristics of the Czech version of the PSQI in various study samples (patients with sleep disorders and healthy volunteers). Our results demonstrate that the global internal consistency of the PSQI-CZ is lower (α = 0.75) than in the original study (α = 0.83) [11]. Given the characteristics of our sample, it is relevant that studies working with patients with sleep disorders show similar [18], lower [25], and higher values of internal consistency [8,9,13]. Our Cronbach's alpha was thus adequate and comparable to other studies that recommend the use of the questionnaire in clinical practice and research. Similar levels of Cronbach's alpha can be found in studies performed in psychiatric patients [7,43], cancer patients [14], the general healthy population [3,24], and adolescents [5,44]. In contrast to other previously published studies [5,6,13,18,27], dropping any of the PSQI components did not result in a higher internal consistency in our research sample. Likewise, and in contrast to our findings, some studies exclude one or more PSQI components (e.g., daytime dysfunction, sleep medications use) as a result of factor analyses [5,6,20,21,27,45-47]. Our findings, however, allowed for keeping all components, which was also shown in previous studies [11,12,19,22-24,48]. The differences in results may be attributed to diversity in sample characteristics. Our study included the general healthy population as well as patients with sleep disorders, which is in contrast to other studies validating the PSQI in specific populations such as centenarians [27], adolescents [5], pregnant women [6], and psychiatric patients [7,43].
The PSQI factor structure is a controversial research topic, as the widely used original one-factor model may not be satisfactory in all populations. In the present study, we used a cross-validation approach, first with EFA and then with a series of CFAs including the most frequently published structures [11,12,19]. Our factor analyses did not yield entirely consistent results. Our exploratory factor analysis revealed the same 3-factor structure as in the Peruvian sample of college students in Gelaye et al.'s study [22]. Our structure was different from the original 1-factor structure [11] and other commonly proposed structures [12,19]. The 3-factor model in Peru explained approximately 59% of the total variance [22], and ours comparably 55% of the variability. A confirmatory factor analysis verified our emerged structure but showed only a partly acceptable fit for our model. We found a similarly acceptable fit for a model from Cole et al. [12]. However, when we compared Cole et al.'s model with our model, our model resulted in a significantly better fit. The present findings thus do not confirm the previously found support for Cole et al.'s structure in a Czech insomnia sample [25]. The discrepancy with other studies can be attributed to differences in studied populations, diverse sample characteristics, and nonuniform methodologies (e.g., factor rotation and extraction methods, estimation method selection), and it highlights the inconsistency of the structural validity of the PSQI across varied clinical and nonclinical populations [1,27].
Together, our data point to limited usability of changing the factor structure or developing alternative scoring of the instrument. Based on the present findings, it is recommended that somnologists and other professionals should not solely rely on the overall PSQI score describing sleep quality. Instead, they ought to look at all components or at least at the components with consistently high loadings (i.e., sleep duration, subjective sleep quality, and sleep disturbances).
In line with other studies [8,13], our results showed that the patient group had a significantly higher total score of PSQI-CZ than general controls. The difference between these groups was confirmed by a large effect size. Our findings point to an unexpected result of a high value of 10 for an optimal cut-off score, or 12 when using the maximum Youden index value criterion. We recommend using a cut-off score of 10 based on its clinical relevance, i.e., the best ratio between sensitivity (0.64) and specificity (0.79) in comparison to score 12 based on the Youden index with high specificity (0.95) but mediocre sensitivity (0.50). The traditional cut-off score (>5) has previously been reported to be insufficient to distinguish between healthy and diseased subjects, and higher cut-off scores have been proposed [13,15,49]. However, to the authors' knowledge, no other study proposed such a high cut-off score. Gomes et al. reported that a cut-off of 5 was optimal for detecting self-reported poor/good sleepers in nonclinical settings, whereas to discriminate nonclinical from clinical sleep patients, the optimal cut-off was >7 [18,27]. Given the high average total PSQI score in our HC group, it is thus possible that the group included individuals who had undiagnosed or untreated sleep disorders. The absence of a disease does not mean that the person sleeps well, nor, conversely, does a certain diagnosis mean that the patient sleeps subjectively poorly [50]. Moreover, it can be assumed that people who entered the study as healthy controls may have a greater degree of self-observation and interest in health. A higher level of self-observation of various changes, differences, and symptoms can then be reflected in a higher PSQI score. High values in the overall PSQI score can also be explained, especially for young adults, by the influence of social factors such as demands during university studies [51], loneliness [52], interest in sports activities [53], or the action of blue light when using electronic devices [54].
Figure 2: Confirmatory factor analysis (CFA) of our 3-factor solution of the PSQI. Ovals represent factors; rectangles represent seven components of the sleep quality subscales. Numbers next to rectangles denote standardized path coefficients, whereas numbers next to the factors represent factor correlations.
BioMed Research International
Our study had several limitations. Firstly, the results of the correlations suggest that there may be a translation discrepancy in question number one of the PSQI-CZ. Respondents might have confused going to bed (lying down) with falling asleep when answering the first question of the PSQI-CZ. It would be worthwhile to make a linguistic adjustment of the Czech version and verify whether it changes the psychometric outcomes of the PSQI. Secondly, as subjects in our control group were considered healthy based on their self-assessment, the potential inclusion of persons with undiagnosed sleep disorders in the control group is a further limitation of our study. Nevertheless, we consider the findings important for three reasons. Primarily, our study is the first to map the statistical properties of the Czech version of the PSQI on a relatively large research sample that included both healthy controls and patients with sleep disorders. Secondly, the higher cut-off found for this translation is important information for clinical practice. And finally, our data demonstrated a 3-factor structure of the Czech PSQI that was not found useful for establishing an alternative scoring system.

Conclusion
For the current official Czech translation of the PSQI, a cut-off score higher than 10 is recommended to define poor sleep quality. Furthermore, not only the total score but also the results of the individual components should be taken into account. We suggest creating a PSQI-CZ with a modified question to verify respondents' understanding of the meaning of the questions. Further studies on the psychometric properties of the PSQI-CZ in various research samples (e.g., general population, somatic disorders), including test-retest reliability and verification of a modified translated version, would strengthen our understanding of the potential benefits and limitations of the PSQI-CZ in clinical and research practice in the Czech Republic.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
The local institutional review boards approved the study (Ethics Committee of the General University Hospital Prague, No. 1774/15D; Ethics Committee of the National Institute of Mental Health, Klecany, Czechia, No. 170/16).

Conflicts of Interest
The authors declare that there is no conflict of interest.