Assigning Clinical Significance and Symptom Severity Using the Zung Scales: Levels of Misclassification Arising from Confusion between Index and Raw Scores

Background. The Zung Self-Rating Depression Scale (SDS) and Self-Rating Anxiety Scale (SAS) are two norm-referenced scales commonly used to identify the presence of depression and anxiety in clinical research. Unfortunately, several researchers have mistakenly applied index score criteria to raw scores when assigning clinical significance and symptom severity ratings. This study examined the extent of this problem.
Method. 102 papers published over the six-year period from 2010 to 2015 were used to establish two convenience samples of 60 usages of each Zung scale.
Results. In those papers where cut-off scores were used (i.e., 45/60 for SDS and 40/60 for SAS), up to 51% of SDS and 45% of SAS papers involved the incorrect application of index score criteria to raw scores. Inconsistencies were also noted in the severity ranges and cut-off scores used.
Conclusions. A large percentage of publications involving the Zung SDS and SAS scales are using incorrect criteria for the classification of clinically significant symptoms of depression and anxiety. The most common error, applying index score criteria to raw scores, produces a substantial elevation of the cut-off points for significance. Given the continuing usage of these scales, it is important that these inconsistencies be highlighted and resolved.


Introduction
Of the conditions contributing to the global disability burden of mental illness, anxiety and depression are the most prevalent disorders [1,2]. However, while these are conceptually distinct constructs [3,4], they present as highly comorbid conditions [5,6]. Further, while an absence of positive affect is considered unique to depression, and other specific factors are unique to particular anxiety disorders (e.g., physiological arousal to posttraumatic stress disorder and panic disorder), the presence of a high level of general distress and negative affect is common to both types of disorder [7,8]. For these reasons, researchers and clinicians often concurrently screen for the presence and severity of both disorders using self-report psychometric tools developed for this purpose.
Self-report measures of mental disorders may be criterion-referenced or norm-referenced. Criterion-referenced measures are used to make a diagnosis based on the endorsement of criteria listed in published diagnostic classification systems. Individuals are diagnosed with or without a disorder based upon the presence or absence of these criteria [9,10]. In contrast to criterion-referenced measures, norm-referenced measures compare individuals' test results to those of an appropriate peer or normative group. These scales typically suggest score ranges linked to symptom severity descriptors and have a "clinically significant" total scale score cut-off point beyond which scores are considered indicative of the presence of a disorder (see Table 1). The Zung Self-Rating Depression Scale (SDS) [11] is a commonly utilized norm-referenced scale. The SDS is a 20-item Likert scale covering symptoms that were identified in factor analytic studies of the syndrome of depression [11]. Items tap psychological and physiological symptoms and are rated by respondents according to how each applied to them within the past week, using a 4-point scale ranging from 1 (none, or a little of the time) to 4 (most, or all of the time). The scale has a raw score range of 20 to 80 points. The raw score is then converted to an index score by dividing the raw score by the maximum possible score of 80 and multiplying by 100 [12]. Zung [13] also devised a similar 20-item scale to screen for the presence of clinical anxiety: the Self-Rating Anxiety Scale (SAS). Items tap affective and somatic symptoms selected from the diagnostic criteria listed in the Diagnostic and Statistical Manual of Mental Disorders (DSM-II) current at the time [14]. The scoring structure and index score conversion are similar to those for the SDS.
However, for the SAS, the situation regarding cut-off scores is less clear: Zung [6] noted in an early study that all "normal subjects" returned an SAS index score below 50, but later he set an index score of 45 (raw score = 36) as a cut-off point for clinically significant anxiety [15]. Moreover, score ranges for degrees of severity have not been published in the scientific literature.
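The relationship between raw and index scores can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the index score is the raw score expressed as a percentage of the maximum of 80, rounded half up to a whole number. Published conversion tables should be consulted for actual scoring.

```python
RAW_MIN, RAW_MAX = 20, 80  # 20 items, each rated from 1 to 4


def raw_to_index(raw: int) -> int:
    """Convert a raw total (20-80) to an index score (25-100).

    Assumption: index = raw / 80 * 100, rounded half up.
    """
    if not RAW_MIN <= raw <= RAW_MAX:
        raise ValueError(f"raw score must be between {RAW_MIN} and {RAW_MAX}")
    # Integer arithmetic avoids Python's round-half-to-even behaviour.
    return (raw * 200 + RAW_MAX) // (2 * RAW_MAX)


def index_to_raw(index: int) -> int:
    """Return the raw score nearest to an index score (rounded half up)."""
    if not 25 <= index <= 100:
        raise ValueError("index score must be between 25 and 100")
    return (index * RAW_MAX * 2 + 100) // 200


# Raw equivalents of the index cut-offs of 45 and 50.
for index_cutoff in (45, 50):
    print(index_cutoff, "->", index_to_raw(index_cutoff))
# 45 -> 36
# 50 -> 40
```

Under these assumptions, an index cut-off of 45 corresponds to a raw score of 36, matching the figure quoted above.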
Unfortunately, the literature reveals a number of discrepancies in the way the Zung scales have been used, reported, and interpreted. In particular, several researchers have mistakenly applied index score cut-offs to raw scores in assigning clinical significance and symptom severity ratings (e.g., [76,118,119]). In their Methods sections, these researchers describe the calculation of a total raw score and a "cut-off" score of 50 for morbidity. However, "50" is the index cut-off score set by Zung for the SDS, and this equates to a raw score of 40. Using a raw score cut-off of 50 considerably reduces the proportion of cases classed as clinically significant. Another issue is that some researchers have applied severity range descriptors to the SAS when, as stated above, no such descriptors exist in the literature (e.g., [119]).
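The effect of this error on case counts can be illustrated with a short Python sketch. The raw scores below are hypothetical and are not data from any cited study. The sketch also assumes that a score at or above the cut-off counts as clinically significant.

```python
# Hypothetical SDS raw totals for eight respondents (illustrative only).
raw_scores = [32, 38, 40, 44, 47, 50, 55, 61]

INDEX_CUTOFF = 50                      # Zung's SDS cut-off, an index score
RAW_CUTOFF = INDEX_CUTOFF * 80 // 100  # its raw-score equivalent: 40


def count_cases(scores, cutoff):
    """Count scores at or above the cut-off (inclusive by assumption)."""
    return sum(score >= cutoff for score in scores)


correct = count_cases(raw_scores, RAW_CUTOFF)     # raw scores, raw cut-off
mistaken = count_cases(raw_scores, INDEX_CUTOFF)  # raw scores, index cut-off

print(f"correct:  {correct} of {len(raw_scores)}")   # correct:  6 of 8
print(f"mistaken: {mistaken} of {len(raw_scores)}")  # mistaken: 3 of 8
```

In this made-up sample, applying the index cut-off to raw scores halves the number of respondents classed as clinically significant.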
It is likely that errors in the scoring and interpretation of the Zung scales emanate from two sources that involve a failure to refer to the original publications. One is a reliance on the (erroneous) scale descriptions of other authors. The other is that some clinicians and researchers may have accessed scale information from sourcebooks of psychometric measures where distinctions between index and raw scores are imprecise. For example, both Fischer and Corcoran [120] and Schutte and Malouff [121] fail to clearly specify that recommended cut-off points are based on index and not raw scores. This paper examines the extent to which these scales have been incorrectly interpreted in the literature. Given the scales' continued application, it is important that these inconsistencies in interpretation are highlighted and corrected.

Method
To investigate the extent to which the Zung scales are being wrongly applied, a search of the ProQuest full text database was conducted. Searches were done for each of the six calendar years from 2010 to 2015, using the terms "Zung Depression Scale," "Zung Self-Rating Depression Scale," "Zung Anxiety Scale," "Zung Self-Rating Anxiety Scale," and "Zung Self-Rating Scale." Searches were limited to scholarly articles. For each calendar year, the results were examined in the order presented by the database search engine and, for both the SDS and the SAS, the first ten articles that used that scale to collect new data were selected to form a "convenience" sample indicative of recent use of these scales. Articles reporting on studies using both the SDS and the SAS were included, but theoretical articles and meta-analyses were not. In total, 102 articles were sourced and explored for misinterpretation of both scales. The disciplines covered in the articles were psychiatry (25%), psychology (9%), cardiology (6%), oncology (5%), neurology (5%), and gynecology (5%). The remaining 45% were other medical disciplines.
The results for each paper were recorded against a checklist. Examination initially focused on whether cut-off scores and severity ranges were applied. When this was done, the usage was coded according to the following categories:
(1) Consistent use of raw scores: paper uses raw scores only with cut-off scores and/or severity ranges appropriately modified.
(2) Consistent use of index scores: paper details conversion to index scores and uses index score cut-offs/severity ranges.
(3) Incorrect application: index cut-off scores/severity ranges are specifically applied to raw scores.
(4) Unclear application: paper uses index cut-off/severity ranges without mention of conversion from raw scores; however, there was no conclusive evidence that this was not done.
(5) Not utilized: cut-off scores/severity ranges are not stated or used.
Notes were also taken where the cut-off and severity ranges applied were different from Zung's [12,15] recommendations.
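The five coding categories can be expressed as a small decision function. This is a hypothetical sketch, not the checklist actually used: the names `Usage` and `code_usage` and the two-argument encoding are assumptions made for illustration, and combinations that the five categories do not cover raise an error rather than being guessed at.

```python
from enum import Enum
from typing import Optional


class Usage(Enum):
    CONSISTENT_RAW = 1
    CONSISTENT_INDEX = 2
    INCORRECT = 3
    UNCLEAR = 4
    NOT_UTILIZED = 5


def code_usage(criteria: Optional[str], scores: Optional[str]) -> Usage:
    """Assign a checklist category to one usage of a Zung scale.

    criteria: scale of the cut-off or severity ranges the paper applies
              ("raw", "index", or None if none are stated or used).
    scores:   scale of the scores they are applied to
              ("raw", "index", or None if the paper does not say).
    """
    if criteria is None:
        return Usage.NOT_UTILIZED
    if criteria == "raw" and scores == "raw":
        return Usage.CONSISTENT_RAW
    if criteria == "index" and scores == "index":
        return Usage.CONSISTENT_INDEX
    if criteria == "index" and scores == "raw":
        return Usage.INCORRECT
    if criteria == "index" and scores is None:
        return Usage.UNCLEAR
    raise ValueError("combination not covered by the five categories")


# Index cut-off applied to raw totals: the error this study looks for.
print(code_usage("index", "raw").name)  # INCORRECT
```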

Results
Cut-offs for the presence of a disorder were applied in 45 of the 60 papers where the SDS was used and in 40 of the 60 where the SAS was used. For the SDS, index cut-offs were incorrectly applied to raw scores in 16 (35%) of these 45 papers, with a further 7 (16%) papers in which application was unclear. For the SAS, 8 (20%) of the 40 papers revealed incorrect application, with a further 10 (25%) being unclear (Table 2). As shown in Table 3, the level at which cut-offs were set did not always accord with Zung's recommendations. In particular, alternative norms have been developed for use in Chinese populations with the cut-off for the SDS set at an index score of 53 (raw score, 42) and, for the SAS, at 50 (raw score, 40). Further, one of the SDS papers used the SAS cut-off (index 45, raw 36) and three of the SAS papers used the SDS cut-off (index 50, raw 40). Another three papers used the newly developed SDS cut-offs for a Chinese population but applied these to European samples. Finally, one of the SDS papers set a much higher cut-off index score of 60 (raw 48).
Severity ranges were utilized considerably less often. Specifically, 23 of the 60 SDS papers included them, but in 9 (39%) of these cases index score ranges were incorrectly applied to raw scores, and a further 5 (22%) cases fell into the unclear category. Figures for the SAS followed a similar pattern despite the absence of any official ranges in the scientific literature. Twenty of the 60 SAS papers included such ranges; index score ranges were incorrectly applied to raw scores in 7 (35%) of these cases, and a further 7 (35%) fell into the unclear category (Table 4).
The most common severity range applied to the SAS is based on the recommended cut-off of 45 (index). In index score terms, severity ranges are 45-59 mild to moderate anxiety, 60-74 moderate to severe anxiety, and 75+ severe anxiety. Thirteen of the 20 SAS papers utilizing severity ranges employed the above. A further four used the SDS severity ranges, while two utilized different ranges altogether and the final paper merely specified descriptors without detailing the numerical criteria. The recommended SDS severity ranges were applied in all SDS papers but two, which instead used the "unofficial" SAS ranges detailed above.
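The counts reported in this section put lower and upper bounds on the rate of incorrect application, depending on how many of the unclear papers are in fact incorrect. The following Python sketch reproduces that arithmetic. The function and variable names are illustrative. The percentages are printed to one decimal place, so the SDS cut-off lower bound appears as 35.6%, where the text reports 35%.

```python
# (incorrect, unclear, papers applying the criteria), as reported above.
counts = {
    "SDS cut-offs": (16, 7, 45),
    "SAS cut-offs": (8, 10, 40),
    "SDS severity ranges": (9, 5, 23),
    "SAS severity ranges": (7, 7, 20),
}


def error_bounds(incorrect, unclear, total):
    """Return (lower, upper) percentage bounds on incorrect application."""
    lower = 100 * incorrect / total              # no unclear paper is wrong
    upper = 100 * (incorrect + unclear) / total  # every unclear paper is wrong
    return lower, upper


for label, (incorrect, unclear, total) in counts.items():
    lower, upper = error_bounds(incorrect, unclear, total)
    print(f"{label}: {lower:.1f}% to {upper:.1f}%")
```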

Discussion
This study examined a sample of recent scientific publications for the application of raw scores, index scores, and symptom severity ranges when interpreting total scores on the Zung SDS and SAS. Although the findings were based on a "convenience" rather than a random sample of papers, they provide clear evidence of a significant problem in the application of Zung scales across the literature. On the basis of the papers examined here, confusion between raw and index scores means that when cut-offs are applied to indicate the presence/absence of disorder, they are applied incorrectly in 35-51% of cases for the SDS and 20-45% of cases for the SAS (depending on the proportion of unclear cases that involve incorrect application).
This incorrect application of index score cut-offs to raw scores substantially elevates the score required to be classified in the clinical range: in index terms, from 50 to 63 on the SDS (a raw score of 50 corresponds to 50/80 × 100 = 62.5) and from 45 to 56 on the SAS (a raw score of 45 corresponds to 45/80 × 100 = 56.25). The potential impact on study findings does not need elaboration.
Quite apart from the issue of cut-off scores being incorrectly applied, the inconsistency introduced by the use of two distinct sets of scores to represent the same scale makes cross-study comparisons unnecessarily difficult. (Across the studies in our sample, raw scores were used approximately 40% of the time and index scores on 60% of occasions.) Given that the transformation to index scores achieves no purpose other than to decimalize the maximum score, the simplest solution might be to abolish the use of index scores altogether.
Additionally, some confusion exists between the two Zung scales with SDS cut-offs applied to the SAS and vice versa. The same applies to severity ranges for the two scales, if one accepts that an unofficial scale for the SAS has evolved in the literature. The scientific basis of this scale remains highly questionable.
The Zung scales continue to be widely used and potentially remain a valuable means of screening for the presence of anxiety and depression. However, if scale scores are to be reliably interpreted, it is a matter of some urgency that current confusion regarding scale cut-off and severity ranges is resolved and the application of these scales is standardized in future studies.

Conflicts of Interest
The authors declare that they have no conflicts of interest.