University of Southern Denmark Efficiency of a small size screening instrument in identifying children with autism spectrum disorders in a large population of twins



Introduction
Autism spectrum disorders (ASDs) represent the expanded concept of autism. ASD overlaps the category of pervasive developmental disorders (PDDs) in the International Classification of Diseases (ICD-10) [1] and in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) [2]. ASDs comprise autism, Asperger's syndrome (AS), and other atypical forms, such as PDD not otherwise specified (PDDnos) and atypical autism. The core symptom is pervasive impairment in mutual social interaction, and additional deficits reflect either verbal or nonverbal communicative impairments and/or restrictive, repetitive, and stereotyped patterns of behaviour, interests, and activities. A diagnosis of ASD can be made from as early as 2-3 years of age, and the majority of individuals with ASD are identified before adulthood [3]. Population-based prevalence estimates reach around 60-70 per 10,000 [4], though estimates higher than 1% have also been reported [5,6].
In epidemiological studies of ASD, questionnaires are widely used for screening [5,6]. Participation rates in population studies are often low, and there seems to be an inverse relationship between the length of a questionnaire and the response rate in population surveys [7]. To minimize attrition, epidemiologists suggest a two-phase screening procedure, with a first step of a few highly sensitive questions followed by a second step using a more specific and elaborate questionnaire [8]. This paper is based on screening data from a survey of all twins registered in the national Danish Twin Registry (DTR), conducted as part of a study aimed at estimating the heritability of ASD. We conducted the study using a short 5-item scale as a first-phase screen. Item responses were linked to the Danish Psychiatric Central Research Register (DPCRR), which registers all psychiatric diagnoses from the public mental health service program in Denmark.
An increased prevalence of ASD in twins as compared to singletons might implicate twinning as a separate risk factor for developing ASD. Only few data exist on the prevalence of ASD in twins, and some of them are register based.
The prevalence of ASD in twins and the psychometric value of a short 5-item screen questionnaire were evaluated.

Materials and Methods
Denmark has had a national civil registration system since 1 April 1968. As part of this system, a unique personal identification number is assigned to all Danish residents [9], and this identifier is the key to individual information in almost all official registers of the Danish population, which gives the opportunity to link at an individual level across all registers to gain access to research data.

The Danish Twin Registry.
The Danish Twin Registry was established in 1954 and is a nationwide, population-based register of Danish twins born from 1870 [10]. Twins have been ascertained in a variety of ways, but since 1973, the ascertainment has been complete due to linking to the Medical Birth Registry (MBR) [11].

The Danish Psychiatric Central Register.
The Danish Psychiatric Central Research Register (DPCRR) was established as a nationwide computerised register in 1969 and contains person-identifiable information (e.g., psychiatric diagnoses) about all psychiatric hospitalisations in public hospitals in Denmark (including Greenland and the Faeroe Islands) [12]. Since 1995, all out-patient activities (psychiatric outpatient clinics and community psychiatric centres) have been included as well. In Denmark, there are relatively few private child and adolescent psychiatric clinics, which do not report to the DPCRR. Private clinics treat approximately 3,300 children and adolescents on a yearly basis, compared to about 13,000 in the public health care system. Since 1994, the diagnostic classification system has been ICD-10 [1]; before that, ICD-8 was used [13]. All registered diagnoses are approved by a senior psychiatrist. The coverage of ASD diagnoses in the DPCRR is believed to be high [14].
The survey represented a joint initiative between various research groups and the Danish Twin Registry, and we were limited to the number of questions (five) allotted to us in this survey. A questionnaire including 24 questions in total about diabetes, asthma, eczema, epilepsy, and behaviour was mailed to the home address of 24,552 twin individuals for the purpose of studying issues of heritability. For each twin, the parents were asked to rate recent behaviour (time not specified further) on five questions taken from the child behaviour checklist (CBCL) (Table 1), each scored on a three-point scale (0 = not true, 1 = sometimes true, and 2 = often true, total score range 0-10 points).

Table 1: Questions from the survey.
(a) The child exhibits strange behaviour?
(b) The child exhibits a sudden change in mood or feelings?
(c) The child cannot get along with others?
(d) The child cannot sit still and is restless or hyperactive?
(e) The child cannot pay attention for long?
Scored according to 0 = not true, 1 = sometimes true, and 2 = often true; total score range 0-10.

Zygosity was established on the basis of four questions on similarity and mistaken identity, which assign correct zygosity in more than 95% of all same-sexed twin pairs [15].
The CBCL is a well-validated and reliable questionnaire used worldwide for the screening of childhood psychopathology [16,17]. The CBCL has been standardised and validated in Danish population-based and clinically based samples [18,19]. The five selected items were found to be very sensitive to ASD (questions a and b) and ADHD (questions c, d, and e) in a Danish epidemiological study regarding psychopathology in school-age children [20]. In this latter study, children with ASD (n = 11) all scored ≥3 on the five-question scale (unpublished data). In another clinical sample of 74 children (mean age 10.7 years; SD 2.7, range 7-17 years) who were diagnosed with ASD in our department and whose parents had answered the CBCL at referral, the sensitivity of the five questions was 86.5% at cutoff ≥3 (unpublished data).
In order to evaluate the 5-item scale, the scores from the twin survey were linked to the DPCRR. ASD cases were defined as probands who, after clinical assessment, were registered with autism (ICD-8: 299.00 (psychosis proinfantilis) or ICD-10: F84.0), atypical autism (ICD-8: 299.01 (psychosis infantilis posterior) or ICD-10: F84.1), Asperger's Syndrome (ICD-10: F84.5), or pervasive developmental disorder not otherwise specified/other (ICD-10: F84.8 and F84.9) [1,13]. The diagnostic stability of an ASD diagnosis over time is generally very high [3], and therefore all twin individuals were included if they were registered with an ASD diagnosis at any time during the observation period (1 January 1988-23 March 2009). The study was approved by the Danish Data Protection Agency and the National Board of Health.

Statistical Analyses.
The screen efficiency of the five questions together and separately was assessed by examining the sensitivity, specificity, positive predictive value (PPV), and the correct classification rate (CCR) at various cut-off scores. This was done by using twins with ASD diagnoses registered in the DPCRR as "true positive" cases and those twins not registered with ASD as "true negative." Statistical significance was assessed at the 5% level, and 95% confidence intervals (CIs) were reported. A receiver-operating characteristic (ROC) analysis was performed to determine the discriminating power of the 5-item scale. The ROC curve is obtained by plotting the true positive rate (sensitivity) on the vertical axis against the false positive rate (1 - specificity) on the horizontal axis for each cut-off level. High values of both sensitivity and specificity are desirable. The most favourable cut-off score is the one at which the product sensitivity × specificity is maximal. Since the cotwin data violated the assumption of independence, robust regression analysis (which accounts for clustering) was used in STATA to estimate the extent to which variables accounted for the variation in another variable. Mean age was measured at study start. The effects of age and gender on the total score were tested using robust linear regression analysis (cluster) in the responding case sample (twins registered in the DPCRR with ASD) and noncase sample (twins not registered with ASD). The effects of gender and age on response status and differences in response status between the samples were tested by robust logistic regression analysis. Age was analyzed as a categorical variable (age of 3, age of 4, etc.) in the regression analyses.
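The screening measures and the cut-off rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the STATA analysis used in the study: the function names and the toy scores are hypothetical, and the clustering of cotwins is ignored.

```python
from typing import Dict, List, Tuple


def screen_metrics(scores: List[int], is_case: List[bool], cutoff: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and CCR when 'score >= cutoff' is screen positive."""
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)        # cases flagged
    fn = sum(1 for s, c in pairs if c and s < cutoff)         # cases missed
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)    # noncases flagged
    tn = sum(1 for s, c in pairs if not c and s < cutoff)     # noncases cleared
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "ccr": (tp + tn) / len(pairs),
    }


def best_cutoff(scores: List[int], is_case: List[bool], max_score: int = 10) -> Tuple[int, float]:
    """Return the cut-off that maximises sensitivity x specificity, with that product."""
    best = (1, -1.0)
    for cutoff in range(1, max_score + 1):
        m = screen_metrics(scores, is_case, cutoff)
        product = m["sensitivity"] * m["specificity"]
        if product > best[1]:
            best = (cutoff, product)
    return best


# Hypothetical toy data for illustration only (NOT the study data).
toy_scores = [0, 0, 1, 1, 2, 3, 4, 5, 6, 8]
toy_cases = [False, False, False, False, False, False, True, False, True, True]

cutoff, product = best_cutoff(toy_scores, toy_cases)
metrics = screen_metrics(toy_scores, toy_cases, cutoff)
```

With real data, `scores` would hold each twin's 0-10 sum score and `is_case` the register-based ASD status; confidence intervals would additionally need to respect the pairing of cotwins.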

Results
Parents of 24,552 twin individuals received the survey by mail, and it was returned for 16,787 individuals (response rate 68.4%), of whom 50.6% were boys. Attrition was significantly skewed in relation to the number of ASD cases registered in the DPCRR: proportionally fewer cases were found in the responding group (n = 108 of 16,787) than in the nonresponding group (n = 68 of 7,765) (χ² = 4.028, P < .05). The sum score for the 5-item scale was strongly positively skewed for the total population of responding twins (skewness 2.72, kurtosis 12.35), and the curve showed a step from score 0 to 1 (Figure 1). Neither age nor gender had a significant effect on response status, though the response rate tended to be higher in the younger twins. Table 2 shows the mean sum scores of the case and noncase samples stratified by gender. In Table 3, the questionnaire percentiles (total score) in the case sample are shown. ASD diagnoses were distributed as 33 with autism, 20 with atypical autism, 29 with Asperger's syndrome, and 26 with PDD-nos or other. Age did not influence response status, but the boy : girl ratio in the responding case sample (n = 108) was 2.7 : 1, as compared to 6.6 : 1 in the nonresponding case sample (n = 68) (OR = 0.41, 95% CI: 0.17-0.96, P < .05). Neither age nor gender had any effect on the score in the responding case sample, although boys tended to score higher (Table 2).
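The attrition comparison above can be checked with a Pearson chi-square test on the 2 × 2 table of response status by register status. The sketch below uses only the counts quoted in the text; the helper name is hypothetical, and it assumes no continuity correction was applied, which the paper does not state.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: responders / nonresponders; columns: registered ASD / no registered ASD.
responders_asd, responders_total = 108, 16_787
nonresponders_asd, nonresponders_total = 68, 7_765

chi2 = chi_square_2x2(
    responders_asd, responders_total - responders_asd,
    nonresponders_asd, nonresponders_total - nonresponders_asd,
)

# Critical value of chi-square with 1 degree of freedom at the 5% level.
significant_at_5_percent = chi2 > 3.841
```

Under these assumptions the statistic comes out close to the reported value of 4.028, just above the 5% critical value.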

The Noncase Sample.
There was a significant effect of gender; boys scored higher than girls on the mean sum score (mean difference = 0.21, see Table 2) with no overlap of CI for the mean sum score of the two genders. There was no effect of age on total score.
The discriminative values of each item separately (score range 0-2 each) are presented in Table 5. The item reaching the best sensitivity when all ASD subdiagnoses were included was question b ("The child exhibits a sudden change in mood or feelings") at cutoff ≥1 (sens. 67.6%, spec. 73.7%, CCR 73.6%) followed by item e ("The child cannot pay attention for long"), but the former had much lower specificity. Item a ("The child exhibits strange behaviour") had a very high specificity (95.9%) at cutoff ≥1 compared to the other items at the same cutoff. In the sample with autism diagnoses only, item e achieved the best sensitivity (84.9%) at cutoff ≥1, followed by items c ("The child cannot get along with others") and a. Overall, item d ("The child cannot sit still and is restless or hyperactive") obtained the lowest sensitivity of all the items. The most efficient combination in terms of sensitivity was that of items a, b, c, and e, which at the optimal cutoff (≥2, score range 0-8 points) yielded a sensitivity of 75.9% (95% CI: 66.7-83.6%) and a specificity of 85.7% (95% CI: 85.2-86.2%) when all ASDs were included in the case sample. When the case sample was reduced to F84.0 (ICD-10) and 299.0 (ICD-8) (autism) only, the optimal cutoff was ≥3 (score range 0-8 points), with a sensitivity of 81.8% (95% CI: 64.5-93.0%) and a specificity of 93.9% (95% CI: 93.5-94.2%).
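To show how a 95% confidence interval for a sensitivity such as 75.9% can be obtained, the sketch below computes a Wilson score interval. Two assumptions are made here and are not taken from the paper: the count of 82 screen-positive cases is inferred from 75.9% of the 108 responding cases, and the Wilson method is swapped in for illustration because the interval method actually used is not stated, so the bounds need not match the reported ones exactly.

```python
import math
from typing import Tuple


def wilson_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Assumption: 82 of the 108 responding cases scored at or above the cutoff (82/108 = 75.9%).
low, high = wilson_ci(82, 108)
```

The resulting interval is of similar width to the reported 66.7-83.6%, which illustrates how wide the uncertainty remains with roughly one hundred cases.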

Discussion
Under some circumstances, short-form screening tools are required. The screen efficiency as regards ASDs of five selected items from the CBCL was evaluated in a nationwide population-based sample of twin individuals by means of linkage to a national psychiatric research register recording in- and out-patient activity. We found that this short scale was useful as a first step in a multistage ASD population screening procedure and had reasonable psychometric power to catch cases and rule out the majority of noncases. The distribution of the 5-item scale score in our population was heavily skewed towards the low scores, which is comparable to other instruments used for the screening of ASD in general population samples [21,22]. Only a few screening scales have been evaluated for use in population samples; among those is the autism spectrum screening questionnaire (ASSQ), which includes 27 items [23]. In a population-based validation study in 7-9-year-olds, the sensitivity and specificity of the ASSQ proved very high (0.83-0.91 and 0.77-0.87, resp.). Compared to more specific screening tools like the ASSQ, this 5-item scale does not perform as well, as would be expected given its fewer and probably less ASD-specific items. However, the confidence interval for the area under the ROC curve of the ASSQ (ROC AUC = 0.90, 95% CI 0.85-0.95) overlaps with the ROC AUC of our short scale. At the optimal cut-off score for the present scale, nearly 80% of the twins registered with an ASD diagnosis were identified (sensitivity). Excluding the youngest children in the sample, among whom a larger proportion of actual cases may not yet have reached clinical status and are therefore absent from the register, only slightly elevated the discriminative values. Boys were overrepresented in the nonresponding case sample compared to the responding case sample (P < .05), which could have an effect on the dispersal of scores. However, other screening instruments for ASD used in clinical samples have often failed to find score differences between the genders [24,25]. Attrition was significantly skewed in relation to the number of ASD cases registered in the DPCRR in the responding group compared to the nonresponding group (P < .05). This may also influence the sensitivity and specificity estimates, and it supports the view that nonresponders in epidemiological samples are often more deviant than responders, which reduces the representativeness of the present findings.
As the observation period ran from January 1988 to March 2009, and the data were collected in 2003, some parents probably received the questionnaire after their children were diagnosed with ASD and some before assessment. This might have had an impact on the scores in the case sample, as some parents would at that stage have been more aware of typical ASD behaviour and better at spotting it. However, the questionnaire did not state that the behaviour we were asking about was ASD or ADHD specific.
Epidemiology Research International
More weight is typically given to sensitivity than to specificity, as instruments are often designed to identify at-risk populations (as in this study). Moreover, there is a tradeoff between specificity and sensitivity, and if a clinical study is included, the social cost of a false negative may be higher than that of a false positive in terms of delaying referral to a specialist and subsequently delaying appropriate intervention and services. In a multiphase design, a highly specific screening tool is typically added as a second phase to a highly sensitive first-phase tool, thereby reducing the response burden and the risk of attrition compared with using the often extensive ASD questionnaires in the first phase.
As expected, the positive predictive value (PPV) for the 5-item scale was very low. When screening for a relatively rare disorder like ASD in the general population, the low prevalence rate results in a poor predictive value for the test, in spite of relatively high sensitivity and specificity. This is in contrast to what is found when using the test in a sample of children referred to a child mental health clinic, where the relative occurrence of ASD is expected to be higher.
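The dependence of PPV on prevalence follows directly from Bayes' rule and can be illustrated as follows. The sensitivity and specificity are those reported in the Results for the four-item combination at cutoff ≥2, and the population prevalence is taken as the registered cases among responders; the 20% clinic prevalence is a purely hypothetical figure chosen for contrast, and the function name is an assumption.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


sens, spec = 0.759, 0.857           # items a, b, c, e at cutoff >= 2 (from Results)
population_prev = 108 / 16_787      # registered ASD cases among responding twins
clinic_prev = 0.20                  # hypothetical referred sample, illustration only

ppv_population = ppv(sens, spec, population_prev)
ppv_clinic = ppv(sens, spec, clinic_prev)
```

Under these assumptions only about 3 in 100 screen-positive twins in the population would be registered cases, whereas the same test would give a PPV above 50% at the hypothetical clinic prevalence.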
The validity of the autism diagnosis (F84.0) in the DPCRR has recently been evaluated in a large-scale study in which medical records from 499 cases were reviewed and coded according to a scheme developed by the Centers for Disease Control and Prevention [26]. The diagnosis was confirmed in 469 cases, corresponding to an index of validity of 0.94 [14]. However, the quality of the other diagnoses within the autistic spectrum in the DPCRR still remains to be investigated.
When the discriminative values of the five questions were explored separately, the "ASD-specific" questions did not differ considerably from the more "ADHD-specific" questions. This is not surprising since it is well known that autistic features are commonly associated with ADHD, especially with regard to attention [27]. Studies based on clinically referred children and adolescents with ASD have shown that up to 78% also have ADHD symptoms sufficient to fulfil the diagnostic criteria of DSM-IV [28]. As regards optimising the sensitivity, the best combination included four of the five questions (a, b, c, and e), leaving out the question on "hyperactivity" (question d).
A high survey response rate is associated with a short questionnaire [7]. As such, this short 5-item (or 4-item) questionnaire offers an effective first-phase screening for identifying higher-risk populations as regards ASD. However, the usefulness of the scale would be greater if its ability to discriminate between children with ASD and children with other behavioural or developmental disorders were clarified and if the dispersion of the scale scores in our twin samples could be compared to that in singletons. Other survey data on autistic behaviour in twins have been shown to differ from those in nontwins, especially in males [29]. This could be due to "true" differences, but contrast effects (parents rating the twins as less similar than they really are) and assimilation effects (parents accentuating similarities within twin pairs) might influence the result. As the design of the study did not allow the inclusion of teacher ratings or any objective measurements of behaviour, the possible influence of such biases could not be estimated.
Subscales within broad-band screening instruments like the CBCL and the developmental behaviour checklist (DBC) have proved useful in identifying and assessing social and other behavioural dysfunction in autistic children [30,31]. Bölte et al. [32] found that autistic children (n = 77) obtained significantly higher scores on the subscales "thought problems," "social problems," and "attention problems" as compared to either a large normative sample (n = 2,856) or a large clinical sample (n = 1,655). Duarte et al. [33] also reported that the CBCL subscales "thought problems" (7 items) and "autistic/bizarre" (5 items) differentiated autistic children from children with other psychiatric disorders very well. In a very recent study by Ooi et al. [34], a new ASD scale, constructed from nine CBCL items, demonstrated moderate to high sensitivity (68 to 78%) and specificity (73 to 92%) as compared to different clinical samples. The item "strange behaviour," included in the 5-item questionnaire in the present study, was represented in both the subscales suggested by Duarte et al. [33] and by Ooi et al. [34] for identifying ASD.

The Prevalence of ASD in Twins.
The point prevalence of ASD in this twin sample is comparable to the recently estimated prevalence in singletons of 60-70 with ASD per 10,000 across different regions and countries [4]. It might, however, be a conservative estimate, as some studies have yielded prevalence estimates around 1% [5,6]. The point-prevalence estimate is obtained despite the fact that registered clinical ASD diagnoses usually yield lower prevalence estimates than those found in prospective research projects. This tendency was obvious in a study using the DPCRR to estimate "corrected" prevalence (34.4 per 10,000) of ASD in children younger than 10 years [35]. Barbaresi et al. [36] found that less than half of the research-identified cases had received a clinical diagnosis of ASD over a time span of 18 years overall (1980-1997), although this increased to 74% in the last part of the observation period (1995-1997). The opposite may also be the case, however: children may have locally received an ASD diagnosis without fulfilling the research criteria (personal experience).
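For orientation, a crude register-based prevalence per 10,000 can be computed from the counts given in the Results (108 registered cases among responders and 68 among nonresponders, out of 24,552 surveyed twins). This is a back-of-the-envelope sketch; the paper's own point-prevalence figure is not quoted in this section and may have been derived differently, and the function name is hypothetical.

```python
def per_10_000(cases: int, population: int) -> float:
    """Express a proportion as cases per 10,000 individuals."""
    return 10_000 * cases / population


registered_cases = 108 + 68      # responders + nonresponders registered with ASD
surveyed_twins = 24_552          # twin individuals who were mailed the survey

crude_prevalence = per_10_000(registered_cases, surveyed_twins)

# Reference points quoted in the text: 60-70 per 10,000 [4] and about 1% (100 per 10,000) [5,6].
below_one_percent = crude_prevalence < 100
```

Under these assumptions the crude figure is roughly 72 per 10,000, close to the upper end of the 60-70 range cited for singletons and below the 1% estimates.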
As our point prevalence might be a conservative estimate, it raises the question of whether the prevalence of ASD in twins is higher than in nontwins, and evidence in this field has been conflicting. A disproportionately high number of twins among affected sib-pairs with autistic disorder was found as compared to the number expected on the basis of the prevalence of twinning [37,38]. However, both results were derived from studies focused on multiplex affected sib-ships, and the twins were not sampled from a systematic population-based cohort collected according to epidemiological principles. These research results may reflect ascertainment bias [39]. A large epidemiologic sample from Western Australia found only a slight to moderate increase in the rate ratio (1.35) for multiple births comparing autism to total births, a ratio that would fall to 1.15 when including the broader spectrum [39]. Two other population-based studies, in California [40] and Sweden [41], supported these results. Most recently, Braun et al. have ruled out twinning as a risk factor for ASDs based on the US Autism and Developmental Disabilities Monitoring (ADDM) Network registration [42]. Twin pairs concordant for autism or other ASD might have a higher chance of being referred for assessment. Additionally, it seems plausible that there is an increased chance of ASD being diagnosed in the cotwin of an individual already diagnosed with ASD, which could explain the findings of a slightly elevated ratio of multiples with autism/ASD as compared to singletons in some studies [39]. Such bias may also account for the prevalence of registered ASD seen in our twin sample.

Conclusion
This study adds further information to the literature regarding screening constructs for ASD. The effectiveness of a five-item short scale using child behaviour checklist (CBCL) items was evaluated in a first-phase screening of all Danish twins born in 1988-2000 as a part of a heritability study design. In Denmark, psychiatric diagnoses are all reported to DPCRR, and the two registers were linked for validation purposes. The scale showed reasonable psychometric power to catch cases and rule out the majority of noncases in a general population sample of twins. It may prove useful for researchers and clinicians in health care services in need of a short cost-effective tool for ASD for use in large population samples where efficiency, cost, and response load must be taken into consideration. Few data exist on the prevalence of ASD in twins. This study documented the point prevalence of registered twins with ASD to be comparable to ASD prevalence data in singletons. Thereby, the result does not support the twinning process itself as a risk factor in the aetiology of ASD.