When Things Are Not as They Appear: Assessing the Adequacy of Cluster Randomization When Outcome Events Are Rare at Baseline

The present study randomly assigned 15 Bahamian elementary schools to one of three intervention conditions. To assess the adequacy of cluster randomization, we examined two concerns identified by the local research team: inequality of gender distribution and environmental risk among groups. Baseline significant differences in risk and protective behaviors were minimal. There were significantly more males in the intervention group. Males had higher rates of risk behavior at all assessments. Poor school performance was also higher among the intervention condition and was significantly associated with increased rates of many but not all risk behaviors. Prior to adjusting for gender and school performance, several risk behaviors appeared to be higher after intervention among intervention youth. Adjusting for gender and school performance eradicated the group differences in risk behavior rates. Results demonstrate the importance of adequate randomization where outcomes of interest are rare events at baseline or differ by gender and there is an unequal gender distribution and the importance of the local research team's knowledge of potential inequalities in environmental risk (i.e., school performance). Not considering such individual differences could impact the integrity of trial outcomes.


Dissemination of Effective Prevention Programs.
With the growing awareness over the past decades that not all health promotion educational programs are equally effective, public health specialists and educators have urged the rigorous evaluation of prevention interventions and the subsequent utilization of only those found to be effective, for example, "evidence based" [1].
The utilization of evidence-based programs is becoming more achievable with the introduction of national programs such as the Center for Disease Control's (CDC) "Prevention Synthesis Project" and its "Dissemination of Effective Behavioral Interventions" (DEBI) program. The Prevention Synthesis Project has identified 69 prevention programs demonstrated through rigorous randomized, longitudinal evaluations to be effective in reducing one or more targeted risk behaviors [2,3]. The DEBI program seeks to implement these interventions in a manner likely to retain their effectiveness, balancing the competing needs for fidelity to the original intervention with attention to specific local environmental conditions that require design alterations [4][5][6].
As a practical matter, the efficacy trials of programs found to be effective have often been conducted in smaller settings than that of a school system [25]. Efforts to replicate the results of smaller efficacy trials in the context of large school systems may confront myriad implementation and evaluation issues [26] as well as limitations imposed by school regulations [8]. Attempts to minimize the impact of one design bias may unavoidably increase the risk of another bias. To avoid the potential of contamination of intervention conditions that may be more likely to occur when randomization is done at the level of the student or the class, researchers conducting school-based intervention trials commonly elected to randomize at the school level [8]. A review of smoking prevention programs found that 96% had been randomized at the level of the school [27]. However, allocation of participants by school may undermine the effectiveness of randomization in designing a trial, both because of the reduced number of units for randomization and because of the potentially greater heterogeneity between schools and/or greater homogeneity of the students or classes within schools. While ideally subjects assigned to various conditions should be comparable when the units for randomization are relatively homogenous and the process of randomization is independent of both the predictor and the outcome variables of a trial [28], researchers are nonetheless expected to assess if the randomly assigned groups are comparable with regard to key predictor or outcome variables. 
However, in situations in which the outcome or predictor variable(s) of interest is relatively uncommon and/or if there is a time lag between the intervention and the onset of the outcome of interest (such as early childhood exposures on the emergence of later life disease processes [29] or prevention interventions targeting risk or protective behaviors that are likely to arise months or years later [30]), it may be difficult to determine if the schools randomly allocated to the different conditions are equivalent until follow-up data becomes available. In situations in which an intervention has been delivered between baseline and followup, it may be impossible to determine if subsequent differences between the intervention and control groups resulted from the intervention or resulted from nonequivalence in the randomized units.
In a sexual risk prevention intervention specifically targeting preadolescents prior to sexual debut, there will be few sexually active youth at baseline in any of the participating schools, and thus it will not be possible to determine whether some schools are at higher risk than others by comparing baseline rates of sexual initiation. Given the notion of co-occurrence of risk behaviors [31,32], comparing other risk behaviors may represent an alternative means of assessing the effectiveness of the random allocation. However, in settings in which sexual onset occurs at a particularly young age, there may also be little involvement in other co-occurring risk behaviors, thus reducing the utility of this proxy approach. With limited rates of risk behaviors, researchers may be dependent on the knowledge of those familiar with local conditions, underscoring the importance of local partners to the integrity of a randomized trial. In such a situation, researchers may need to use other proxies for risk behavior. Among the possible candidates for proxies of sexual risk behavior are measures of academic achievement. Academic achievement is a plausible proxy for risk because a robust literature supports an inverse correlation between risk involvement and academic achievement [33,34], suggesting that differences in levels of academic performance by school may compromise the validity of school-based randomization. Further, most schools will have some data regarding the academic achievement of their students compared to students in other schools. Consequently, among young children, academic achievement scores may provide an alternative index to assess the effectiveness of randomization independent of the outcome behaviors that were rare at baseline.
While ascertaining the relative proportion of males and females by intervention group is straightforward, controlling for differences in the distribution of behaviors that are attributable to gender but do not occur until after the intervention (i.e., are not apparent at baseline) also raises significant questions in data interpretation. When genders are equally distributed by intervention assignment, such potential confounding is not of concern. However, if randomization has not resulted in comparable gender distribution across the intervention conditions, then it will be important to assess the potential impact on intervention outcomes. While many risk behaviors may differ by gender and/or may vary by culture, earlier initiation of sex among males compared to females has been a consistent finding in most countries and cultures [35][36][37][38][39]. For example, among 33 nations from Africa, Asia, and Latin America included in a recent international survey, women initiated sex later than men in 31 nations, at a comparable age in one nation (Nigeria), and at an earlier age in only one nation (Ghana) [39]. In the USA, analysis by gender among African American, Caucasian, and Asian youth found earlier onset among males, with a striking difference among African Americans: at age 13 years, 92% of females but only 72% of males were virgins, and at age 14, 83% of females and 58% of males remained virgins [36]. A survey conducted over a decade ago in Bahamian government high schools found that 57% of those 16 years and older were sexually experienced, including 70% of males and 41% of females [40].
In this paper we describe the methodological issues confronted in assessing the effectiveness of school-based randomization involving a relatively small number of schools (15) in a situation in which the outcome of interest (sexual risk) or proxies thereof (other risk behaviors) were rare events at the beginning of the trial given the young age of the subjects (grade 6 students). We present these issues in the context of evaluating the impact of an HIV prevention intervention delivered to preadolescents on subsequent risk and protective behaviors. Bahamian national members of our research team who were knowledgeable about the local conditions suspected that the randomization process had not resulted in comparable groups with regard to risk environment, although baseline data indicated no differences in the key outcome variables (risk behaviors) because of the low frequency of these behaviors among the elementary school youth. We describe efforts to assess the adequacy of randomization using standard measures collected at baseline and follow-ups (prevalence of sexual and other health risk behaviors and intentions, and youth perceptions of risk behaviors among neighbors and family members) and an index of academic performance (standardized school-based reading scores [33,34]). We also address the difficulties in interpreting data when there are obvious gender imbalances across intervention groups and gender is significantly associated with one or more of the targeted outcomes (in this case, risk behaviors), but such outcomes do not appear until after baseline and administration of the intervention.

The Setting: Preadolescent Sexual Risk Prevention in the Bahamas.
The Bahamas has the second highest annual incidence of AIDS in the Caribbean; 2.2% of adults are infected with HIV. Heterosexual activity is the predominant mode of HIV transmission. Currently 57% of non-AIDS/HIV cases are among individuals 15 to 34 years old who represent <20% of the population [41]. Based on the high HIV rates and the high-risk sexual behavior among adolescents beginning in their early teen years, the Bahamian Ministries of Education and of Health decided to identify and adapt an HIV risk prevention intervention targeting preadolescents in Bahamian grade 6 classes, ultimately selecting the evidence-based program "Focus on Youth with Informed Parents and Children Together" (FOY+ImPACT) [42,43], which they thought would be applicable to the Bahamas. Therefore, the Ministries of Health and Education of the Bahamas invited the US developers to collaborate on the adaptation and evaluation of this pair of interventions for Bahamian grade six students and their parents. In contrast to the setting for the effectiveness trials in the USA, which were community based (recreation centers and housing development [42,43]), the Bahamian government elected to conduct the trials within its government school system. The adapted program comprised "Focus on Youth in The Caribbean" (FOYC), a 10-session program plus two booster sessions, and a 1-hour parental monitoring intervention entitled "Caribbean Informed Parents and Children Together" (CImPACT). The US-Bahamas team evaluated the pair of risk reduction interventions through a randomized, controlled trial. The control condition for adolescents was a ten-session ecological intervention titled "The Wondrous Wetlands" (WW), and the control condition for parents was a 20-minute career planning intervention entitled "Goal for IT" (GFI).
Teachers and interested guidance counselors from the participating schools received a 5-day training workshop. The 10-session FOYC or WW curricula were delivered to students as part of the elementary school curriculum during class time. Parents received the appropriate intervention in groups on the weekend or evenings. Baseline and follow-up questionnaires (the Bahamian Youth Health Risk Behavioral Inventory (BYHRBI) [44], a cultural adaptation of the Youth Health Risk Behavior Inventory [45]) were administered to youth enrolled in the study to assess risk and protective knowledge, condom-use skills, perceptions, intentions, and self-reported behaviors.
We have described the trial in detail in prior publications [44,46,47]. Briefly, the trial was conducted among 1360 Bahamian sixth grade youth and 1175 of their parents. Youth and parents from 15 elementary schools located on the island of New Providence (from among the island total of 26 elementary schools) were randomly allocated at the school level to receive (a) FOYC+CImPACT (5 schools); (b) FOYC+GFI (5 schools); or (c) WW+GFI (5 schools). Randomization was performed using the random digits method; in Year 1, nine schools were randomized to the three conditions (three schools per condition), and in Year 2, six additional schools were randomized to the three conditions (two per condition). The analyses presented here used data from one intervention group (FOYC+CImPACT) and the control group (WW+GFI), between which gender composition and several outcome variables were not comparable at baseline. As shown in Table 1, there were significantly more males in the FOYC+CImPACT group compared to the WW+GFI group at baseline and at each follow-up assessment except the six-month follow-up. At all six waves the FOYC+CImPACT group was significantly older than the WW+GFI group. (Data from the FOYC+GFI group (5 schools) were analyzed and the findings are consistent with those presented here; data are available upon request from the authors.) As reported in a prior publication [48], FOYC+CImPACT was found to increase condom-use intentions, knowledge, condom-use skills, and perceptions at baseline and five follow-up assessments at 6, 12, 18, 24, and 36 months after intervention, and condom-use behavior at 36 months follow-up. However, rates of sexual initiation were higher among FOYC+CImPACT youth compared to control youth at 36 months.
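The two-stage, school-level allocation described above can be illustrated with a minimal sketch. It assumes a seeded pseudorandom shuffle in place of the random digits method actually used, and the school identifiers, seeds, and the function name `randomize_schools` are hypothetical.

```python
import random

CONDITIONS = ["FOYC+CImPACT", "FOYC+GFI", "WW+GFI"]

def randomize_schools(schools, conditions, seed=None):
    """Allocate schools to conditions in equal numbers (illustrative only)."""
    if len(schools) % len(conditions) != 0:
        raise ValueError("schools must divide evenly among conditions")
    rng = random.Random(seed)
    shuffled = list(schools)
    rng.shuffle(shuffled)  # random order stands in for a random digits table
    per_condition = len(shuffled) // len(conditions)
    allocation = {}
    for i, condition in enumerate(conditions):
        for school in shuffled[i * per_condition:(i + 1) * per_condition]:
            allocation[school] = condition
    return allocation

# Year 1: nine schools, three per condition (hypothetical school labels).
year1 = randomize_schools([f"school_{i:02d}" for i in range(1, 10)], CONDITIONS, seed=1)
# Year 2: six further schools, two per condition.
year2 = randomize_schools([f"school_{i:02d}" for i in range(10, 16)], CONDITIONS, seed=2)
allocation = {**year1, **year2}
```

With only 15 units, an allocation of this kind balances the number of schools per condition but gives no guarantee of balance on school-level characteristics, which is the concern examined in this paper.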

Data Sources and Variables.
Data used in this analysis were derived from two sources. Data used to assess adolescent behaviors and perceptions were derived from the assessment surveys (BYHRBI) we conducted during the randomized trial to evaluate the intervention. Data used to assess academic performance were derived from the annual English Grade Level Assessment Test (GLAT), administered as part of the Bahamas National Screening Programme [49]. More than 95% of elementary school students participate in the GLAT.
The [...] were generated by dividing the number of correct answers by the total number of questions (scores ranging from 0 to 1).
Condom-Use Skills. The students were given 15 statements and were asked to choose the eight correct steps in using a condom. For each correct response, they were given a point. Final scores were reported as the mean number of correct responses (scores ranging from 0 to 1).
Condom-Use Self-Efficacy. Students' self-efficacy beliefs pertaining to condom usage were assessed through six questions assessing their self-perceived competency in hypothetical situations (5: strongly agree to 1: strongly disagree (Cronbach α = 0.87)).
Response Efficacy. Three items assessed the students' perceptions of the effectiveness of using condoms (5: strongly agree to 1: strongly disagree (Cronbach α = 0.71)).
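For readers unfamiliar with the internal-consistency statistic reported for these scales, the following minimal sketch computes Cronbach's α from item-level responses. The response matrix is hypothetical and is not drawn from the trial data; the function name `cronbach_alpha` is likewise illustrative.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item across respondents.
    item_variances = [pvariance([r[i] for r in responses]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_variance = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of five students to six 5-point items
# (5: strongly agree to 1: strongly disagree).
example_responses = [
    [5, 4, 5, 4, 5, 4],
    [2, 2, 1, 2, 3, 2],
    [4, 4, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 1],
    [3, 3, 4, 3, 3, 3],
]
alpha = cronbach_alpha(example_responses)
```

The same divisor is used for the item and total variances, so the result does not depend on whether population or sample variances are chosen.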

Adolescent Risk Behaviors and Perceptions.
Adolescents were assessed on 11 behavioral items covering three domains: substance use, delinquency, and sexual activity. Although FOYC+CImPACT did not specifically target all of the substance use and delinquent behaviors, there were discussions about the adverse effects of drugs and alcohol on decision making. The decision-making model used throughout the 10 intervention sessions spoke to the importance of thoughtful decision making in all aspects of daily life. Thus it is plausible to presume that at least some of the risk behaviors might be modified by the decision-making intervention (and therefore might be seen as significant differences between the intervention conditions in the follow-up assessments), an effect seen in earlier evaluations of FOY [42]. Two items specifically targeted by the intervention asked whether the students had ever engaged in sexual intercourse and, if so, how consistent their condom usage was ("consistent condom use" was defined as "always" using a condom). For each behavior, a 36-month prevalence rate was calculated to assess whether the students had ever performed any of the risk behaviors after entering the trial. In addition to the individual-level analysis, these behavioral variables were summarized by school to assess the overall risk involvement of students in each of the 15 participating schools. Analyses examining differences between groups were conducted overall and separately by gender. Questions were also asked regarding intentions to engage in sexual risk behavior (have sex in the next six months) and protective behavior (use a condom in the next six months if you were to have sex).

Perceptions of Relative and Neighborhood Risk Behaviors.
The youths' perceptions of family and neighborhood behaviors were explored as an index of environmental risk. Youth reported how often they observed relatives and/or neighbors participating in substance-abuse-related activities. Items were coded as "sometimes or never" compared to "very often." For each school, frequency of observing the behaviors "very often" was computed as a percentage score.
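The school-level percentage score described above can be sketched as follows. The records and school labels are hypothetical, and the function name `percent_by_school` is an illustrative choice rather than part of the study's analysis code.

```python
from collections import defaultdict

def percent_by_school(records, meets_criterion):
    """Percentage of responses per school that satisfy a criterion.

    records: iterable of (school_id, response) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for school, response in records:
        totals[school] += 1
        if meets_criterion(response):
            hits[school] += 1
    return {school: 100.0 * hits[school] / totals[school] for school in totals}

# Hypothetical coded responses to one perceived-risk item.
observed = [
    ("school_01", "very often"),
    ("school_01", "sometimes or never"),
    ("school_01", "sometimes or never"),
    ("school_01", "sometimes or never"),
    ("school_02", "very often"),
    ("school_02", "sometimes or never"),
]
very_often_pct = percent_by_school(observed, lambda r: r == "very often")
```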
Academic Achievement. The 2005 English Grade Level Assessment Test (GLAT) scores by school for students in grade 6 were used as an assessment of English language and writing performance. Passing rates (scored C or better) were computed for each of the 15 participating schools and were used in the analysis.

Statistical Analysis.
Assessing the comparability of the subjects randomized to the intervention (FOYC+CImPACT) and control (WW+GFI) conditions with regard to sexual risk could not be accomplished by comparing rates of sexual risk behaviors at baseline because the rates were so low. Therefore, we compared proxies of sexual risk, including demographic variables associated with increased risk (age and male gender), nontargeted but potentially covarying risk behaviors, and intentions to engage in sexual risk and/or use a condom. Categorical variables were compared using logistic regression to compute odds ratios; for continuous variables, ANOVA was used and effect sizes were calculated, while correcting for baseline differences and age. At baseline, there were differences in age between FOYC+CImPACT youth (mean = 10.5, SD = 0.8) and WW+GFI youth (mean = 10.4, SD = 0.6; P < 0.05). In addition, there was a significant gender difference, with 53% of the youth in FOYC+CImPACT being male compared to 46% of those in WW+GFI (P < 0.05) (see Table 1). To better assess the comparability of the two randomly assigned groups, the analyses were also conducted separately for males and females.
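As an illustration of the odds ratio comparison for a categorical variable, the sketch below computes an odds ratio and a Woolf (log-based) 95% confidence interval from a 2x2 table, which is equivalent to a logistic regression with a single binary predictor. The group sizes of 400 are hypothetical and were chosen only to reproduce the reported 53% and 46% male; they are not the trial's actual counts.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a, b: exposed and unexposed counts in group 1 (e.g., male, female)
    c, d: exposed and unexposed counts in group 2
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 212 of 400 (53%) male in FOYC+CImPACT,
# 184 of 400 (46%) male in WW+GFI.
or_male, ci_low, ci_high = odds_ratio_ci(212, 188, 184, 216)
```

This simple interval ignores the clustering of students within schools; an analysis that respects the school-level randomization would generally yield wider intervals.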
To assess whether academic performance (as measured by the English GLAT scores) was distributed evenly across the schools randomized to the two intervention conditions, we summed the GLAT passing rates among the five schools in each of the three intervention conditions. We did not have GLAT scores for the individual students; rather, we only had access to the average GLAT scores by school, without standard deviations. Therefore, we conducted one-sample t-tests to examine the difference between each school's GLAT score and the mean score of the remaining 14 schools, using the GLAT score of the case school as the test value (see "Potential Limitations" for discussion of the limitations of this approach).
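The leave-one-out comparison described above can be sketched as follows: for each school, the scores of the remaining schools form the sample and the case school's score is the test value. The 15 scores below are hypothetical, and the hard-coded critical value (2.160, the two-sided 5% point of Student's t with 13 degrees of freedom) applies only when 14 schools remain.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF13 = 2.160  # two-sided 5% critical value, 13 degrees of freedom

def leave_one_out_t(scores):
    """One-sample t statistic for each school against the remaining schools.

    scores: dict mapping school_id to its school-level score.
    A positive t means the case school lies below the mean of the others.
    """
    results = {}
    for school, value in scores.items():
        others = [v for s, v in scores.items() if s != school]
        standard_error = stdev(others) / sqrt(len(others))
        results[school] = (mean(others) - value) / standard_error
    return results

# Hypothetical school-level scores for 15 schools.
hypothetical_scores = {
    f"school_{i:02d}": score
    for i, score in enumerate(
        [10.0, 12.5, 15.0, 17.5, 19.0, 20.0, 21.0, 22.0,
         23.0, 24.0, 25.5, 27.0, 28.5, 30.0, 32.0],
        start=1,
    )
}
t_values = leave_one_out_t(hypothetical_scores)
flagged = {s for s, t in t_values.items() if abs(t) > T_CRIT_DF13}
```

Because each school appears in 14 of the 15 comparison samples, the tests are not independent, which is one reason the approach is best read as a descriptive screen rather than a formal inference.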
We then examined the association of the English GLAT scores with the 11 variables assessing adolescent risk behaviors and six variables assessing perceived risk involvement by relatives and neighbors at the school level (N = 15) using correlation analysis. We reasoned that if there were a strong (e.g., coefficient r at P < 0.05) and directionally consistent (e.g., a consistent positive or a consistent negative) relationship between the GLAT score and a specific risk behavior over time, then the GLAT score at baseline could be viewed as a flag or indicator of significant propensity for the subsequent emergence of these risk behaviors. As such, if the GLAT scores were unevenly distributed across the randomly assigned intervention groups at baseline, this could indicate a lack of comparability of the randomly assigned groups.
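A school-level correlation of this kind, with its significance test, can be sketched as below. The paired values are hypothetical; with N = 15 schools the test has 13 degrees of freedom, so |t| must exceed roughly 2.160 (equivalently, |r| must exceed roughly 0.514) for P < 0.05.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0 with n - 2 degrees of freedom (|r| < 1)."""
    return r * sqrt((n - 2) / (1 - r * r))

# Hypothetical school-level data: score per school and the percentage
# of students in that school reporting a given risk behavior.
glat = [10, 12, 15, 17, 19, 20, 21, 22, 23, 24, 25, 27, 28, 30, 32]
risk_pct = [30, 28, 31, 24, 26, 22, 25, 20, 21, 18, 19, 17, 15, 16, 12]
r = pearson_r(glat, risk_pct)
t = t_statistic(r, len(glat))
significant = abs(t) > 2.160  # two-sided 5% critical value, 13 df
```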
To adjust for differences in risk propensity that can be attributed to the English GLAT scores with regard to risk behaviors or perceptions, we computed the adjusted measures by intervention conditions using a multilevel approach with the measures of risk behaviors for individual students as level 1 factors and the 2005 GLAT passing rates for individual schools as a level 2 factor [50]. The English GLAT passing rates were included in the multilevel models for assessing risk behaviors at all follow-up assessments. The adjusted measures indicate what the levels of risk behaviors or perceptions would be if the passing rates of the English GLAT were comparable across the intervention conditions. These analyses were conducted among the cohort of all intervention youth and all control youth controlling for gender and among gender-specific subgroups.
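One plausible form of such a two-level model, written here as a random-intercept logistic regression, is given below. This specification is an assumption offered for illustration; the source does not state the exact link function, covariate coding, or random-effects structure that was fitted. Here $y_{ij}$ is the risk behavior of student $i$ in school $j$, $\mathrm{FOYC}_j$ indicates the intervention condition of school $j$, $\mathrm{Male}_{ij}$ indicates gender, and $\mathrm{GLAT}_j$ is the school's 2005 passing rate.

```latex
\[
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0 + \beta_1\,\mathrm{FOYC}_j + \beta_2\,\mathrm{Male}_{ij}
  + \beta_3\,\mathrm{GLAT}_j + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
\]
```

Under this form, an adjusted rate for a condition would be the predicted probability evaluated at a common value of $\mathrm{GLAT}_j$ (for example, the mean passing rate across schools), with the gender term omitted in the gender-specific subgroup models.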
Data for these analyses were manually entered. Data quality was examined before analyses were conducted. Detected errors were corrected against the original survey data. Statistical analysis was conducted using the commercial software SAS (SAS Institute Inc, Cary, NC).

Results
As shown in Table 2, at baseline both intention to use a condom if having sex (a protective behavior) and intention to engage in sex in the next six months (a risk behavior) were significantly higher among FOYC+CImPACT youth compared to controls among the total sample and among the subsample of males. There were no significant differences in other sexual risk or protective perceptions, skills, or knowledge at baseline. Although not shown in this table, GLAT scores were lower among FOYC+CImPACT schools (average 18.2, with three of the five schools having significantly lower mean scores of 11.2, 9.7, and 11.3 compared to the overall mean) than among WW+GFI schools (average 24.2, with only one school having a significantly lower mean of 15.2 and one school a significantly higher mean of 31.7) (data available upon request). Table 3 depicts the frequency distributions of risk behaviors by intervention group. Overall, at baseline there were no significant differences between FOYC+CImPACT and WW+GFI youth. Alcohol use was significantly higher among FOYC+CImPACT males compared to control males. There were no other significant differences by gender at baseline.
As shown in Table 2, postintervention, sexual initiation increased among both groups of youth, although the rate of initiation among all FOYC+CImPACT youth appeared to be faster than among WW+GFI youth; differences in sexual initiation between the two groups achieved statistical significance at 18 and 36 months. Sexual intentions were higher among FOYC+CImPACT in some waves. However, as noted above, given the significantly higher percentage of males in FOYC+CImPACT and the significantly higher rates of sexual initiation among males compared to females in both groups, the analyses were then conducted within gender subgroups. Among males only and among females only, there were no significant differences in either intentions or sexual initiation between FOYC+CImPACT and WW+GFI.
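The way an unequal gender distribution can produce an apparent overall difference that vanishes within gender subgroups can be shown with a small numerical sketch. All counts below are hypothetical: initiation rates are set to 30% for males and 10% for females in both conditions, and only the gender mix (53% versus 46% male) differs. The crude odds ratio is compared with the Mantel-Haenszel odds ratio pooled across gender strata.

```python
def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into one 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)

def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel pooled odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Each stratum: (intervention events, intervention non-events,
#                control events, control non-events). Hypothetical counts.
males = (159, 371, 138, 322)    # 30% of 530 and 30% of 460
females = (47, 423, 54, 486)    # 10% of 470 and 10% of 540
strata = [males, females]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
```

In this constructed example the crude odds ratio exceeds 1 even though the intervention has no effect within either gender, and the stratified estimate recovers the null.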
Condom-use intentions, after controlling for baseline differences, were significantly higher at all follow-ups except the six-month follow-up among FOYC+CImPACT youth overall and among males; they were also significantly higher among FOYC+CImPACT females at 18, 24, and 36 months. Condom-use intentions and behavior increased over time among both intervention and control youth and among males and females. Also apparent in Table 2, HIV knowledge, condom-use skills, and condom-use perceptions were significantly higher after intervention among FOYC+CImPACT youth. These differences were also apparent among the subset of males and the subset of females. The fact that differences were not present at baseline and that these postintervention effects did not differ by gender indicates an intervention rather than a sampling (randomization) effect.
Table 3 demonstrates substantial increases in six of the nine behaviors with age among both the FOYC+CImPACT and the WW+GFI youth; being suspended, smoking cigarettes, and use of cocaine did not increase. Carrying a gun as a weapon and being involved in a physical fight were significantly more frequent among FOYC+CImPACT males, raising the possibility that the intervention may have in some way increased these risk behaviors.

AIDS Research and Treatment
Because it appeared that the significant increases in risk behaviors seen among the FOYC+CImPACT group were largely attributable to differences in the gender composition of the two intervention groups, although sexual initiation remained somewhat higher overall even when controlling for gender, we explored the perception of our Bahamian local team members that the FOYC+CImPACT group was a higher risk group at baseline owing to an environmental effect (increased risk propensity). As noted above, the GLAT scores were somewhat lower among the FOYC+CImPACT schools compared to the WW+GFI schools. To explore the possibility that the GLAT scores might be a marker for future risk behaviors, in Table 4 we show the relationship between each of the 11 targeted risk behaviors and the GLAT score. With the exceptions of "suspended from school" and "consistent condom use," these relationships were all negative (i.e., the higher the GLAT score, the lower the frequency of risk behavior). Other than for "suspended from school," "alcohol use," and "consistent condom use," across 36 months of follow-up these correlations achieved statistical significance for at least one of the follow-up periods. For "ever had sex," these negative correlations were significant at all assessments. That is, regardless of intervention condition, youth in schools with lower GLAT scores were more likely to initiate sex. The negative correlations between the GLAT score and "sold or delivered drugs" and carrying both guns and knives to "use as a weapon" were statistically significant. Table 4 also demonstrates the negative correlations between GLAT scores and perceptions of relatives' and neighbors' involvement in four risk behaviors (using alcohol, marijuana, and cocaine and selling drugs) at baseline and at each follow-up interval, with the exception of perceived frequency of relatives drinking alcohol. After baseline through 36 months of follow-up, the majority of these correlations were significant.
Therefore, in Table 5 we adjusted the rates of each of the eight behaviors that were significantly associated (in all cases negatively) with the English GLAT scores at one or more assessment periods, adjusting for the GLAT scores and, for the overall group, gender. The three behaviors without significant associations were adjusted without using GLAT scores. Following this adjustment, "involved in a physical fight" (overall) and "carrying a gun as a weapon" (overall and among males) remained higher among FOYC+CImPACT youth. Sexual initiation, carrying a knife, and selling drugs overall were no longer significantly higher among FOYC+CImPACT youth.

Discussion
The importance of using evidence-based approaches for disease prevention is well recognized; likewise, there is recognition of the need to reassess effectiveness of such interventions if they are significantly altered and/or applied in significantly different populations from that in which they were originally assessed [1]. School systems are often the site for "real-life" implementation of prevention programs targeting a broad audience [5,10]. For a variety of logistic and methodological reasons, randomization will frequently be conducted at the level of the school and, as such, there may be relatively few units of randomization [8].
Because school populations are frequently geographically based, confounding due to possible homogeneity of underlying risk factors for the outcomes of interest may be problematic.
If the outcomes and/or risk factors are relatively prevalent at baseline, adequacy of the randomization process could be assessed and/or baseline differences controlled for in the assessment of intervention impact. However, when the prevalence of outcomes of interest and/or known risk factors are low, baseline differences, which may impact outcomes, could be undetected. In such a case, utilization of available proxy measures would be of great importance in interpreting the results of such analyses. Likewise, if gender distribution is unequal across intervention groups and the outcomes of interest are of low frequency at baseline before the intervention, it may be difficult to disentangle the effects of gender and intervention assignment subsequently. Analysis by gender may be necessary to reveal actual (versus apparent) intervention impacts. In this study we show that both unequal gender distribution and inequality in baseline risk accounted for appearance of higher rates of sexual initiation and violence in the intervention group. By contrast, the higher rates of condom use, knowledge, and efficacy beliefs among FOYC+CImPACT youth appear to be true intervention effects.
In the present study, members of the local research team, familiar with the populations attending the elementary schools participating in the intervention evaluation, expressed concerns that the randomization process may have resulted in groups of unequal risk for the outcomes of interest (sexual risk behavior). However, sexual risk behavior and other potentially covarying risk behaviors were low at baseline and did not differ by intervention group at baseline. Over time, with increased prevalence of risk behaviors, significant differences did emerge between the groups with regard to truancy, violence-related, and sexual risk behaviors. However, these differences emerged during the postintervention period.
Being aware of the correlation between academic achievement and risk behaviors in other studies [33,34], we explored the availability of such data at the level of the school, differences at the level of the schools, and association with risk behaviors. We found that a nationally administered academic examination (GLAT) was available for all schools, that the distribution of scores was not equivalent among schools in the three intervention groups, and that the scores were highly (inversely) correlated with eight of the 11 risk behaviors, including one of the violence-related behaviors (gun-carrying) and one of the sexual risk behaviors (initiation of sex) that did differ. Only three of the 11 risk behaviors we explored were not associated with the GLAT scores: having been suspended from school, failure to use a condom, and having drunk an alcoholic beverage. In the Bahamas the criteria for suspension vary greatly between schools, as suspension is completely under the jurisdiction of the school principal or the vice principal (depending on the individual school). Therefore, behaviors that might lead to suspension differ greatly between schools and would vary over time as new principals are rotated through the school system. Thus, the practice does not reflect the local environment of the school or the neighborhood in which the school is located (personal communication, authors, L.C. Deveaux and S. Lunn). The relationship between failure to use a condom and other risk behaviors has been inconsistent across the age span and by culture [31,51,52].
Failure to use a condom is a complex behavior, first requiring that a youth engage in sexual intercourse (already a risk behavior) and then navigate several practical, personal, and social challenges (obtaining a condom, feeling comfortable using a condom, and convincing a partner to use a condom) [53]; arguably, such a complex behavior would not be expected to have a direct relationship with a general proxy of risk such as a lower GLAT score. Finally, unlike condom use and school suspension, alcohol use was inversely correlated with GLAT scores, but the correlation did not reach statistical significance, perhaps reflecting the cultural acceptance of alcohol across social strata in the Bahamas.
While it is possible that there is a direct impact of facility with language and/or academic achievement on risk behavior, it is perhaps more plausible that this score serves as an early identifier of risk behaviors before they appear. The 2009 Centers for Disease Control Youth Health Risk Behavior Survey confirms that students with higher academic achievement are significantly less likely to have carried a weapon, smoked cigarettes, consumed alcohol, or engaged in sex than their classmates with lower achievement (http://www.cdc.gov/HealthyYouth/health_and_academics/pdf/health_risk_behaviors.pdf).
Standardized assessments of academic progress are widely available and should be considered in situations such as that encountered here in which there is reason to believe that intervention groups may not be equivalent at baseline despite randomization. This study also speaks to the importance of the knowledge of local circumstances by the local research team. Their knowledge of the community environment should be seriously considered.