Objective Outcome Evaluation of the Project P.A.T.H.S. in Hong Kong: Findings Based on Individual Growth Curve Models

The Tier 1 Program of the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) is a curricular-based program that attempts to promote positive youth development in Hong Kong. In the second year of the Full Implementation Phase, 20 experimental schools (n = 2,784 students) and 23 control schools (n = 3,401 students) participated in a randomized group trial. Analyses based on linear mixed models via SPSS showed that participants in the experimental schools displayed better positive youth development than did participants in the control schools based on different indicators derived from the Chinese Positive Youth Development Scale. Differences between experimental and control participants were also found when students who joined the Tier 1 Program and perceived the program to be beneficial were employed as participants of the experimental schools.


INTRODUCTION
The Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) is a youth enhancement program that attempts to promote holistic youth development in Hong Kong [1]. There are two tiers of programs (Tier 1 and Tier 2) in this project. The Tier 1 Program is a universal positive youth development program based on 15 positive youth development constructs [2] in which students in Secondary 1 to 3 take part. Since the inception of the Project, different evaluation strategies have been employed to evaluate the project [3,4,5]. Generally speaking, different stakeholders had positive perceptions of the program and there is support for the effectiveness of the program.
As far as objective outcome evaluation is concerned, several studies have shown that students who participated in the project showed better development than did the control participants. Using a pre-experimental design, Shek [6] collected pre- and post-test data with the Chinese Positive Youth Development Scale (CPYDS) from 546 students participating in the 20-h Tier 1 Program of the Project P.A.T.H.S. Results showed that there were positive changes in the program participants in many measures of positive youth development. Although there was some increase in problem behavior in some areas, adolescent problem behavior was generally stable.
Based on the first two waves of data collected in a randomized group trial, Shek et al. [7] carried out analyses of covariance and linear mixed models, controlling for differences between the experimental and control groups in terms of pretest scores, personal variables, and random effects of schools. Results showed that participants in the experimental schools had significantly higher positive youth development levels than did participants in the control schools at post-test based on different indicators derived from the CPYDS. Based on the first four waves of data collected in the first 2 years of the Full Implementation Phase, Shek [8] found similar results by using analyses of covariance. In addition, the students in the experimental schools displayed a lower level of delinquency and better school adjustment than did students in the control schools.
Although the above findings provide support for the effectiveness of the Tier 1 Program of the Project P.A.T.H.S., it is noteworthy that ANOVAs and ANCOVAs (i.e., general linear models) were mainly used in the related studies. Besides the conventional methods based on MANCOVAs (e.g., [9]), some advanced techniques, including hierarchical linear modeling and latent growth curve modeling, have been developed in the past few decades [10,11]. Bijleveld et al. [12] reviewed nine methods for analyzing longitudinal data. Hser et al. [13] proposed several methods of analyzing long-term treatment effects, including the structural equation model, autoregressive panel model, multilevel/hierarchical linear model, latent growth curve model, survival/event history analysis, Markov model/(latent) transition model, and time-series analysis. While some of these models are compatible with each other, some of them are not. For example, there are studies that attempt to compare the multilevel model and the latent growth curve model [14,15]. Among these strategies, hierarchical linear modeling is commonly used by researchers to examine changes in program participants over time. In the present analyses, hierarchical linear modeling based on SPSS (Statistical Package for Social Sciences) was used primarily to examine the treatment effects over time.

Participants and Procedures
Shek and associates [7] described the procedures and criteria for recruiting the initial 24 experimental schools (one school dropped out after Wave 1 and three schools withdrew after Wave 2) and 24 control schools in Year 1, during which the Waves 1 and 2 data were collected from Secondary 1 students. In Year 2, Waves 3 and 4 data were collected from the same cohort promoted to Secondary 2, with 20 experimental schools and 23 control schools. The number of students who joined the experimental group and control group in Years 1 and 2, and the number of completed questionnaires collected can be seen in Table 1.
At pre- and post-test, the purpose of the study was explained, and the confidentiality of the data collected was repeatedly emphasized to all students in attendance on the day of testing. Parental and student consent was obtained prior to data collection. All participants responded to all scales in the questionnaire in a self-administration format. Adequate time was provided for the participants to complete the questionnaire. A trained research assistant was present throughout the administration process.

Instruments
Consistent with the procedures used in Year 1, the participants were invited to respond to a questionnaire that comprised different measures of youth development at pretest (i.e., before the program began) and post-test (i.e., after the program ended). The following measures were used.

Chinese Positive Youth Development Scale (CPYDS)
Based on the analyses conducted in Year 1, the item composition of the 15 subscales of the CPYDS is listed as follows (pretest refers to the Wave 1 data and post-test refers to the Wave 4 data): [...] α = 0.94 and 0.94 at pre- and post-test) assessing psychosocial competence and strengths was used (i.e., resilience, social competence, emotional competence, cognitive competence, behavioral competence, moral competence, self-determination, self-efficacy, beliefs in the future, and clear and positive identity). Third, based on conceptual analyses of the items, one key item was derived for each domain, which resulted in a 15-item key measure (KEY15: α = 0.90 and 0.90 at pre- and post-test). Fourth, based on item analysis, a 36-item key measure was derived (KEY36: α = 0.96 and 0.96 at pre- and post-test). Shek and Ma [16] also showed that the 15 subscales in the CPYDS could be further reduced to four dimensions, including cognitive-behavioral competencies, prosocial attributes, positive identity, and general positive youth development qualities.

School Adjustment Measures (SA)
Three items were used to assess the school adjustment of the participants. The first item assessed a respondent's perception of his or her academic performance when compared with schoolmates in the same grade. The respondents were asked to rate "Best", "Better than usual", "Ordinary", "Worse than usual", or "Worst" in this item. The second item assessed the respondent's satisfaction with his or her academic performance. The respondents were asked to rate "Very satisfied", "Satisfied", "Average", "Dissatisfied", or "Very dissatisfied" in this item. The final item assessed the respondent's perception of his or her conduct. The respondents were asked to rate "Very good", "Good", "Average", "Poor", or "Very poor" in this item. Previous research findings showed that these three items and the related scale were temporally stable and valid [9]. The present findings showed that the measure was internally consistent at pretest (α = 0.72) and post-test (α = 0.73) in Year 2.
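The internal consistency figures reported for this and the other measures are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed, the following Python function applies the standard formula to a small set of responses. The function name, the 1-to-5 numeric coding of the response options, and the six example respondents are all hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*rows))            # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses to the three school adjustment items,
# coded 1 (least favourable option) to 5 (most favourable option).
example_responses = [
    [4, 4, 5],
    [3, 3, 3],
    [2, 3, 2],
    [5, 4, 4],
    [1, 2, 2],
    [3, 2, 3],
]

alpha = cronbach_alpha(example_responses)
print(f"alpha = {alpha:.2f}")
```

The same variance estimator is used for the items and for the total score, so the ratio does not depend on whether the sample or the population variance is chosen.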

Subjective Outcomes Scale (SOS)
Twenty items were used to assess the participant's satisfaction with the program and instructor, as well as their perceived benefits of the program at post-test (i.e., Wave 2 data). The response options included "Strongly disagree", "Moderately disagree", "Slightly disagree", "Slightly agree", "Moderately agree", and "Strongly agree". Reliability analysis showed that this measure was reliable (α = 0.97). Item 20 of this scale is "Overall speaking, the program was beneficial to my development".

Data Analytic Strategies
The data were analyzed by linear mixed models (LMM) via SPSS with maximum likelihood estimation [17,18,19]. Basically, individual growth curves are developed in LMM, and systematic differences between groups (e.g., experimental group vs. control group) in the rate of acceleration are explored. In this paper, the intercept (initial status) and the linear and quadratic coefficients were tested for statistical significance and for group differences in the rate of change.
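The kind of model described above can be written as a two-level quadratic growth model. The following is a sketch under stated assumptions rather than the exact specification fitted in SPSS: the symbols are introduced here for illustration, time $T_{ti}$ is assumed to be coded from 0 at Wave 1, $G_i$ is a 0/1 indicator of membership in the experimental group, and only the intercept and the linear slope are given random effects.

```latex
% Level 1: within-person change for student i at wave t
Y_{ti} = \pi_{0i} + \pi_{1i} T_{ti} + \pi_{2i} T_{ti}^{2} + \varepsilon_{ti}

% Level 2: between-person differences, with G_i = 1 for the experimental group
\pi_{0i} = \beta_{00} + \beta_{01} G_i + r_{0i}   % initial status
\pi_{1i} = \beta_{10} + \beta_{11} G_i + r_{1i}   % linear rate of change
\pi_{2i} = \beta_{20} + \beta_{21} G_i            % quadratic change (acceleration)
```

In this notation, $\beta_{01}$ captures the group difference in initial status, while $\beta_{11}$ and $\beta_{21}$ are the group-by-time interaction terms that capture group differences in the linear and quadratic components of change.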

RESULTS
Using schools as the units of analysis, results showed that the 20 experimental schools and 23 control schools did not differ in the banding of the schools, districts, religion, gender of the students, and source of funding. For the personal characteristics of the participants, results showed that there were no statistically significant differences between the two groups in their background sociodemographic characteristics (p > 0.05 in all cases), except age. In short, except that the mean age of the control group was higher than that of the experimental group, the background characteristics of the experimental schools and control schools were highly comparable at Wave 1.
The growth curve model findings based on several outcome variables are presented in Table 2. Results showed that there were significant interactions of group and waves for KEY36, CBC (cognitive-behavioral competencies, which is a second-order factor), PID (positive identity, which is a second-order factor), and academic adjustment. The interaction effects were then plotted graphically to assist the interpretation of findings. As revealed by Figs. 1-4, the scores of the experimental participants declined more slowly than did those of the control participants.
Further analyses based on experimental participants who found the program to be beneficial (responding in the direction of agreement to item 20) vs. control participants similarly showed that the experimental participants generally performed better than control participants in terms of the global positive youth development indicators (Figs. 5-8).
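To illustrate how a group-by-wave interaction translates into diverging trajectories, the following Python sketch computes fixed-effects predictions from a quadratic growth model. The coefficient values are purely hypothetical and are not the estimates in Table 2, and the coding of waves as 0 to 3 and of group as 0 (control) or 1 (experimental) is an assumption made for this example.

```python
def predicted_score(wave, group, coef):
    """Fixed-effects prediction from a quadratic growth model with group interactions.

    wave:  time coded 0, 1, 2, 3 for Waves 1-4 (assumed coding)
    group: 0 = control, 1 = experimental (assumed coding)
    """
    base = coef["intercept"] + coef["linear"] * wave + coef["quadratic"] * wave ** 2
    group_shift = (
        coef["group"]
        + coef["group_x_linear"] * wave
        + coef["group_x_quadratic"] * wave ** 2
    )
    return base + group * group_shift


# Hypothetical coefficients chosen only to illustrate the pattern:
# both groups start equal and decline, but the positive group-by-linear
# term makes the experimental group's decline shallower.
HYPOTHETICAL_COEF = {
    "intercept": 4.50,
    "linear": -0.10,
    "quadratic": 0.0,
    "group": 0.0,
    "group_x_linear": 0.04,
    "group_x_quadratic": 0.0,
}

for wave in range(4):
    control = predicted_score(wave, 0, HYPOTHETICAL_COEF)
    experimental = predicted_score(wave, 1, HYPOTHETICAL_COEF)
    print(f"Wave {wave + 1}: control = {control:.2f}, experimental = {experimental:.2f}")
```

With these illustrative values, both predicted trajectories fall over time, and the gap between the groups widens by the size of the group-by-linear coefficient at each wave.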

DISCUSSION
The purpose of this paper is to report objective outcome evaluation findings regarding the effectiveness of a positive youth development program (Project P.A.T.H.S.) in Hong Kong using individual growth curve modeling techniques. This is the first known scientific study to adopt a randomized group trial design using data spanning over 2 years to evaluate a positive youth development program based on a curricular approach in different Chinese communities.
The findings generally showed that compared with participants in the control group, participants in the experimental schools performed better in different indicators of positive youth development. First, the findings revealed that experimental participants performed better than control participants in the areas of psychosocial competencies. For example, findings based on CBC (cognitive-behavioral competencies second-order factor) suggest that the experimental subjects displayed higher scores on cognitive competence, behavioral competence, and self-determination. Second, the experimental subjects performed better than did the control subjects in PID (positive identity second-order factor). Finally, the experimental subjects had a slower decline in school adjustment than did participants in the control schools.
Further analyses based on the experimental subjects who found the program to be beneficial to their development only (i.e., response to SOS-20 in the positive direction) showed similar, but stronger results. Besides the findings that the experimental participants performed better than did control participants in KEY15 and KEY36, the decline in overall positive youth development was slower in the experimental participants than in the control participants in CPYDS-10 (global measure of psychosocial competence and strengths, which includes resilience, social competence, emotional competence, cognitive competence, behavioral competence, moral competence, self-determination, self-efficacy, beliefs in the future, and clear and positive identity) and CPYDS-12 (all subscales, excluding behavioral competence, self-determination, and prosocial norms).
The above findings reinforce the previous objective outcome evaluation findings based on general linear models [7,8]. In conjunction with the previous findings based on objective outcome evaluation, subjective outcome evaluation, qualitative evaluation via focus groups, qualitative evaluation via diaries, process evaluation, and interim evaluation, the existing evaluation findings from the Project P.A.T.H.S. suggest that the program is effective. In view of the paucity of outcome studies in Hong Kong, the present study contributes to evidence-based youth work in Hong Kong [20].
Nevertheless, one interesting observation is that there was a general decline in positive youth developmental attributes across time. While this finding is consistent with the finding that adolescent mental health deteriorated across time [21], the decline in "perceived" psychosocial competence across time is an enigma deserving further investigation.

FIGURE 5.
Growth trajectories in the experimental participants participating in the Tier 1 Program (and who regarded the program as beneficial) and control participants using KEY15 as an outcome indicator.

FIGURE 6.
Growth trajectories in the experimental participants participating in the Tier 1 Program (and who regarded the program as beneficial) and control participants using KEY36 as an outcome indicator.