Linear Mixed Modeling of CD4 Cell Counts of HIV-Infected Children Treated with Antiretroviral Therapy

Background. Human immunodeficiency virus (HIV) is a major health problem worldwide, and failure to implement prevention programs results in an increased number of infections among newborns. The goal of this study was to investigate the evolution and determinants of the cluster of differentiation four (CD4) cell count among HIV-infected children receiving antiretroviral therapy (ART). Methods. We followed a cohort of 201 children aged under fifteen years from October 2013 to March 2017 at Adama Hospital in Ethiopia. To gain insight into the data, exploratory data analysis was performed on the change in the longitudinal CD4 cell count. Results. At baseline, the average CD4 cell count was 468.5 cells/mm³ of blood with a standard deviation of 319.11 cells/mm³. We employed random intercept and random slope linear mixed-effects models to analyze the data. Among the predictor variables, observation time, baseline age, WHO clinical stage, history of tuberculosis (TB), and functional status were determinant factors for the mean change in the square root of the CD4 cell count. Conclusions. The findings revealed that the change in the square root of the CD4 cell count increases with age at diagnosis. Regarding WHO clinical stage, patients in stage III and stage IV of HIV/AIDS had lower CD4 cell counts than stage I patients: the change in the square root of the CD4 cell count of stage III and stage IV patients was 6.43 and 9.28 times lower, respectively, than that of stage I patients. Similarly, observation time, history of TB, and functional status were significantly associated with the mean change in the square root of the CD4 cell count.


Introduction
HIV is a major health problem worldwide, and failure to implement prevention programs results in an increased number of infections among newborns. HIV-infected children should start ART to reduce AIDS-related morbidity and mortality and to improve their survival time [1]. According to the Joint United Nations Programme on HIV/AIDS (UNAIDS) [2], an estimated 1.8 million children globally were living with HIV, of whom 1.18 million were in sub-Saharan Africa. As per the same estimate, there were 180,000 newly infected children globally, an estimated 108,000 (60%) of them in sub-Saharan Africa, and 111,000 children died of AIDS and related illnesses globally, 72,000 (65%) of them in sub-Saharan Africa. In Ethiopia, an estimated 729,089 people live with HIV, including 80,923 children under 15 years. As per the same estimate, there were 21,606 new infections, of which 1,276 (5.9%) were in children under 15 years. Furthermore, the number of deaths due to AIDS-related illnesses for the same period was estimated to be 10,960 in the country, and 1,924 (2.4%) were children under 15 years [3].
As studies have reported, children experience more rapid HIV disease progression, making them highly susceptible to opportunistic infections and death [4]. Antiretroviral therapy (ART) can restore immune function and has enormously reduced morbidity and mortality among HIV-infected children. Due to the advent of ART, many HIV-infected children can survive to adolescence and adulthood [5]. The current revised WHO guideline recommends that all HIV-infected children initiate ART irrespective of the clinical disease stage or degree of immune suppression [6]. Hence, one of the main interests in HIV clinical studies is the change in CD4 cell counts of patients who are receiving ART.
Statistical modeling has greatly contributed to identifying the predictors related to the change in the CD4 cell count of patients initiating ART. The objective of this study was to investigate the evolution and determinants of the CD4 cell count among HIV-infected children initiating ART at Adama Hospital in Ethiopia.

Cohort-Based Data
The data used in this study came from a cohort-based retrospective study of 591 children aged under 15 years, conducted by reviewing patients' ART charts and electronic databases at Adama Hospital in Ethiopia from 2013 to 2017 [7]. A sample of 201 children who had full records or a complete history during the study period was considered in this study, and the data were retrieved by physicians who were working in the hospital. Those who had no full records or an incomplete history during the study period were excluded from the analysis. Potentially related factors were identified from a review of the literature. The longitudinal CD4 cell count per mm³ of HIV-infected children treated with ART was taken as the response variable and was measured every six months: at baseline (first diagnosis), at the first visit (after 6 months), second visit (after 12 months), third visit (after 18 months), fourth visit (after 24 months), fifth visit (after 30 months), and finally at the sixth visit (after 36 months). The predictor variables considered in this study were those likely to affect the CD4 cell count of HIV-infected children, including the age of the child, hemoglobin, weight, gender (male/female), primary caregiver, caregiver HIV status (positive/negative), tuberculosis status (positive/negative), functional status (ambulatory/bedridden/working), WHO clinical stage (stage I/stage II/stage III/stage IV), type of ART, and BMI.

Methodology
We conducted exploratory data analysis to investigate the structures and patterns exhibited in the dataset. This consisted of obtaining summary statistics such as the mean and variance of the CD4 cell count. In addition, individual profile plots, mean structure plots, and variance structure plots were used to gain insight into the data. The individual profile plots and the variance structure were used to assess the variability in the data and to determine which random effects to include in the linear mixed model, while the mean structure was used to gain intuition about the time function that could be used to model the data.
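The per-visit summaries described above can be sketched in a few lines. This is a minimal illustration, assuming the data are stored in long format with one row per child and visit; the records and the function name `profile_summaries` are hypothetical and are not the study data or the authors' code.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical long-format records: (child_id, months_since_baseline, cd4_count).
# These values are made up for illustration and are not the study data.
records = [
    (1, 0, 400), (1, 6, 520), (1, 12, 610),
    (2, 0, 250), (2, 6, 300), (2, 12, 420),
    (3, 0, 550), (3, 6, 640),
]

def profile_summaries(rows):
    """Return {month: (n, mean, sample variance)} for each visit month."""
    by_month = defaultdict(list)
    for _child, month, cd4 in rows:
        by_month[month].append(cd4)
    summary = {}
    for month in sorted(by_month):
        values = by_month[month]
        # The sample variance needs at least two observations.
        var = variance(values) if len(values) > 1 else float("nan")
        summary[month] = (len(values), mean(values), var)
    return summary

summary = profile_summaries(records)
for month, (n, avg, var) in summary.items():
    print(f"month {month:2d}: n={n}, mean={avg:.1f}, variance={var:.1f}")
```

Plotting the per-month means against time gives a mean profile, and plotting the per-month variances gives a variance profile; child 3 drops out after month 6 in this toy example, so the later summaries rest on fewer children.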

Linear Mixed-Effect Model (LMM)
The LMM is the most frequently used random effects model for continuous repeated measurements, that is, longitudinal responses taken on the same or related subjects at different times; in both cases, the responses are likely to be correlated [8]. When modeling longitudinal data, our interest is in the association between the dependent variable and a set of explanatory variables [9]. In the LMM, we assume that the dependent variable is a linear function of the independent variables, with regression coefficients that vary randomly from one person to another. This variation among individuals arises because of unmeasured or hidden factors [10]. The term 'mixed' is used because the LMM includes both fixed and random effects [11]. The fixed part represents the mean response, while the random part represents the individual-level responses. Hence, the LMM provides a general modeling framework in which subject-specific random effects, assumed to follow a normal distribution, are included to account for the correlation [12,13]. Here, the dependent variable was measured on the same subject at different times, and subjects had different baseline characteristics. The assumed model accounts for the correlation among CD4 cell counts taken on the same subject at different time points. To formalize, let $\beta$ be a $p \times 1$ vector of unknown coefficients for the fixed effects and $X_i$ be the $n_i \times p$ design matrix of fixed predictors linking $\beta$ to the set of longitudinal measurements of CD4 cell counts, labeled $Y_i$. Let $u_i$ be a $k \times 1$ vector of latent individual random effects and $Z_i$ denote the known $n_i \times k$ design matrix of the random factors linking $u_i$ to $Y_i$.
The model is then $Y_i = X_i\beta + Z_i u_i + \varepsilon_i$, where $u_1, \ldots, u_n$ and $\varepsilon_1, \ldots, \varepsilon_n$ are independent, $Y_i$ is the $n_i \times 1$ vector of CD4 cell counts for the $i$-th child, and $\varepsilon_i$, distributed as $N(0, \Sigma_i)$, is a vector of residual components combining measurement error and serial correlation. The random effects $u_i$ are distributed as $N(0, \Omega)$ and are independent of the errors; that is, $\operatorname{cov}(u_i, \varepsilon_i) = 0$. Furthermore, $\Sigma_i = \delta^2 I_{n_i}$ is the $n_i \times n_i$ positive-definite variance-covariance matrix for the errors in subject $i$, where $I_{n_i}$ denotes the $n_i \times n_i$ identity matrix. Among the commonly used variance-covariance structures for random effects, compound symmetry, heterogeneous compound symmetry, first-order autoregressive, and unstructured were considered and compared. Furthermore, to select the model that best fits the data, Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) were used.
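To make the role of the random effects concrete, the following worked equations write out the random intercept and random slope case in the notation of this section ($\beta$, $X_i$, $Z_i$, $u_i$, $\varepsilon_i$, $\Omega$, $\Sigma_i = \delta^2 I_{n_i}$). This is a sketch under the assumption that follow-up time $t_{ij}$ is the only covariate with a random effect; the symbols $u_{0i}$, $u_{1i}$, $x_{ij}$, and $\omega_{00}$, $\omega_{01}$, $\omega_{11}$ (the entries of $\Omega$) are introduced here for illustration.

```latex
% General form and its implied marginal distribution
Y_i = X_i\beta + Z_i u_i + \varepsilon_i
\quad\Longrightarrow\quad
Y_i \sim N\!\left(X_i\beta,\; Z_i \Omega Z_i^{\top} + \Sigma_i\right)

% Random intercept and random slope in time (k = 2)
Z_i = \begin{pmatrix} 1 & t_{i1} \\ \vdots & \vdots \\ 1 & t_{i n_i} \end{pmatrix},
\qquad
u_i = \begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix},
\qquad
Y_{ij} = x_{ij}^{\top}\beta + u_{0i} + u_{1i}\, t_{ij} + \varepsilon_{ij}

% Implied variance and within-child covariance (j \neq k)
\operatorname{Var}(Y_{ij}) = \omega_{00} + 2\, t_{ij}\,\omega_{01} + t_{ij}^{2}\,\omega_{11} + \delta^{2}
\operatorname{Cov}(Y_{ij}, Y_{ik}) = \omega_{00} + (t_{ij} + t_{ik})\,\omega_{01} + t_{ij}\, t_{ik}\,\omega_{11}
```

Two points follow from these expressions. The covariance between two measurements on the same child is generally nonzero, which is how the model accounts for within-child correlation. The variance also depends on $t_{ij}$ through the slope terms, so a random slope allows the variance of the response to change over follow-up rather than remain constant.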

Exploratory Data Analysis.
Exploratory analysis of longitudinal data seeks to discover patterns of systematic variation across groups of children, as well as aspects of random variation that distinguish individual children. Table 1 displays the summary statistics of the longitudinal CD4 cell count of HIV-positive children at the different follow-up months. During follow-up, the number of children observed was 201, 185, 137, 125, 115, 92, and 54 at baseline (first diagnosis) and at the first, second, third, fourth, fifth, and sixth visits, respectively. The number of CD4 cell count measurements thus decreased over time, indicating that children left the study for several reasons, including death, early withdrawal, and loss to follow-up. The mean CD4 cell count increases at an increasing rate until the 24th month (the fourth visit) and starts decreasing afterwards. Similarly, the standard deviation increases until the 12th month (the second visit) and then starts decreasing slowly. The average baseline CD4 cell count was 468.50 cells/mm³ of blood with a standard deviation of 319.11 cells/mm³, implying that children were at risk at baseline, and the average CD4 cell count started increasing after initiation of ART. Figure 1 shows the individual profile plots of the longitudinal CD4 cell count by follow-up time for all study subjects (left) and for twenty randomly selected HIV-infected children (right). Some trajectories were steeper while others were almost horizontal, indicating possible variability in both the slope and the intercept of the CD4 cell counts.
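The attrition implied by these visit counts can be quantified with simple arithmetic. The short sketch below uses only the counts reported in the text and the six-monthly visit schedule; the variable names are illustrative.

```python
# Number of children with a CD4 measurement at each visit, as reported in the text.
visit_months = [0, 6, 12, 18, 24, 30, 36]
n_observed = [201, 185, 137, 125, 115, 92, 54]

baseline_n = n_observed[0]
# Percentage of the baseline cohort still observed at each visit.
retention_pct = [round(100 * n / baseline_n, 1) for n in n_observed]
# Total number of CD4 measurements across all visits.
total_measurements = sum(n_observed)

for month, n, pct in zip(visit_months, n_observed, retention_pct):
    print(f"month {month:2d}: n={n:3d} ({pct:5.1f}% of baseline)")
print(f"total measurements: {total_measurements}")
```

By the sixth visit, roughly a quarter of the baseline cohort is still observed, so the data are unbalanced. A linear mixed model can use all available measurements from each child without requiring the same number of visits per child.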

Individual Profile Plot.
The plot also provides information on the variability between CD4 cell counts and shows that the CD4 cell count changes over time. Some children's CD4 cell counts increase at an increasing rate and then decrease over time, while others follow an erratic pattern. The CD4 cell count appears to fluctuate over time after initiation of ART, and its variability seemed larger at the beginning and lower at the end (Figure 1 (right)). Hence, there is variability between children in terms of their CD4 cell count. This suggests including random effects for each child to capture this variability and to allow repeated CD4 cell counts within the same child to be correlated.

Mean Profiles Plot.
The average evolution describes how the profile of a subpopulation (or of the population as a whole) evolves. The results of this exploration are useful for choosing a fixed-effects structure for the linear mixed model. The mean profiles of both sexes increase over the time points at which the CD4 count was recorded. Therefore, since the trajectories are approximately straight lines, a linear time effect in the mixed-effects model could fit the data well (Figure 2).

Variance Profiles Plot.
The variance of the CD4 cell count of children showed an irregular pattern over the follow-up period. For both genders, it increases at some points and decreases at others, suggesting that the variation among children is not constant over time (Figure 3). Variation was higher among males until the 12th month and higher among females afterwards. Therefore, because of the variability in the intercepts and slopes of the trajectories (Figure 1) and the pattern observed in Figure 2, the mixed-effects model is a candidate model for the data.

Linear Mixed-Effect Model.
Before modeling, the normality of the CD4 cell count was checked. The data appeared to be right skewed, so a square root transformation of the CD4 cell count was applied to normalize the data (result not displayed here). In fitting the linear mixed-effects model, a series of covariance structures for the longitudinal CD4 cell counts of HIV-infected children was considered. Of the candidate structures (compound symmetry, unstructured, first-order autoregressive, and heterogeneous compound symmetry), the first-order autoregressive structure had the smallest AIC and BIC values (result not displayed here) and was used in the study. The random effects to be included in the linear mixed-effects model were compared based on model selection criteria: the model with a random intercept and random slope (model 2) had lower AIC and BIC values than the random intercept model (model 1) and the random intercept and quadratic slope model (model 3) (Table 2). A random intercept and slope model was therefore used to predict the mean change in the square root of the CD4 cell count over time, together with the potential predictor variables considered in the study; the results are shown in Table 3.
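The AIC and BIC comparison described above follows directly from each model's maximized log-likelihood and number of parameters. The sketch below shows the arithmetic. The log-likelihoods and parameter counts are hypothetical placeholders, not the values in Table 2, and the use of the total number of measurements as the BIC sample size is an assumption (some software uses the number of subjects instead).

```python
import math

def aic(log_lik, n_params):
    """Akaike's information criterion: -2*logL + 2*k."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2*logL + k*ln(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

# Hypothetical (log-likelihood, number of parameters) pairs; NOT the values in Table 2.
candidates = {
    "model 1: random intercept": (-2510.0, 12),
    "model 2: random intercept and slope": (-2480.0, 14),
    "model 3: random intercept and quadratic slope": (-2479.0, 17),
}
# Assumed sample size for BIC: total CD4 measurements, 201 + 185 + ... + 54.
n_obs = 909

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name}: AIC={a:.1f}, BIC={b:.1f}")
print("lowest AIC:", best_by_aic)
print("lowest BIC:", best_by_bic)
```

Both criteria penalize the extra covariance parameters of a richer random-effects structure, and BIC does so more heavily once the sample size exceeds about eight observations. A more complex model is preferred only when its gain in log-likelihood outweighs that penalty.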
We found that, among the potential predictors considered in the study, the age of the child, observation time, WHO clinical stage of the disease, history of TB, functional status, and the interactions of follow-up time with age and with WHO clinical stage were significantly associated with the mean change in the square root of the CD4 cell count at the 5% level of significance (Table 3). Being TB positive was associated with a lower square root of the CD4 cell count. When all the other predictor variables were controlled for, the change in the square root of the CD4 cell count was 2.88 times lower for TB-positive children compared to TB-negative children.
This result is consistent with a study conducted by Marie-Quitterie et al. [14]. Regarding the age of children, the change in the square root of the CD4 cell count increases with age at diagnosis, in agreement with a study by Marie-Quitterie et al. [14], and younger children have good potential for achieving high CD4 counts on ART [15]. Likewise, children with working functional status had a higher square root of the CD4 cell count than those who were bedridden or ambulatory. The change in the square root of the CD4 cell count was 1.13 times higher for children with working functional status than for ambulatory children, and 0.67 times lower for bedridden children than for ambulatory children. Regarding WHO clinical stage, the estimated coefficients for stage III and stage IV were negative and significantly different from zero, indicating that stage III and stage IV children had a lower CD4 cell count than stage I children during follow-up. Hence, for children in stages III and IV, the change in the square root of the CD4 cell count was 6.43 and 9.28 times lower, respectively, than for those in stage I. These findings are consistent with the studies by Aboma et al. and Abdulbasit et al. [15,16].

Conclusions
Among the predictor variables, observation time, baseline age, WHO clinical stage, history of TB, and functional status were determinant factors for the mean change in the square root of the CD4 cell count. Late WHO clinical stages, being TB positive, being ambulatory, and being bedridden are indicators of disease progression. Therefore, children should be diagnosed and should initiate ART early, as per the recent WHO recommendation, irrespective of the disease marker.

Data Availability
The dataset used to support the conclusions of this study is available from the corresponding author upon request.

Ethical Approval
The study was carried out after obtaining permission from the Statistics Department, Arba Minch University. In this regard, an official letter of cooperation, reference stat/319/2011, was written to the concerned bodies at Adama Hospital. Then, the Board of Ethical Approval Committees reviewed and approved the letter.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
The first author designed the study, analyzed the data, drafted the manuscript, and critically reviewed the article. All authors read and approved the final manuscript.

Advances in Public Health

Acknowledgments
[...] of Bule Hora University, Ethiopia, for his initial contribution to this work. They also extended their gratitude to the staff of the ART unit at Adama Hospital for their kind cooperation in providing all the data.