A Gibbs Sampler for the Multidimensional Item Response Model

Current procedures for estimating compensatory multidimensional item response theory MIRT models using Markov chain Monte Carlo MCMC techniques are inadequate in that they do not directly model the interrelationship between latent traits. This limits the implementation of the model in various applications and further prevents the development of other types of IRT models that offer advantages not realized in existing models. In view of this, an MCMC algorithm is proposed forMIRTmodels so that the actual latent structure is directlymodeled. It is demonstrated that the algorithm performs well in modeling parameters as well as intertrait correlations and that the MIRT model can be used to explore the relative importance of a latent trait in answering each test item.


Introduction
Item response theory IRT is a popular approach used for describing probabilistic relationships between correct responses on a set of test items and continuous latent traits see 1-4 . IRT models have also been used in other areas of applied mathematics and statistical research. Some examples include US Supreme Court decision-making processes 5 , alcohol disorder analysis 6-9 , nicotine dependency 10-12 , multiple-recapture population estimation 13 , psychiatric epidemiology 14-16 , longitudinal data analysis 17, 18 , latent regression models 19, 20 , and missing data analysis 21 . IRT has the advantage of allowing the inference of what the items and persons have on the responses to be modeled by distinct sets of parameters. As a result, a primary concern associated with IRT research has been on parameter estimation, which offers the basis for the theoretical advantages of IRT. Specifically, of concern are the statistical complexities that can often arise when item and person parameters are simultaneously estimated see 1, 22-24 . More recent attention has focused on fully Bayesian estimation where Markov chain Monte 2 ISRN Applied Mathematics Carlo MCMC simulation techniques are used e.g., 25,26 . Over the past decade, MCMC has been implemented in the context of IRT models where one latent trait is assumed e.g., 3,[27][28][29] as well as to models where multiple traits are considered e.g., 30-36 , for a thorough review on the historical and current developments of MCMC in terms of IRT, see 37 .
The compensatory multidimensional IRT MIRT; 38 model assumes that each item measures multiple latent traits. It differs from some other dichotomous models insofar as it has an additional source of model indeterminacy that creates difficulties when using MCMC. Some techniques have been developed to approach this problem by imposing a special structure that constrains the item slope parameters 30, 36, 39 . However, these approaches do not directly model the actual interrelation between the distinct latent traits and, thus, are limited in certain applications. In view of the above, the present aim is to derive an efficient MCMC algorithm via Gibbs sampling 40 that a obviates the additional source of model indeterminacy associated with the MIRT model and b directly models the underlying latent trait structure. The MIRT model considered herein is presented in normal ogive form as more complicated MCMC procedures would have to be adopted for the logistic form e.g., 3,28,35,36 . Further, given that parametric probability functions of correct responses are usually modeled by a normal ogive or a logistic function and noting that the logistic and normal ogive forms of the IRT models are essentially indistinguishable in terms of model fit or parameter estimates given proper scaling, see 41 , MCMC procedures for logistic models are not considered. The remainder of this paper is organized as follows. In Section 2, the two-parameter normal ogive 2PNO MIRT model is outlined. In Section 3, the Gibbs sampler is derived, and the prior specifications for the model parameters are described. Section 4 gives examples of implementing the Gibbs sampling algorithm in the context of simulated and real data to demonstrate the proposed methodology.

Preliminaries
The MIRT model is introduced by considering a test that consists of k dichotomous items with each measuring m latent traits. Let y y ij n×k denote a matrix of n responses to the k items where y ij 1 y ij 0 if the ith person answers the jth item correctly incorrectly for i 1, . . . , n and j 1, . . . , k.
Definition 2.1. The probability of the ith person obtaining a correct response on the jth item is defined for the 2PNO MIRT model as The vector θ i θ 1i , . . . , θ li , . . . , θ mi denotes latent trait parameters associated with the ith person, and the vector α j α 1j , . . . , α lj , . . . , α mj denotes nonnegative slope parameters where larger values of α lj have more influence on determining a success on the jth item. The intercept parameter β j denotes the location in the latent space where the jth item is maximally informative, and Φ denotes the unit normal cdf. The model in 2.1 is also referred to as a compensatory MIRT model 38 because a low level of θ li in one dimension can be compensated by a high level of θ li in another dimension.

Remark 2.2.
If the vector of slope parameters in 2.1 is such that α j 0, . . . , 0, α lj , 0, . . . , 0 , then the MIRT model reduces to the 2PNO multi-unidimensional model as where the test involves multiple parameters of θ li and where each item measures one of these latent variables see 31,32 . The difference between 2.1 and 2.2 is analogous to the distinction made between factor analysis and that with a rotation to achieve a simple structure 42 . As such, 2.2 can be viewed as a special case of 2.1 where each item measures only one of the several latent traits. Further, the two models differ in that 2.1 is exploratory whereas 2.2 is confirmatory in nature. The unidimensional IRT model, which has a systematic component form of α j θ i − β j , has a well-known identification problem in terms of location and scale invariance e.g., 43 . Common practices of resolving this problem are to impose some constraint on the item parameters, that is, α j 1 and β j 0, or select some specific values for the location and scale parameters for the prior normal distribution of θ i , for example, θ i ∼ N 0, 1 see, e.g., 3, 27-29, 43 . Further, Bafumi et al. 5 proposed using a parameter transformation to approach the identification problem in the context of unidimensional IRT models. More specifically, the model parameters are transformed using a normalization procedure after estimation is completed. Bafumi et al. 5 noted that this transformation procedure obviates the problem of elusive convergence that results from highly correlated samples.
In terms of the multi-unidimensional IRT model in 2.2 , Lee 31 extended Tsutakawa's 43 approach by adopting a constrained covariance matrix for the latent traits and modeling the constrained covariance matrix indirectly. Lee's 31 method not only solves the model indeterminacy problem, but also appropriately estimates the interrelationship between multiple latent traits see also 32, 44 . The more general MIRT model, as defined in 2.1 , involves a new source of model indeterminacy called rotational invariance and is statistically more complicated than the unidimensional or multi-unidimensional models. As such, a Gibbs sampler is subsequently derived based on the ideas suggested in 5, 31 to address the general MIRT model identification problems and to model the latent structure directly.
It is noted that in an effort to develop computer software, Sheng 45 has shown that the approaches based on 5, 31 are useful for the 2PNO additive MIRT model, whose systematic component for modeling y lij takes the form α 0lj θ 0i α lj θ li − β lj . The model assumes that each item measures two latent traits: θ 0i , a common latent trait that all items measure, and θ li , a latent trait that is specific for items in the lth subtest. The difference between the model in 45 and the general MIRT model presented herein is comparable to that between a bifactor model see 46 and a general factor analysis model. The two models assume different latent structures, and hence the approaches for resolving their model indeterminacies are not the same.

The Gibbs Sampler
The derivation of the Gibbs sampler associated with the MIRT model defined in 2.1 begins by considering a multivariate distribution for θ i and a linear transformation on it, which will be based on the following definitions.

ISRN Applied Mathematics
Definition 3.1. Let θ i ∼ N m 0, P , where P is a constrained covariance matrix or a correlation matrix, with 1 s on the diagonal and with correlations ρ st between θ si and θ ti on the offdiagonal.
where D is an m × m diagonal matrix. Note that this variance-correlation decomposition of Σ 47 makes the interpretation easier 48 and is essential for modeling the correlation matrix indirectly while solving the model indeterminacy in the context of the MIRT model. From Definitions 3.1 and 3.2, it can be shown that where P can be transformed from Σ using for s / t. To obviate the identification problem associated with the unconstrained parameters, let θ * i be related with the item parameters α * j and β * j so that the likelihoods are preserved given where the item parameters α * j and β * j will have to be constrained such that j α * lj 1 and j β * j 0. This leads us to the following proposition.
Proof. It follows from 3.1 that θ * i Dθ i μ, and thus, substituting Dθ i μ into 3.3 gives Using 3.5 , we can subsequently derive To implement Gibbs sampling for the MIRT model in 2.1 , a latent variable Z is introduced such that Z ij ∼ N η ij , 1 see, e.g., 27, 49 . Further, from Definition 3.1, we assume that θ i ∼ N m 0, P to ensure unique scaling for θ, which precludes the identification problem associated with such models see 45 . Furthermore, for the unconstrained covariance matrix Σ, we assume that p Σ |Σ| − m 2 /2 . Thus, if ξ j α j , β j with assumed prior distributions, then the joint posterior distribution of θ, ξ, Z, Σ is 1 sampling of the augmented parameters from 2 sampling of the latent variable person parameters θ i from 3 sampling of the item parameters ξ j from where x θ, −1 , assuming uniform priors α lj > 0 and p β j ∝ 1, or from where μ ξ μ α 1 , . . . , μ α m , μ β and Σ ξ diag σ 2 α 1 , . . . , σ 2 α m , σ 2 β , assuming conjugate normal priors α lj ∼ N 0,∞ μ α l , σ 2 α l , β j ∼ N μ β , σ 2 β , Thus, with initial starting values of θ 0 , ξ 0 , and P 0 , the observations i.e., Z , θ , ξ , Σ , and P can be drawn or transformed iteratively from 3.10 , 3.11 , 3.12 , 3.14 , and 3.2 or 3.13 in lieu of 3.12 , respectively. This iterative process continues for a sufficient number of samples after the posterior distributions reach stationarity i.e., a phase commonly referred to as burn-in . The posterior means of all the samples collected after the burn-in stage are considered to be estimates of the model parameters θ, ξ and the hyperparameter P .

Numerical Examples
To demonstrate the methodology presented above, the proposed Gibbs sampler was implemented using both simulated and real data. In terms of simulated data, tests that measure two latent traits were considered. In particular, three 1000 × 18 i.e., m 2, n 1000, and k 18 dichotomous data matrices were simulated from the 2PNO MIRT model where the population correlation between the two latent traits was set to ρ θ 1i ,θ 2i 0.2, 0.4, 0.6, respectively. The item parameters were generated randomly from uniform distributions so that α lj ∼ U 0, 2 , β j ∼ U −2, 2 . Gibbs sampling was subsequently implemented to recover the model parameters assuming informative normal i.e., μ α 1 μ α 2 μ β 0 and σ 2 1 or uniform priors for ξ j . Convergence was evaluated using the Gelman and Rubin 52 R statistic for each item parameter. While the usual practice is to use multiple Markov chains from different starting points, a single chain can also be divided into subchains so that convergence is assessed by comparing the between and within subchain variances see 53 . In view of the fact that a single chain is more economical in the number of iterations needed, the latter approach was adopted. The posterior estimates of item parameters α 1 , α 2 , β , the intertrait correlation hyperparameter, and the associated Gelman-Rubin R statistics were obtained and are listed in Tables 1, 2, and 3 note that ρ θ 1i ,θ 2i is denoted as ρ 12 in these tables .
The Gelman-Rubin R statistic provides a numerical measure for assessing convergence for each item parameter. With values close to 1, it is determined that in the implementation of the Gibbs sampler, Markov chains reached stationarity with a run length of 10,000 iterations and a burn-in period of 5,000 iterations. The posterior estimates of the item parameters as    well as the intertrait correlation hyperparameter are fairly close to the specified parameters, suggesting that the algorithm performs well in recovering these parameters when the latent dimensions have a low to medium correlation. Further, the two sets of posterior estimates, resulting from different prior distributions, differ only slightly from each other, signifying that the posterior estimates are not sensitive to the choice of noninformative or informative priors for the slope and intercept parameters.
In the context of real data, a subset of the College Basic Academic Subjects Examination CBASE; 54 English data was used to demonstrate the methodology. Specifically, these data contain independent binary responses of 1,200 college students to 41 multiple-choice items. The English test is further organized to have two subtests, namely, reading and writing, so that 25 items are in the reading subtest and 16 are in the writing subtest. It is noted that the test was designed in such a manner that it conforms to the multi-unidimensional model, as each item measures one of the two latent traits. However, one may use the more general MIRT model to explore the latent structure, and in particular, to assess individual test items i.e., to determine if the trait mainly involved in answering each item agrees with the one that it is supposed to measure . This can be accomplished by examining the estimated slope parameters, as a larger α lj corresponds to a latent dimension that is more important in determining a person's success on the item. Hence, assuming uniform priors for ξ j , Gibbs sampling was implemented to fit the MIRT model to the CBASE data with a run length of 10,000 iterations and a burn-in period of 5,000, which was sufficient for the chains to converge. An examination of the posterior estimates of α shown in Table 4 suggests that all 16 items in the writing subtest relies on the second dimension writing more than the first dimension reading. However, some items in the reading subtest, such as items 17, 19-26, 28, and 30, require further attention and modification, as they do not seem to measure mainly reading as the rest of the items do.
In summary, the proposed MCMC algorithm provides computationally efficient and accurate estimation in the context of both simulated and real data examples. Not only does the algorithm appropriately model parameters, but also the algorithm efficiently models the intertrait correlations for the compensatory MIRT model, which provides an exploratory approach for examining the latent structure of a test and detecting items that do not measure the trait they are designed to measure.

Concluding Remarks
The MCMC algorithm presented in this paper offers solutions for directly modeling the underlying structure of IRT models with multiple continuous latent traits. The algorithm works well when the actual intertrait correlation is low to moderate less than 0.8 , as a high correlation tends to result in high collinearity, which makes it difficult to distinguish among multiple latent traits and estimate them. With model parameters being accurately estimated, the compensatory MIRT model can be used to explore the relative importance of a latent trait in answering each test item. This is particularly useful when the underlying structure is not known, or when it is desirable to confirm the structure by examining the performance of individual items.