D-Optimal Design for a Causal Structure for Completely Randomized and Random Blocked Experiments

Most experimental design literature on causal inference focuses on establishing a causal relationship between variables, but there is no literature on how to identify a design that results in the optimal parameter estimates for a structural equation model (SEM). In this research, search algorithms are used to produce a D-optimal design for a SEM for three-stage least squares and full information maximum likelihood estimators. en, a D-optimal design for the estimate of the model parameters of a mixed-eects SEM is obtained. e eciency of each of the D-optimal designs for SEMs is compared with univariate optimal and uniform designs. In each case, the causal relationship changed the optimal designs dramatically and the new D-optimal designs were more ecient.


Introduction
Most experimental design literature on causal inference focuses on establishing a causal relationship between variables, but there is no literature on how to identify a design that results in the optimal parameter estimates for a structural equation model (SEM). is work was motivated by an agricultural research study by Milander and colleagues [1] where a SEM was proposed to determine the in uence of seeding rate on the yield components of waxy maize and to better determine the interrelationship between yield and yield components by estimating the direct e ect of the path model based on the biological understanding of the interrelationship among the endogenous and exogenous variables. e endogenous variables in the proposed system as described in Figure 1 were the yield components rows per ear, ear length, kernels per row, kernels per ear, grain yield, ear circumference, and kernel weight. Seeding rate was the only exogenous variable. e goodness-of-t parameters for the model were poor, so the data were reanalyzed to obtain a model that better t the data, with the resulting model as described in Figure 2.
However, it did not give any insight into the interrelationship among many of the variables, which raised the question of whether or not the design was adequate for structural equation modeling. e study was conducted over a period of three years and the original model was known before the data were collected [1]. Ideally, the researchers should have designed an experiment which would have allowed for the most precise estimates of the parameters, thus leading to more precise inferential statistics. Rather than uniformly applying the seeding rate levels across the experiment, the optimal design would have provided the researchers the number of replicates for each seeding rate level and which of the seeding rate levels should have been replicated more. However, they did not have the tools to obtain an optimal design for the proposed model nor were they able to take advantage of the information that was collected in the rst year to adjust the design for subsequent years. ere simply was no literature to support such an objective.
Historically, what drove the growth of optimal design theory was the need to have a design that is optimal for all parameters that need to be estimated [2,3]. Technological advancements made it possible to develop the search algorithms that are needed to obtain optimal designs [4][5][6][7][8].
e new growth in the application of experimental design came about because of the lack of theoretical results in the optimal design of blocked and split-plot experiments, which made the computerized design algorithms popular and led to three proposals for point-exchange algorithms for constructing D-optimal blocked designs [9][10][11][12][13][14][15].
Traditional statistical analyses focus on univariate linear models, which have limitations because most random processes involve multiple dependent variables.
is, in turn, led to the development of the multivariate linear model that does not address the causal relationships among the dependent (endogenous) variables, which is not the case with structural equation modeling [16,17]. SEMs allow qualitative cause-effect information to be combined with statistical data and provide quantitative assessment of the cause-effect relationships among, and within, the endogenous and exogenous variables [17,18].
Even though there have been many algorithms developed for univariate optimal designs and univariate mixed optimal designs, there is very limited literature about multivariate optimal design with examples like the work of Soumaya and fellow authors [19] and Schwabe [20]. ere is literature in experimental design in structural equation modeling but not in optimal design. e research in this area so far is to establish causal relationships (direction of the arrows) among the variables [21,22]. is research extends to counterfactuals and experimental design where it is used to establish causal relationships [23]. However, no research exists to obtain an optimal design to estimate the parameters of a SEM. e gap in literature demonstrates the need for this current research, which comprises three objectives. For a given causal structure, the first objective is to obtain a Doptimal design that allows the most precise estimates of the endogenous and exogenous parameters of that model. en, for a given mixed causal structure with random blocks or split-plots, the second objective is to obtain a D-optimal design for estimating the parameters of the model. Finally, the efficiency of each of the D-optimal designs for causal structures will be compared with the optimal designs for the univariate case.

Methods
For the univariate case, the linear model is y � X β + ε, where y is an n × 1 vector of observations, X is an n × q design matrix of rank q, β is a q × 1 vector of unknown coefficients that are estimable, and ε is an n × 1 identically independently normally distributed vector with E(ε) � 0 and Var(ε) � σ 2 I (n) where I (n) is the identity matrix with rank n. e least-squares estimate of β is β Our objective is to choose a design "X" such that max X |Var(β ∧ )| � max X |X ′ X| [7,8]. Such a design "X" is called a D-optimal design. e same philosophy would be followed for a mixed model. e model for the data that comes from experiments with random blocks or split-plots is y � X β +Z u + ε where y, X, β, and ε are defined above and u is a random vector with E(u) � 0 and Var(u) � σ 2 e objective becomes to choose the design matrix "X" such that max X |X ′ V − 1 X|. However, V is unknown. is presents a new challenge and shows that the more complicated the model, the more challenges can arise when choosing a D-optimal design. e first solution to the above challenge is to assume that V � Var(y) � σ 2 ε I (n) + σ 2 u ZZ ′ is known by making assumptions about σ 2 ε and σ 2 u . e second solution is to [14,15,24]. is allows us to rewrite as the function to optimize a univariate linear model. It can be assumed without loss of generality σ 2 ε � 1 , thus |X ′ V − 1 X| can be expressed in terms of variance ratio η or in terms of the correlation coefficient ρ. erefore, the D-optimal design for a blocked or a split-plot experiment depends on the variance ratio. However, Goos and Vandebroek [14,15] and Goos and Jones [13] argued that the dependence is minor and presented algorithms for constructing locally optimal designs for a given η or ρ. In some special cases, the optimal design is globally optimal and neither depends on the variance ratio nor the degree of correlation of which the most popular is the orthogonal design [14]. e previous method, or "traditional method" as it is referred to in literature, is based on optimization of |Var(β )| or | (X ′ V − 1 X)| and has two weaknesses [25]. e first weakness is that the D-optimal design depends on η or ρ and these values are not known prior to the experiment. e second weakness is that the optimal design focuses only on the fixed effect parameters and ignores the estimate of the random effect parameters. ese weaknesses could have severe consequences on the estimate of the variance components, which severely affects the inferential statistics on the fixed effect parameters. Mylona et al. [25] provided scenarios for optimal designs that were constructed using the traditional D-optimality criterion and, in these designs, the random effects σ 2 u and σ 2 ε are not estimable. Consequently, the two weaknesses led to a new composite D-optimality criterion. e new composite criterion was proposed by Mylona et al. [25]. e reason for calling it composite is because this criterion takes into consideration both the fixed effect and the random effect parameters.
However, D-optimality criterion and orthogonal designs are both dependent on an assumed model. If the assumed model incorrectly represents the interrelationships between the endogenous and exogenous variables, it may lead to an increased bias error.
us, in cases where there is no knowledge of the design specification, one of the most used designs is the uniform design in which the design points are scattered over the experimental region [26]. In comparison to D-optimal and orthogonal designs, uniform designs are less sensitive for model selection [27,28]. Consequently, uniform design does not require pre-experiment model specification [29]. Uniform design could be used to explore the relationship between variables.
For the same reasons that led to the development of algorithms to account for all parameters of interest in the univariate case, our objective is to develop algorithms for SEMs that consider both the endogenous and exogenous parameters. In structural equation modeling, estimation methods are more complex and there are a variety of class estimators. is research focuses on three-stage least squares (3SLS) and full information maximum likelihood (FIML) methods because they are the only full-system estimation methods where all parameters are estimated simultaneously.

D-Optimal Design for a Causal
Structure Model 3.1. e Information Matrix Approximation for 3SLS. In most systems, researchers are interested in more than one endogenous variable and these variables directly and indirectly affect each other. To model such a study, we use SEMs. To demonstrate the methodology and introduce the new approach for D-optimal designs, the following examples will be introduced. A simple example is shown in Figure 3 with two exogenous variables and two endogenous variables. Many phenomena can be modeled based on this SEM. For example, at the University of Washington Tacoma there are two pre-calculus pathways into Calculus I, a two-course sequence (TMATH 115 and TMATH 116) and a one-course sequence (TMATH 120). In this model, x 1 represents the grade in TMATH 115, y 1 represents the grade in TMATH 116, x 2 represents the grade in TMATH 120, and y 2 represents the Calculus I grade. If we are designing a study to estimate the parameters in Figure 3, our first objective would be to determine how many students should be enrolled in each pathway to obtain the best estimates for the three parameters c 11 , c 22 , and b 12 . ere are two parameters that need to be estimated through the first pathway but only one through the second pathway. In this example, assigning half of the sample size to each pathway may not be optimal because there are more parameters on the first pathway than the second. So, the fundamental question here is, how many students should be assigned to each pathway to obtain the optimal estimates for all parameters?
Journal of Probability and Statistics 3 We will focus on the model in Figure 3. In this centered model, the k th observation can be expressed mathematically as y k1 � c 11 x k1 + ε k1 and y k2 � b 12 y k1 + c 22 Multiply both sides by X ′ , use matrix notation, and use the Kronecker product to obtain 30]. en, to estimate δ * , use the Generalized Least Square [31,32], which is equivalent to the asymptotic information matrix for FIML [33].
us, the estimate of the covariance matrix would be e reason that we are interested in Var ) is because our objective is to obtain the design X that "minimizes" )] − 1 . us, our objective would be to obtain the D-optimal design "X" such that (2) In this paper, we assume that there are no constraints on the treatments or on the exogenous variables. However, if there are constraints on the treatments, such as if one of the treatments is limited, then the optimality criteria can be adjusted as is done in the univariate case, which is known as the optimal design of mixture experiments problem [13,[34][35][36][37].
Example 1. .Based on equation (2), the D-optimal design for the causal structure below ( Figure 4) assumes that the treatments are qualitative treatments for twenty observations and assumes that the true values for c 11 , c 22 , and b 12 are 8, 2, and 5, respectively Figure 4: In vector notation, the model can be expressed as y (1) � c 11 x (1) + ε (1) and y (2) (2) ∼ N(0 , Σ ⊗ I (20) ). In this example, y (1) is a 20 × 1 vector that includes the responses of the first endogenous variable, y (2) is a 20 × 1 vector that includes the responses of the second endogenous variable, x (1) is a 20 × 1 vector that indicates whether or not the first treatment is applied on the k th experimental unit as denoted by (x k1 � 1) or (x k1 � 0), x (2) is a 20 × 1 vector that indicates whether or not the second treatment is applied on the k th experimental unit as denoted by (x k2 � 1) or (x k2 � 0), ε (1) is a 20 × 1 vector of random residuals for the first endogenous variable, ε (2) is a 20 × 1 vector of random residuals for the second endogenous variable, 0 is a 40 × 1 vector of zeros, Σ � I (2) , and I (20) is a 20 × 20 identity matrix. To obtain a D-optimal design, the algorithm starts with multiple random designs in order to avoid local optimality. For this specific example, 50 initial random designs were used, each design with size 20. e initial designs were constructed randomly where each x kj value had a 50-50 chance of being a 1 or 0. To improve the design, the algorithm compares each point in the design with the candidate points (0, 0), (1, 0), (0, 1), (1, 1) { } and makes a simultaneous exchange with the candidate point that improves the optimality criterion from equation (2) the most. e exchanges will continue until no further improvement is achieved or the improvement is sufficiently small. is procedure will be repeated for each of the 50 initial designs.
e final designs will be compared. e design X that has the smallest Var ) from among the 50 final designs will be selected as the D-optimal design.
Using the previous algorithm, a D-optimal design for 20 observations for the model in Example 1 is given in Table 1. In the D-optimal design for the causal structure for the model in Figure 4, the points (1, 1) and (1, 0) were each replicated nine times, and the point (0, 1) was replicated twice. e point (0, 0) was not replicated, which is expected because the model was assumed to have no intercept. ) instead of the true parameters. One question that arises with the use of this approach is the robustness of the design. To address the sensitivity of the true values of the parameters on the design, simulation is used to obtain the optimal design based on the estimates of the parameters.
When there is no prior data available, a second approach is a Bayesian approach which assumes a prior distribution for the endogenous variables, which is similar to the approach that is used by Mylona et al. [25] and Goos and Mylona [38] to address the variance components optimal design for blocked or split-plot experiments. is approach allows for the uncertainty of the endogenous parameters. Using the Bayesian approach to obtain a 3SLS D-optimal design for a causal structure will be the subject of future work.
Optimal design theory is currently limited to univariate or multivariate applications where one of the biggest weaknesses of these approaches is that they ignore any possible causal structure among the endogenous variables. Because of the lack of theoretical results and the algorithms to produce optimal designs for a causal structure, one approach is to use univariate optimal designs for causal models, which leads to a loss of efficiency. To obtain a univariate D-optimal design for Example 1, it is assumed that the treatment will affect both endogenous variables to avoid the situation where there is an optimal design for each equation. In this case, we obtain a D-optimal design for both treatment combinations. An optimal design in the univariate case for two qualitative treatments (i.e.y � c 11 x (1) + Table 2 where the four possible candidate points are (0, 0), (0, 1), (1, 0), and (1, 1). e three candidate points (0, 1), (1, 0), and (1, 1) are replicated seven times, seven times, and six times, respectively. For the same reasons as in the D-optimal design for a causal structure, the point (0, 0) was not replicated since the intercept was not included in the model. e ratio of the determinants of the information matrices of the univariate versus the 3SLS designs and the Defficiency of the two designs are used to compare the optimality of the designs [9,10]. e results from Table 3 compare the efficiency of the univariate optimal design from Table 2 to the efficiency of the new D-optimal design from Tables 2 and 3.
When comparing the univariate optimal design to the 3SLS D-optimal design for a causal structure, there was an approximately 18% increase in the determinant of the asymptotic information matrix estimates. e univariate optimal design is about 94% as D-efficient as the 3SLS Doptimal design for a causal structure in all four cases.
One criticism over the comparison of the D-optimal designs obtained above is that they are based on the asymptotic information matrices, but the D-optimal designs were obtained for small samples sizes (20 observations). us, it is important to ensure that the results are consistent for small samples sizes, as well. To verify the comparison, we simulated data based on the causal structure D-optimal design and data based on the univariate optimal design 100,000 times each. We then estimated the parameters for both designs using the GLS estimate δ ∧ * 3SLS and calculated the covariance matrix of the parameter estimates for both designs, whose elements are cov(δ 3SLS . e determinants of the covariance matrices of the parameters estimates were computed. is process was repeated three times. e determinant of the covariance matrix of the parameters estimates for the 3SLS causal structure Doptimal design was consistently smaller than the determinant x (1) y (1) x (2) Journal of Probability and Statistics 5 of the covariance matrix of the parameters estimates for the univariate optimal design, as shown in Table 4. e results from Table 4 were consistent with the results for the comparison of the asymptotic information matrices. e 3SLS causal structure D-optimal design consistently produced a smaller determinant for the covariance matrix of the parameters estimates than the 3SLS univariate optimal design. Specifically, the determinant of the covariance matrix of the parameter estimates for the 3SLS causal structure D-optimal design was about 15%-19% smaller than the determinant of the covariance matrix of the parameter estimates for the 3SLS univariate optimal design. Based on those results, the 3SLS causal structure D-optimal design was 5%-7% more D-efficient than the univariate optimal design. It is important to note that whether the true parameters or their estimates are used, the 3SLS D-optimal design for a causal structure did not change.

3.2.
e Information Matrix Approximation for FIML. e full information maximum likelihood (FIML) is another estimation method for SEMs. In this section, we will discuss optimal designs based on the FIML estimators and will investigate the relationship between the optimal designs.
Durbin [33] proposed a transformation for the maximum likelihood equations that simplified the computations and made it easier to study the properties of FIML estimators and their advantages over 3SLS estimators. However, even with the special notations, the FIML information matrix is sophisticated and takes much more work to obtain than the 3SLS information matrix. Based on Durbin's [33] results, our objective is to obtain a design matrix "X" such that max X |M FIML | � max X |Q ′ GQ|. e derivation and the special notation are noted in Appendix A. e reason that both D-optimal designs are considered is because the 3SLS information matrix is easy to obtain, but FIML estimators have advantages over 3SLS estimators as discussed by Durbin [33].
As with the 3SLS D-optimal design, the objective would be to obtain the design matrix X that will minimize the estimate of the determinant of the covariance of the FIML estimates or maximize the determinant of the FIML information matrix estimate. e performance of the D-optimal design for a causal structure will then be compared to the univariate optimal design by computing |Var ∧ (δ ∧ * FIML )| through simulation. e design which consistently has the largest determinants of the information matrix would be the better design. In the following example, the objective will be to use the estimate of the information matrix to obtain the FIML optimal design.
For the causal structure in Example 1, the 3SLS D-optimal design is also the D-optimal design for the FIML estimates. Similar to 3SLS, the true parameters are unknown, so the FIML estimates are used to replace the true parameters for the three simulations. In practice, the information matrix is estimated based on the parameters estimates. is is similar to the approach of Goos and Vandebroek [14,15] and Goos and Jones [13] in the case of optimal design for  #  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  x 1  0  1  1  0  1  1  0  1  1  0  1  1  0  1  1  0  1  1  0  1  x 2  1  0  1  1  0  1  1  0  1  1  0  1  1  0  1  1  0  1 1 0   Journal of Probability and Statistics random blocks or split-plots where they suggest using the estimates of the variance parameters or a reasonable guess, arguing that the design minimally depends on those values. e results obtained for a D-optimal design for a causal structure in this research are consistent with those of Goos and Vandebroek [14,15] and Goos and Jones [13] where the optimal design did not change when replacing the true parameters with their estimates. e results shown in Table 5 show the efficiency of the Doptimal design from Table 1 as compared to the efficiency of the univariate optimal design from Table 2. e results for the FIML causal structure D-optimal design are similar to the results for the 3SLS causal structure D-optimal design. e optimal design for the univariate case was about 6% less D-efficient than the FIML design for the causal structure. e loss of efficiency demonstrates the importance of taking the endogenous parameters into account to obtain an optimal design to study the causal structure.
For both the 3SLS and FIML D-optimal designs, the goal of the three simulations is to show that the D-optimal design did not change even though the estimate of the parameters deviated from the true value. However, it may be problematic to base a conclusion on three simulations only; therefore, 300 data sets were simulated and then the parameters were estimated. ese parameters were used to obtain a D-optimal design based on the estimate of the parameters. In each of the 300 simulations, the D-optimal design did not change and was equivalent to the D-optimal design in Table 1.
Another way to demonstrate the robustness of the design is to incrementally change the value of the parameter(s) from the true value, then obtain a D-optimal design based on the value of the parameters, and finally compare the D-optimal design to that which was obtained based on the true value. For the model in Figure 4, the D-optimal design depends only on c 11 � 8 since W ∧ * includes only y (1) � c 11 x (1) + ε (1) . Table 1 is the D-optimal design for c 11 � 8. Now, we will change the value of c 11 incrementally based on the interval 8 ± 6 � [14,2]. We first start with 8 and then decrease c 11 incrementally by 0.1 (i.e.8, 7.9, 7.8, . . . , 2). en, we obtain the D-optimal design based on these values and check whether it is equivalent to the D-optimal design in Table 1. Similarly, we start with 8 and then increase c 11 incrementally by 0.1 (i. e.8, 8.1, 8.2, . . . , 14). en, again, we obtain the Doptimal design based on these values and check whether it is equivalent to the D-optimal design in Table 1. e D-optimal designs did not change even though the value of the parameters changed up to 75% from the true value. ese results are consistent with Goos and Vandebroek [14,15] and Goos and Jones [13] where they argued that the optimal design depended minimally on the specific parameter values.

D-Optimal Design for a Causal Structure Model with Random Blocks
In applied research, blocked designs are likely the most commonly used experimental designs since they can be used to account for variation attributable to sources other than treatments and can considerably improve the precision of the experiment. e first objective is to establish the blocked causal structure and then to obtain the 3SLS and FIML estimators for the endogenous and exogenous parameters, which will be used to estimate their information matrices. e second objective of this section is to use the algorithm described in Section 1.1 in Example 1 to obtain a D-optimal design for both the endogenous and exogenous parameters. e last objective is to compare the efficiency of the Doptimal design to the classical univariate mixed model optimal design.

3SLS D-Optimal Design for a Causal Structure with Random Blocks.
e model with random blocks is similar to the model for a completely randomized design except that the block effect needs to be added to every endogenous variable. erefore, for the i th endogenous variable as the dependent variable, the model can be written as.
⎞ ⎠ and all of the other terms have been previously defined. If the blocks have the same size, then Z � diag1 k ,1 k ,...,1 k � I b×b ⊗1 k , the elements of u (i) and ε (i) are assumed to be mutually independent and normally distributed with zero mean and variances σ 2 u (i) and σ 2 (i) , respectively.
Let W (i) � Y (i) X (i) and δ (i) � b (i) ′ c (i) ′ ′ . en, the model with random blocks can be rewritten as y (i) � W (i) δ (i) + Zu (i) + ε (i) . Use matrix notation and then the Kronecker product to rewrite the previous model as Using the same approach as described in Section 1.1, the asymptotic information matrix would be Μ 3SLS � Based on the information matrix for the 3SLS estimates for the causal structure D-optimal design with random blocks, our objective would be to use the algorithm described in Section 1.1 in Example 1 to obtain the design X such Since the information matrix depends on the unknown parameters through Σ ∧ and W ∧ * , we face the same predicament as in the case of the completely randomized D-optimal design for a causal structure where the parameters are Journal of Probability and Statistics unknown. e solutions that were proposed for the completely randomized D-optimal design for a causal structure in Section 1.1 are appropriate for the causal structure Doptimal design with random blocks. Other approaches are also possible. For example, in an experiment with random blocks that is conducted in multiple sequential stages, the data of one block (year, lab, day, etc.) could be utilized to obtain initial estimates that are then used to obtain the Doptimal design for the rest of the blocks (other years, other labs, other days, etc.). Requiring preliminary information is a standard part of univariate design. If there is no prior data, then a reasonable guess can be used similar to the approach of Goos and Vandebroek [14,15] and Goos and Jones [13]. In many experiments, multiple factors and limitations of the experimental material require the use of split-plot designs and/or incomplete block designs. Specifically, the way the factors are applied and the nature of the experimental material give rise to different sizes of experimental units and limits on block size. For this reason, an incomplete block design with three factors is used as an example in this section. In the next example, the objective is to use the asymptotic information matrix estimates to obtain a Doptimal design for a causal structure with four random blocks where the size of each block is four. Example 2. .Assume that there are three binary treatment factors (exogenous variables) with two endogenous variables for the causal structure given in Figure 5. Here, the x (i) , i � 1, 2, 3 each with a x kj value, had a 50-50 chance of being a 1 or − 1. Also, assume that the true values for c 11 , c 22 , c 31 , and b 12 are 8, 2, 3, and 5, respectively Figure 5: A D-optimal design for a causal structure with four random blocks with block size equal to four under the given assumptions can be found in Table 6.
e expected determinant of the information matrix is 5,222,400 and the previous design is not orthogonal. e expected determinant of the information matrix of the orthogonal design in Table 7 is 4,784,128. e orthogonal design is the optimal design for the univariate mixed model with three binary treatment factors and four blocks each block with size four.
One concern for both D-optimal and orthogonal designs is that they both require an assumed model structure. One popular alternative is the uniform design, which is "a space filling design" ( [39], p. 131) that uniformly distributes experimental points across the design domain without model pre-specification [39]. erefore, we will compare the information of the uniform design to the orthogonal design and the new D-optimal design. Using a uniform design table [39,40], a uniform design for 16 runs with 3 factors and an experimental cube of [− 1, 1] 3 is shown in Tables 7 and 8. e expected determinant of the information matrix for the uniform design is 28,612, which is significantly smaller than the determinants of the D-optimal and the orthogonal design information matrices.
e results show that the new D-optimal design was approximately 2.2% more D-efficient than the orthogonal design, whereas the D-optimal design was 72.8% more D-efficient than the uniform design. e D-optimal design performed significantly better than the uniform design but is still comparable to the orthogonal design. Although orthogonal designs are desirable because they are universally optimal for first-order linear models and are also robust for nonlinear models, they also come with concerns in that they sometimes do not exist and require a large sample size [25]. It is also important to keep in mind that in Example 2, there is only one endogenous parameter. e significance in the increase in efficiency of the determinant may not be easily identified through our simple example. In a more complex design, there will be more endogenous variables and if those variables are interconnected in a more sophisticated way, then the increase of the determinant of the information matrix would be amplified.
It is worth noting the differences between the 3SLS causal structure D-optimal design and the orthogonal design. First, the orthogonal design completely confounds the three-way interaction (x i1 x j2 x k3 ) with the blocks where x i1 is the i th element of x (1) , x j2 is the j th element of x (2) , and x k3 is the k th element of x (3) . e number of replicates for each candidate point is 2. However, this is not the case in the 3SLS causal structure D-optimal design where the two-way interaction of (x j2 x k3 ) is confounded with Blocks 1 and 3, the three-way interaction of (x i1 x j2 x k3 ) is confounded with Block 4, and the two-way interaction of (x i1 x k3 ) is confounded with Block 2. In the 3SLS causal structure D-optimal design, the number of candidate points is not replicated equally. For example, the candidate point (− 1, − 1, − 1) was replicated three times, but the candidate point (1, − 1, − 1) was replicated only once.
To understand the differences between the designs, it is important to contrast the asymptotic information matrix of the 3SLS D-optimal design for a causal structure with that of Table 5: Parameter values, estimates, determinants of information matrices, their ratios, and D-efficiency based on a FIML D-optimal design and a univariate optimal design for three simulations. the orthogonal design. In this way, we can see where information is gained and parameter estimates are improved for the former design and where information is sacrificed and parameter estimates worsen for the latter design. Tables 9 and 10 show the asymptotic information matrices for the model in Figure 5 for the orthogonal design and the 3SLS D-optimal design, respectively Table 9. When the information of the estimates is compared, we can see that the exogenous parameters in both models have the same information except for the elements. c 11 , c 31 . However, the endogenous parameters in the 3SLS D-optimal design gained more information with an increase of about 16% over the orthogonal design. ese results are expected because univariate optimal designs do not take the precision of the endogenous parameters into account, whereas D-optimal designs for causal structures do. erefore, our new approach has the advantage of giving the optimal combination of treatments for the entire model and taking into account the precision of both the exogenous and endogenous parameters.

FIML D-Optimal Design for a Causal Structure with
Random Blocks. As in the completely randomized D-optimal design for a causal structure, another important class of estimators is the maximum likelihood estimators. We will obtain the FIML estimators for a causal structure with random blocks and then obtain the covariance matrix estimates for the estimators, or the information matrix estimates. e information matrix estimates will then be used to obtain a D-optimal design for the FIML estimators.
Starting with the notation from Section 2.1, y * − W * δ * � and set the partial derivative equal to 0, which gives en, find the second partial derivative and take its expected value to obtain the information matrix, (z 2 L/z(δ * ) However, in general, we cannot find the expected value explicitly until the model is known since W * includes fixed and random variables. But, for Example 2 above the information matrix is simplified to [30].

Journal of Probability and Statistics
Based on the information matrix above, a FIML Doptimal design for Example 2 is equivalent to the design in Table 6. e determinant of the information matrix is 5,280,295.4. Neither the orthogonal design nor the uniform design was as efficient as the new D-optimal design. e determinants of the information matrices were 4,845,883.1 and 36,622 for the orthogonal design and uniform design, respectively. us, the D-optimal design was approximately 2.1% more D-efficient than the orthogonal design and 71.1% more D-efficient than the uniform design. ese results are similar to the D-optimal design for Section 1.2 where both the 3SLS and the FIML estimates had the same D-optimal design.

Discussion
Optimal design theory and its applications are generally limited to univariate designs. However, most natural phenomena include multiple endogenous variables which are related in complicated ways.
ere are two weaknesses concerning univariate designs. First, univariate designs are incapable of addressing more than one endogenous variable that could be affected by different exogenous variables. e second weakness is that univariate designs ignore the endogenous parameters. Our new approach overcomes both weaknesses.
e D-optimality criteria were developed for both the 3SLS and FIML estimates for a causal structure and for a causal structure with random blocks. In addition, our approach is broadly applicable since it generalizes to any linear causal model with both fixed and random effects, and with any type of nesting and factorial structure.
Our results show that the D-optimal designs were the same for both FIML and 3SLS estimates. e reader may wonder why they should care about 3SLS D-optimal design since FIML estimators have traditionally been used in structural equation modeling since 3SLS estimators are not based on optimization. However, 3SLS information matrices are mathematically easier to obtain than FIML information matrices. e interest here is in optimal design, not estimation. Since there was no difference between the FIML and 3SLS D-optimal designs for both examples and, in general, since the FIML and 3SLS information matrices are asymptotically equivalent [33], we     use the 3SLS information matrix both for its simplicity in obtaining a D-optimal design and to ease hesitancy for scientists to utilize the more complicated FIML information matrix.
In the example of the completely randomized causal structure, the new D-optimal design increased the information matrix determinant by at least 17% over the univariate optimal design. Because of this increase, the univariate optimal design was only 94% as D-efficient as the new D-optimal design. e information matrix of a causal structure depends on parameter values that are unknown.
is raises the concern of whether or not the D-optimal design is robust. Based on the simulation results from Example 1, the D-optimal design is robust because it did not change when the parameter values changed. e fact that the information matrix of a causal structure depends on unknown parameter values is not unique to causal structure D-optimal designs. e same issue arises for D-optimal designs for random effects, or the covariance parameters, in a split-plot or a blocked design as well where the values of the parameters are needed to obtain the Doptimal design. e results in our research are consistent with those of Goos and Vandebroek [14,15] and Goos and Jones [13] where the optimal design did not change when replacing the true parameters for an estimate.
Similarly, for a causal structure model with random blocks, the endogenous parameters had a significant impact on the D-optimal design. In the univariate mixed model, orthogonal designs are universally optimal [14,41]. However, it is not optimal for a causal structure model with random blocks. As shown in Example 2, the new D-optimal design increased the determinant of the information matrix by at least 9%, meaning that the orthogonal design was less efficient than the new D-optimal design. e orthogonal design was consistently about 97.8% and 97.9% as D-efficient as the new 3SLS and FIML D-optimal designs, respectively.
While it may be a disadvantage that D-optimal designs require an assumed model structure, the results show that the uniform design, which focuses on distributing the design points uniformly over the experimental domain, is not optimal. e uniform design was about 27.2% as D-efficient as the new 3SLS D-optimal design and 28.9% as D-efficient as the new FIML D-optimal design. In both cases, the uniform designs consistently lost more than 70% of the information in contrast to the new D-optimal design. Such a large loss of information will result in inflating the variance and has serious consequences on inferential statistics and the power of the statistical tests. e designs that were produced using the new criteria show that endogenous parameters have an important effect on D-optimal designs. e results further show the importance of taking advantage of prior knowledge of the interrelationships of endogenous and exogenous parameters to develop a D-optimal design using the algorithms and methodology developed here. An application for the use of our algorithms could be in research similar to the heart study by Mi and colleagues [42] who examined the complex interrelationships among physiological, genetic, and environmental factors and their interaction effects on coronary heart disease (CHD), as shown in Figure 6. e data in that study were collected and analysis was performed based on the data alone; there was no pre-planned design. To obtain the most precise estimates to describe the system, causal structure D-optimal design can be used to determine replication of the levels of each factor such as how many men and women to include in the study, how many of the subjects should be smokers, how many should be obese, or how to determine an optimal sample of volunteers based on budgetary constraints. Another practical use of our algorithms and methodology comes from the field of agriculture through research conducted by Campbell and colleagues [43], and Dhungana and colleagues [44] where the objective was to understand how gene-environment interactions (GEI) influence interrelated and complex traits such as grain yield. is subject of future work would help to answer questions like which genotypes should be chosen for the study and what is the combination of those genotypes to obtain optimal estimates. As a multi-stage experiment which took place over a period of three years, our methods could have been implemented to use the preliminary results from the first year of the study to answer those questions if the model was pre-determined.

Limitations.
e fact that the algorithm requires the model to be predetermined is a challenge because in most natural phenomena the interrelationships in a system could be complex. In univariate designs, the challenge is reduced because there is only one endogenous variable that is directly affected by an exogenous variable. However, in structural equation modeling, not all exogenous variables affect endogenous variables directly and the interrelationship among the endogenous variables needs to be predetermined, as well. An additional challenge to obtain a D-optimal design through our algorithm is that some of the estimates of the endogenous parameters are required before an experiment is conducted, which is unknown prior to the experiment. Addressing these limitations is also the subject of future work by using a prior distribution and a Bayesian approach to allow for the uncertainty of the unknown endogenous and exogenous parameters. Finally, how our technique works with other designs such as A-optimal, C-optimal, or Eoptimal designs is yet to be explored.

A. Derivation of the FIML Information Matrix
For the structural equation model in the form YB + XΓ + E � 0 where the rows of E are independently and normally distributed with vector mean zero and a p × p variance matrix Σ, the log likelihood function can be given by log L � Constant + n log|B| + (n/2)log|Σ − 1 | − (1/2)tr [(YB + XΓ) ′ (YB + XΓ)Σ − 1 ]. First, to obtain the FIML estimators, we find the partial derivatives and set them to zero as follows: Because of the invariance properties of the maximum likelihood (ML) estimators [45], it would simplify the derivative if we derive the likelihood with respect to Σ − 1 instead of Σ. So, Per Kmail [30], algebraic manipulation of equations (A .1) and (A.3) gives en, by rewriting the previous equation, Finally, which implies that, (A.13) e advantage of this notation is that the elements of δ are unknown, and it is an unrestricted set of equations. e restrictions on B and Γ have been resolved by the notation. However, there is a problem in solving for δ since G is unknown. In addition, Q (1) − y * ) � 0. But since dQ ′ G (1) and Q (1) dG are small in comparison to Z (1) G (1) , we ignore the third term and conclude that dδ � (Q (1) G (1) Q) − 1 Q (1) G (1) (y * − Qδ (1) ). erefore, Repeating the previous procedure, we can generalize the results for the (r + 1) estimates where δ ∧ * (r+1) � (Q r G r Q) − 1 Q r G r y * . Kmail [30] shows the asymptotic variance matrix of the estimates is the Var(δ

Data Availability
Data sharing is not applicable to this article as no data sets were generated or analyzed during the current study.