Journal of Applied Mathematics, Hindawi Publishing Corporation, Volume 2014, Article ID 746914, doi:10.1155/2014/746914. ISSN 1110-757X (print), 1687-0042 (online).

Research Article

A Simulated Annealing Algorithm for D-Optimal Design for 2-Way and 3-Way Polynomial Regression with Correlated Observations

Chang Li (Business School, Shandong University of Political Science and Law, 63 East Jiefang Road, Jinan, Shandong 250014, China) and Daniel C. Coster (Department of Mathematics and Statistics, Utah State University, Logan, UT 84341, USA)

Academic Editor: Weili Li

Received 10 November 2013; Accepted 1 March 2014; Published 26 March 2014

Copyright © 2014 Chang Li and Daniel C. Coster. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Much of the previous work on D-optimal design for regression models with correlated errors focused on polynomial models with a single predictor variable, in large part because of the intractability of an analytic solution. In this paper, we present a modified, improved simulated annealing algorithm, providing practical approaches to the specification of the annealing cooling parameters, thresholds, and search neighborhoods for the perturbation scheme, which finds approximate D-optimal designs for 2-way and 3-way polynomial regression for a variety of specific correlation structures with a given correlation coefficient. Results in each correlated-errors case are compared with the traditional simulated annealing algorithm, that is, the SA algorithm without our improvements. Our improved simulated annealing results generally had higher D-efficiency than the traditional simulated annealing algorithm, especially when the correlation parameter was well away from 0.

1. Introduction

D-optimality is a popular criterion for optimal experimental design. The model for polynomial regression can be written as in Zhu et al.:

(1) yi = fi(x)β + ϵi,

where i = 1, …, n, β is a k-vector of parameters, fi(x) = (f1i(x), f2i(x), …, fki(x)) is a k-vector of polynomial functions of x, and n is the number of observations. Our purpose is to estimate the coefficient vector β, or the part of β that is of primary interest.

In some experimental settings, the observations may be correlated according to various structures or patterns. The motivation for research on optimal designs with correlated observations can be found in Dette et al. Muller introduced optimal design with correlated observations in detail.

The simulated annealing algorithm is a probabilistic “hill climbing” algorithm for optimization in the absence of an analytical solution. The application of the simulated annealing algorithm to the optimal design problem was first proposed by Haines. Lejeune proposed a simulated annealing algorithm for D-optimal design with uncorrelated observations. A simulated annealing algorithm with a reheating process is introduced in Dimitris and Omid and in Abdullah et al. In Zhu, Zhu solved the 1-way D-optimal design problem for polynomial regression with correlated observations using a simulated annealing algorithm. Cheng produced D-optimal designs with block effects, which can be considered a special case of the D-optimal design problem with correlated observations, since the block effects can be incorporated into the correlation structure.

Most previous work considered only the simplest case, that is, optimal design for 1-way polynomial regression. However, in real-world problems, the response variable is usually influenced by multiple effects and their interactions. This kind of problem is more complicated, and satisfactory results cannot be obtained by existing algorithms or their generalizations.

In this paper, we propose a modified, improved simulated annealing algorithm to approximately solve for D-optimal designs for 2-way and 3-way polynomial regression with correlated observations. This algorithm is applicable to any number of observations, not necessarily a multiple of the dimension of the parameter vector. It overcomes a shortcoming of previous work, which mainly concentrated on the case in which n (the number of observations) is a multiple of k (the number of coefficients to be estimated or, equivalently, the dimension of the parameter vector β). We also provide a reinforced version of our simulated annealing algorithm with a reheating process.

2. Model and Correlation Structures 2.1. Model

The full model for second-order 2-way and 3-way polynomial regression is presented by Boon and Pukelsheim. The model for second-order 2-way polynomial regression is

(2) yi = β0 + β1x1i + β2x2i + β3x1i² + β4x2i² + β5x1ix2i + ϵi,

where i = 1, 2, …, n, each of x1i and x2i is in [−1, 1], and the errors ϵi have mean 0 and variance σ² but are not necessarily independent.

The design matrix is X=(xij)n×6. The first column is all 1’s, and the other 5 columns correspond to the values of X1,X2,X12,X22,X1X2, respectively. That is, each column of X corresponds to one design variable (or their square or interaction effect) in the model.

The model for second-order 3-way polynomial regression is

(3) yi = β0 + β1x1i + β2x2i + β3x3i + β4x1i² + β5x2i² + β6x3i² + β7x1ix2i + β8x2ix3i + β9x1ix3i + ϵi,

where i = 1, 2, …, n.

The design matrix is X = (xij)n×10, and its definition is similar to that for 2-way polynomial regression.

D-optimality aims to maximize the determinant of the information matrix, where the information matrix for these models is

(4) M = X′V⁻¹X,

where

(5) V = cov(Y) = σ²(ρij)n×n

is the variance-covariance matrix of the errors. Some common correlation structures for V are introduced below.
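As a small numerical illustration (not part of the original paper), the D-criterion in (4)–(5) can be evaluated directly; the function name and the use of NumPy are our own choices:

```python
import numpy as np

def d_criterion(X, V):
    """Determinant of the information matrix M = X' V^{-1} X
    for a design matrix X and error covariance matrix V."""
    Vinv = np.linalg.inv(V)
    M = X.T @ Vinv @ X
    return float(np.linalg.det(M))
```

With independent errors (V = I), this reduces to det(X′X), the usual uncorrelated D-criterion.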

2.2. Correlation Structures

We define commonly used correlation structures below for a single correlation parameter ρ.

Circulant correlation: see Zhu et al.:

(6) cov(yi, yj) = σ²  if i = j,
                  ρσ² if |i − j| = 1 or |i − j| = n − 1,
                  0   otherwise.

Nearest neighbor correlation: see Zhu:

(7) cov(yi, yj) = σ²  if i = j,
                  ρσ² if |i − j| = 1,
                  0   otherwise.

Autoregressive correlation: see Dette et al.:

(8) cov(yi, yj) = σ²ρ^|i−j|, where i, j = 1, 2, …, n.
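The three single-parameter structures above can be sketched as covariance-matrix builders (an illustrative sketch; the function names are ours):

```python
import numpy as np

def circulant_cov(n, rho, sigma2=1.0):
    # sigma^2 on the diagonal; rho*sigma^2 when |i-j| = 1 or n-1 (structure (6))
    V = sigma2 * np.eye(n)
    for i in range(n):
        for j in range(n):
            if abs(i - j) == 1 or abs(i - j) == n - 1:
                V[i, j] = rho * sigma2
    return V

def nearest_neighbor_cov(n, rho, sigma2=1.0):
    # sigma^2 on the diagonal; rho*sigma^2 only when |i-j| = 1 (structure (7))
    V = sigma2 * np.eye(n)
    for i in range(n - 1):
        V[i, i + 1] = V[i + 1, i] = rho * sigma2
    return V

def autoregressive_cov(n, rho, sigma2=1.0):
    # sigma^2 * rho^{|i-j|} for all i, j (structure (8))
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])
```

Any of these can be passed as V when evaluating det(X′V⁻¹X).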

Completely symmetric block structure: see Cadima et al.:

(9)
( R    R12  ⋯  R1b )
( R21  R    ⋯  R2b )
( ⋮    ⋮    ⋱  ⋮   )
( Rb1  Rb2  ⋯  R   )

Here R is a k×k matrix with the elements on the main diagonal equal to 1 and all other elements equal to ρ (k is the common block size); ρ is the correlation coefficient for observations in the same block. Rij is a k×k block with all elements equal to ρij. In this paper we take all of the ρij equal to the same coefficient ρ.

Note that one commonly used block correlation structure is proposed by Atkins and Cheng:

(10) cov(Y) = σ²(I_b ⊗ V),

with V = (1 − ρ)I_k + ρJ_k. Here J_k is the k×k matrix with all elements equal to 1. This is a special case of the completely symmetric block structure with Rij = 0.
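The Atkins-Cheng structure in (10) is a Kronecker product, which can be sketched as follows (the function name is ours; `np.kron` computes I_b ⊗ V):

```python
import numpy as np

def block_cov(b, k, rho, sigma2=1.0):
    """cov(Y) = sigma^2 (I_b kron V) with V = (1-rho) I_k + rho J_k:
    b independent blocks of size k, compound symmetry within each block."""
    V = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
    return sigma2 * np.kron(np.eye(b), V)
```

Observations in the same block have correlation ρ; observations in different blocks are uncorrelated.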

3. Improved Simulated Annealing Algorithm for 2-way and 3-way Second-Order Polynomial Regression with Correlated Observations 3.1. The Principle of Simulated Annealing

The simulated annealing (SA) algorithm belongs to a class of heuristic probabilistic hill-climbing algorithms; see Zhu  and Lejeune . The SA algorithm attempts to globally maximize an energy function E(X) for X in a specified state space (a design region for our D-optimality problem), by moving about the state space according to a transition mechanism defined by random perturbations of the current solution, Xc, to a new candidate solution, Xn. Letting dE=E(Xn)-E(Xc), if dE>0, accept Xn as the current solution. Otherwise, accept Xn as the current solution with probability exp(dE/Tc), where Tc is the current value of a temperature control parameter, T. Thus, there is positive probability that the algorithm will move to a poorer design, which is the key feature of the SA search algorithm, as it provides for the possibility that the algorithm will escape a local maximum. As the algorithm proceeds, the temperature decreases, making it less likely that designs with lower energy will be accepted. Convergence of the SA algorithm to a highly efficient design (a globally optimal solution is never guaranteed to be found) depends on the convergence to a stationary distribution of the underlying Markov chain, which typically requires a large number of iterations as well as a suitably chosen transition scheme over the state space.
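The acceptance rule described above can be written compactly (an illustrative sketch; the helper name and the injectable random draw are our own choices):

```python
import math
import random

def accept(dE, T, rng=random.random):
    """SA acceptance rule: always accept an improvement (dE > 0);
    otherwise accept a worse design with probability exp(dE/T),
    where dE <= 0 and T is the current temperature."""
    if dE > 0:
        return True
    return rng() < math.exp(dE / T)
```

At high T, exp(dE/T) is close to 1 even for clearly worse designs, so the chain moves freely; as T falls, downhill moves become rare.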

3.2. Simulated Annealing Algorithm for D-Optimal Design for 2-Way and 3-Way Polynomial Regression

For 2-way polynomial regression, the n×6 design matrix is fully determined by the values of X1 and X2, each in [-1,1]. Therefore, at each iteration of our simulated annealing algorithm, a new design matrix is obtained by perturbing the current values of X1 and X2. We denote the current values of X1 and X2 by X1c and X2c and new values by X1n and X2n, respectively.

In many applications of simulated annealing, the values of only one current design point are perturbed (by some random mechanism) at each iteration, and typically a systematic pass is made through all design points in this manner, and the process repeated until “convergence” is achieved according to a specified stopping condition. Alternatively, all design points are perturbed simultaneously. However, both of these traditional methods were found to be inefficient for our D-optimal design with correlated errors. Thus, we used a modification that improved convergence and solution quality. Our modification was to divide the design points into three parts, of equal or nearly equal size, and perturb all points in each part in an “inner” loop, while systematically doing this for each of the three parts. This represented a middle ground for the perturbation scheme between the two traditional perturbation methods, one at each extreme, as described above.

Our modified simulated annealing algorithm was as follows.

Step 1.

Initialize the starting temperature T0, finishing temperature Tf, temperature reduction coefficient r, perturbation neighborhood control parameter g0, and initial design matrix X0. The control parameter gc is chosen from [0,1] and is used to adjust the size of the perturbations as the algorithm proceeds. Calculate the energy function of the current design: E(Xc) = det(X′V⁻¹X).

Divide the n design points (rows of X) into three parts. If n = 3k, for some positive integer k, then each part has n/3 design points. If n = 3k + 1, the first two parts have k design points and the third part has k + 1. Similarly, if n = 3k + 2, the first part has k points and the other two have k + 1 design points.
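The three-part split in Step 1 can be sketched as (illustrative; the function name is ours):

```python
def three_parts(n):
    """Split indices 0..n-1 into three parts of (nearly) equal size,
    with the smaller parts first, as described in Step 1."""
    k, rem = divmod(n, 3)
    # the last `rem` parts get one extra point
    sizes = [k + (1 if i >= 3 - rem else 0) for i in range(3)]
    parts, start = [], 0
    for s in sizes:
        parts.append(list(range(start, start + s)))
        start += s
    return parts
```

For example, n = 10 gives parts of sizes 3, 3, and 4, matching the n = 3k + 1 case.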

Step 2.

Outer loop: cycle through each of the 3 parts of X systematically, repeating the following inner loop.

Inner loop:

let Z1 and Z2 be n×1 vectors with each element of Zi(i=1,2) sampled at random from [-1,1] for those design points belonging to the current part of X under consideration; all remaining elements of Zi are set equal to 0;

generate new candidate design points X1n=X1c+gcZ1 and X2n=X2c+gcZ2; if any element of X1n or X2n falls outside [-1,1], set the value to the closest boundary value of the design region;

determine E(Xn);

if dE = E(Xn) − E(Xc) > 0, accept the new design by setting Xc = Xn; otherwise, compare exp(dE/Tc) with a random number chosen uniformly from [0,1] multiplied by the coefficient 1.01^c; if exp(dE/Tc) is greater than this number, set Xc = Xn; if not, keep Xc unchanged.

Step 3.

If Tc < Tf, stop. Otherwise, increase the counter c to c + 1, set Tc = rTc−1 and gc = rgc−1, and return to Step 2.

Reduction Control Parameter r. This tuning parameter is chosen by the user but is often set to about 0.98-0.99 for a geometric rate of reduction in the temperature.

Perturbation Control Parameter g. Typically, g0 is set close to 1, allowing large perturbations of the design points at early iterations. As solution quality improves and the temperature decreases, gc also decreases, localizing perturbations to a smaller neighborhood of the current design, which is more likely to be close to a global optimum when the iteration counter c is large.
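Steps 1-3 can be condensed into the following sketch (illustrative only: the energy callback, parameter defaults, and random perturbation details are our own assumptions, and the paper's algorithm additionally repeats the inner loop until improvements fall below a threshold):

```python
import numpy as np

def improved_sa(n, energy, T0=1.0, Tf=1e-2, r=0.98, g0=0.9, seed=0):
    """Sketch of the modified SA: part-by-part perturbation of the three
    parts of the design, shrinking neighborhood g_c = r*g_{c-1}, and an
    acceptance threshold scaled by 1.01^c. `energy(X)` should return the
    D-criterion for the n x 2 array X holding the columns X1 and X2."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n, 2))          # Step 1: initial design
    k, rem = divmod(n, 3)
    sizes = [k + (1 if i >= 3 - rem else 0) for i in range(3)]
    bounds = np.cumsum([0] + sizes)              # part boundaries
    T, g, c = T0, g0, 0
    E = energy(X)
    while T > Tf:
        for p in range(3):                       # Step 2: loop over the 3 parts
            lo, hi = bounds[p], bounds[p + 1]
            Z = np.zeros_like(X)                 # perturb only the current part
            Z[lo:hi] = rng.uniform(-1, 1, size=(hi - lo, 2))
            Xn = np.clip(X + g * Z, -1.0, 1.0)   # snap back into [-1,1]
            En = energy(Xn)
            dE = En - E
            if dE > 0 or np.exp(dE / T) > rng.uniform() * 1.01**c:
                X, E = Xn, En
        T, g, c = r * T, r * g, c + 1            # Step 3: cool and shrink
    return X, E
```

A caller supplies the energy, for example det of the 2-way second-order information matrix built from X1 and X2.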

Reheating. The annealing algorithm can be reinforced by using “reheating." Specifically, after the usual stopping condition based on the temperature is reached in Step 3, the process is repeated, often several times, by reheating to the original starting temperature and continuing at Step 2. Since the reheating process is much more time consuming than the nonreheating process, we have to balance computing time against the gain in solution quality. In this paper, we mainly run the improved simulated annealing algorithm without the reheating process. In Table 6, we present results of the algorithm for n=12 and three correlation structures without and with reheating.
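The reheating variant can be sketched as a wrapper around a single annealing run (the `run_once` interface here is hypothetical, not the authors' code: it takes a starting design, or None for a fresh random start, and returns the final design and its energy):

```python
def sa_with_reheating(run_once, n_reheats=3):
    """Reheating wrapper: rerun the full annealing schedule from the
    starting temperature, each time continuing from the best design
    found so far, and keep the best result overall."""
    best_X, best_E = run_once(None)       # initial cold-to-hot run
    for _ in range(n_reheats):
        X, E = run_once(best_X)           # reheat and anneal again
        if E > best_E:
            best_X, best_E = X, E
    return best_X, best_E
```

Each reheat multiplies the run time, which is the trade-off the paper notes.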

For 3-way polynomial regression, the only difference in the algorithm is in the inner loop: we perform all of the operations on 3 vectors, X1, X2, and X3.

3.3. Improvements from This Algorithm Compared with the Traditional Simulated Annealing Algorithm

(1) There are 2 vectors, X1c and X2c, to be changed (for 3-way polynomial regression, there are 3 vectors to be changed). In this case, the traditional simulated annealing algorithm, which treats the perturbation vector Z as a whole, does not produce satisfactory results. In our modified algorithm, we divide Z (and consequently the perturbation process) into 3 parts and make perturbations part by part. This method ensures that we do not miss any corner of the design region and is more precise than the usual annealing method. Additionally, this part-by-part perturbation scheme allows the number of observations to be any number, not necessarily a multiple of the number of coefficients. This makes our algorithm more flexible, since it can be applied to experiments with any number of observations.

(2) We shrink the search neighborhood and increase the threshold for accepting a perturbation each time we lower the temperature. That is, when the temperature is high, we search in a wide neighborhood and are more likely to jump out of a local optimum. Each time we lower the temperature, we make the perturbation neighborhood smaller and the acceptance threshold higher, so it becomes harder to leave a local optimum. We implement this by multiplying the scale parameter g by the reduction coefficient r, and multiplying the random number to be compared with exp(dE/T) by the coefficient 1.01^c, each time we decrease the temperature. Here c is initially 0 and increases by 1 each time we decrease the temperature.

This approach is in accordance with the idea of simulated annealing; that is, when the temperature becomes lower, the “molecules” are less active and tend to an equilibrium stabilization. This modification resulted in improved relative efficiency of the final design.

(3) In each part of the loop, we repeat the iterations until the improvement is less than a small threshold value several times in succession. This guarantees that we proceed to the next step only when the remaining improvement in the current step is negligible; in other words, we do not miss any valuable improvement. We take 0.02 times the determinant of the current information matrix as the threshold value.

4. Results and Comparison with Traditional Simulated Annealing Algorithm

In this section, we compare the results from our improved simulated annealing algorithm with those of the traditional simulated annealing algorithm, that is, the SA algorithm without our improvements. Since the most often used correlation parameters are 0.1 and 0.4, in Tables 1, 2, 3, 4, 5, 6, and 7 we mainly use these 2 parameters in the computation and comparison. Results for other correlation parameters can be obtained by the same algorithm with a simple adjustment of the parameters.

Table 1: 2-way polynomial regression with autoregressive correlation.

n    ρ     Traditional annealing determinant    Improved annealing determinant    Ratio
6    0.1   231.4                                281.2                             1.2152
6    0.4   578.2                                751.8                             1.3002
12   0.1   13582                                17769                             1.3083
12   0.4   31721                                45108                             1.4220
18   0.1   195260                               272620                            1.3962
18   0.4   416720                               889690                            2.1350

Table 2: 2-way polynomial regression with circulant correlation.

n    ρ     Traditional annealing determinant    Improved annealing determinant    Ratio
6    0.1   218.4                                279                               1.2775
6    0.4   712.5                                1047                              1.4695
12   0.1   12492                                17815                             1.4261
12   0.4   43962                                65894                             1.4989
18   0.1   143620                               206010                            1.4344
18   0.4   623820                               1091400                           1.7495

Table 3: 2-way polynomial regression with nearest neighbor correlation.

n    ρ     Traditional annealing determinant    Improved annealing determinant    Ratio
6    0.1   212.4                                279.1                             1.3140
6    0.4   534.2                                742.5                             1.3899
12   0.1   23982                                32901                             1.3719
12   0.4   45842                                74276                             1.6203
18   0.1   136922                               206010                            1.5046
18   0.4   639520                               1175800                           1.8386

Table 4: 2-way polynomial regression with block correlation.

n    ρ     Traditional annealing determinant    Improved annealing determinant    Ratio
12   0.1   18204                                25088                             1.3782
12   0.4   22912                                39870                             1.7401

Table 5: Circulant correlation structure with various ρ and n.

ρ \ n    7        8        9        10       11
0.1      517.3    2523.2   4417.6   6738.3   16975
0.2      1261.1   3666.9   7672.5   16211    21788
0.3      1958.1   5540     16406    31880    42016
0.4      4046.7   13514    52982    61529    64205

Table 6: 2-way polynomial regression with n=12; comparison of reheated simulated annealing with nonreheated simulated annealing.

Correlation structure    ρ     Nonreheated determinant    Reheated determinant    Ratio
Nearest neighbor         0.4   74276                      97284                   1.3098
Circulant                0.4   65894                      87291                   1.3247
Autoregressive           0.4   45234                      68548                   1.5154

Table 7: 3-way polynomial regression with n=10.

Correlation structure    ρ     Traditional annealing determinant    Improved annealing determinant    Ratio
Nearest neighbor         0.1   1.8342 × 10^6                        2.4030 × 10^6                     1.3101
Nearest neighbor         0.4   1.1529 × 10^7                        2.1257 × 10^7                     1.8438
Circulant                0.1   1.6215 × 10^6                        2.2202 × 10^6                     1.3692
Circulant                0.4   1.4284 × 10^7                        2.3343 × 10^7                     1.6342
Autoregressive           0.1   1.8128 × 10^6                        2.3423 × 10^6                     1.2921
Autoregressive           0.4   6.2563 × 10^6                        1.0851 × 10^7                     1.7342

In Tables 1 through 4, we present comparisons of our improved simulated annealing results with those of the traditional simulated annealing algorithm when the number of observations n is a multiple of 6, using each of the autoregressive, circulant, nearest neighbor, and block correlation structures. We also report the ratio of the results of the two algorithms, which shows how much better our improved simulated annealing algorithm is than the traditional one.

Tables 1-3 present the comparison of the results of the two SA algorithms for the autoregressive, circulant, and nearest neighbor structures for designs of sizes 6, 12, and 18 and correlation parameters of 0.1 and 0.4, and Table 4 presents the comparison of the results of the two SA algorithms for the block structure for designs of size 12 and correlation parameters of 0.1 and 0.4.

From these tables, we see that in all cases the determinants obtained by our improved simulated annealing algorithm are much higher than those of the traditional simulated annealing algorithm. When ρ changes from 0.1 to 0.4, and when n (the number of observations) gets larger, the ratio of the determinants of our improved simulated annealing algorithm to those of the traditional simulated annealing algorithm increases rapidly. So the D-efficiency of our improved simulated annealing algorithm is much better than that of the traditional simulated annealing algorithm, especially as ρ and n get larger.

For the case in which the number of observations n is not a multiple of the dimension of the parameter vector, the traditional simulated annealing algorithm does not work. However, our improved simulated annealing algorithm can compute the determinant for any value of n. We list the results of our improved simulated annealing algorithm for various ρ and n for the circulant correlation structure in Table 5.

Table 6 provides the comparison of the results of our algorithm without and with the reheating process. From this table, we can see that, with the addition of the reheating process, the results are much better than without it. However, the reheating process is much more time consuming.

Table 7 provides the comparison of our improved simulated annealing results with the traditional simulated annealing algorithm results for 3-way polynomial regression with n=10.

From Table 7, we can see that the determinants obtained by our improved simulated annealing algorithm are much higher than those of the traditional simulated annealing algorithm for all 3 correlation structures for 3-way polynomial regression. When ρ gets larger, the ratio of the determinants of our improved simulated annealing algorithm to those of the traditional simulated annealing algorithm increases rapidly. All of the tables indicate that our improved simulated annealing algorithm is much more powerful than the traditional simulated annealing algorithm.

5. Discussions

This paper demonstrates that an improved simulated annealing algorithm can successfully determine highly efficient D-optimal designs for second-order polynomial regression on [-1,1]² and on [-1,1]³ for a variety of correlated error structures, with the design size n not limited to a multiple of the number of regression parameters. The combination of (i) a “part-by-part” perturbation scheme, (ii) a parameter that controls the size of the neighborhood for the perturbations, and (iii) an increase of the threshold for accepting a perturbation each time we lower the temperature leads to designs that, while not likely globally optimal, are better than those obtained by the traditional simulated annealing algorithm. In particular, when the true correlation parameter is well away from 0, our improved simulated annealing algorithm has much greater relative efficiency than the traditional simulated annealing algorithm.

The SA algorithm needs only a well-defined energy function to maximize, here the determinant of the information matrix. Thus, the same algorithm may be used for other design optimality criteria, for example, A- and E-optimality. In the absence of exact analytic optimal designs when errors are correlated, the SA algorithm is an attractive, easily implemented method for finding highly efficient designs. Extensions to higher-degree polynomial regression models are immediate, except for the likely need for longer run times and slower reduction of the temperature to allow for more effective searching over a larger design region.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

1. Zhu Z., Coster D. C., Beasley L. B., "Properties of a covariance matrix with an application to D-optimal design," Electronic Journal of Linear Algebra, vol. 10, pp. 65-76, 2003.
2. Dette H., Kunert J., Pepelyshev A., "Exact optimal designs for weighted least squares analysis with correlated errors," Statistica Sinica, vol. 18, no. 1, pp. 135-154, 2008.
3. Muller W. G., Collecting Spatial Data: Optimum Design of Experiments for Random Fields, Springer, 2001.
4. Haines L. M., "The application of the annealing algorithm to the construction of exact optimal designs for linear-regression models," Technometrics, vol. 29, no. 4, pp. 439-447, 1987.
5. Lejeune M. A., "Heuristic optimization of experimental designs," European Journal of Operational Research, vol. 147, no. 3, pp. 484-498, 2003.
6. Dimitris B., Omid N., Robust Optimization with Simulated Annealing, Springer Science+Business Media, 2009.
7. Abdullah S., Golafshan L., Nazri M. Z. A., "Re-heat simulated annealing algorithm for rough set attribute reduction," International Journal of Physical Sciences, vol. 6, no. 8, pp. 2083-2089, 2011.
8. Zhu Z., Application of simulated annealing to D-optimal design for polynomial regression with correlated observations [Ph.D. thesis], Department of Mathematics and Statistics, Utah State University, 2004.
9. Cheng C.-S., "Optimal regression designs under random block-effects models," Statistica Sinica, vol. 5, no. 2, pp. 485-497, 1995.
10. Boon J. E., "Generating exact D-optimal designs for polynomial models," Proceedings of the Spring Simulation Multiconference (SpringSim '07), 2007.
11. Pukelsheim F., Optimal Design of Experiments, vol. 50, SIAM, Philadelphia, PA, USA, 2006.
12. Zhu Z., Optimal experimental designs with correlated observations [Ph.D. thesis], Department of Mathematics and Statistics, Utah State University, 2004.
13. Cadima J., Calheiros F. L., Preto I. P., "The eigenstructure of block-structured correlation matrices and its implications for principal component analysis," Journal of Applied Statistics, vol. 37, no. 3-4, pp. 577-589, 2010.
14. Atkins J. E., Cheng C.-S., "Optimal regression designs in the presence of random block effects," Journal of Statistical Planning and Inference, vol. 77, no. 2, pp. 321-335, 1999.