Bayesian Nonparametric Modeling for Rapid Design of Metamaterial Microstructures

We consider the problem of rapid design of massive metamaterial (MTM) microstructures from a statistical point of view. A Bayesian nonparametric model, namely, Gaussian Process (GP) mixture, is developed to generate the mapping relationship from the microstructure’s geometric dimension to the electromagnetic response, which is approximately expressed in a closed form of Drude-Lorentz typemodel.This GPmixturemodel is neatly able to tackle nonstationarity, discontinuities in themapping function. The inference is performed using a Markov chain relying on Gibbs sampling. Experimental results demonstrate that the proposed approach is highly efficient in facilitating rapid design of MTM with accuracy.


Introduction
Metamaterials (MTMs) are artificially engineered materials that exhibit extraordinary electromagnetic (EM) response, such as negative refraction, superresolution imaging, and planar slab lensing [1][2][3], which cannot be found in naturally available materials.The macroscopic properties of MTMs are harnessed by engineering the geometric dimensions of artificially engineered subwavelength MTM particles, which are required to generate the desired magnetic and/or electric responses to externally applied fields [4][5][6][7][8][9].The electromagnetic responses are usually characterized by constitutive parameters like permittivity and permeability.Values of such parameters can be tailored by changing the geometric dimensions of the MTM particles.Recently, a great deal of research has been involved with designs of MTMs with inhomogeneous constitutive parameters, based on which technique of optical transformation can be implemented as a powerful tool to manipulate electromagnetic waves in various desired manners [6,[10][11][12][13][14][15][16][17][18].However, optical transformation dramatically increases the complexity of MTMs, because the constitutive parameters are often spatially inhomogeneous [6,11,12,14,[16][17][18] and anisotropic [17,18].In addition, the overall size of the MTMs can be much larger [6,14].The large number of particle elements required in a MTM design can represent a substantial computational burden resulting in long design cycles.
To design a MTM, analytical method can be derived by setting up equivalent circuit models [4,[19][20][21] for the MTM particles.The problem is that such analytical method is only efficient for limited classes of particle structures and is often unable to accurately predict the macroscopic behavior of MTMs.In practice, MTMs are more often designed by full-wave electromagnetic simulations such as finite element method (FEM) and finite integration technique (FIT).Commercial software like FIT-based CST Microwave Studio and FEM solver of ANSOFT HFSS has been widely used.The common approach to design a MTM requires repeating simulations to optimize the constitutive parameters for any single MTM particle.Optimization techniques, such as density method [22], genetic algorithm [23,24], and topology optimization [25][26][27], have been taken into account in improving the efficiency in designing MTM particles.We argue that, even with such optimization techniques, it would still be too time-consuming for designing MTMs which contain massive different particles, because the fullwave simulations consume a considerably large amount of time and the optimization requires running a large number of simulations.

International Journal of Antennas and Propagation
To address the aforementioned time-consuming design step, Liu et al. developed an automated and rapid method that is applied once the spatial distribution of the constitutive parameters has been determined [6].Differently with previous methods, the approach in [6] requires only a comparatively small number of full-wave simulations, relying on a simple statistical regression modeling scheme to mathematically generate the functional dependence of the constitutive parameters on the geometric dimensions of the MTM particles.
Following the idea of employing statistical techniques to reduce the times of full-wave simulations, we introduce an advanced statistical modeling scheme, namely, Bayesian nonparametrics, into the context of rapid design of MTMs.Compared with [6], which did not disclose any details either for the regression model or for its associated inference algorithm, this work reveals all the necessary details for the models and the corresponding inference algorithm.A more significant difference lies in the statistical model.In [6], a simple polynomial-like model is used to fit a series of responsive data points which can be connected by a smooth curve.In contrast, our model can not only fit smooth data, but also allow nonstationarities and/or discontinuities embedded in the data to be fit, attributed to the idea of combining mixture models with Bayesian nonparametrics.In another word, we provide a more flexible model that has predictive ability even for much more sophisticated data which is likely to appear in the practice of MTM design.With this model at hand, it is easy for us to predict the constitutive parameter values for any specific design for a large, diverse range of MTM particles.By doing so, lots of full-wave simulations can be avoided, even for very complicated MTM design tasks.
The remainder of this paper is organized as follows.Section 2 presents the proposed GP mixture modeling approach as well as the Gibbs sampling based inference mechanism.Section 3 presents the experimental results of a MTM prototype designed with our method.Finally, we conclude the paper in Section 4.

Methodology
In this section, we first provide a brief review on the basic idea of GP regression.Then, we introduce the mixture of GP regression model as well as the proposed inference algorithm.The mixture of GP regression model is used as a surrogate model for the full-wave simulations required in MTM design.The efficiency of our approach is demonstrated by real MTM experiments as presented in Section 3.
2.1.Overview on GP Regression.GP regression has been proven to be powerful tool to deal with nonlinear regression problems [28].The problem of (possibly nonlinear) regression can be stated as follows: assume that we are given some noisy data  = {(  ,   )  =1 },   ∈  = R  ,   ∈  = R,  ∈ {1, . . ., }, where  is the number of data points and  is the dimensionality of input vectors.Let  be drawn independently and be identically distributed from a probability density (, ) = ( | )().The purpose is to find a regression function  ∈ F,  :   →  such that the risk   [((), )] is minimized, where  : ×  → R + specifies the pointwise regression loss; in our case the quadratic loss ( 1 ,  2 ) = ( 1 −  2 ) 2 is used, and F denotes the function set.
GP regression deviates subtly from the standard formulation above because it is really a transductive method that does not provide a single regression function  but a posterior density over target values for the test or working set.
A GP is completely specified by its mean function and covariance function.The mean function () and the covariance function (,   ) of a real process () are defined as below: The covariance function specifies the covariance between pairs of random variables as follows [29]: where   is called characteristic length-scale, which informally can be thought of as roughly the distance you have to move in input space before the function value can change significantly.
In the above we take the mean function to be zero, which is usually adopted for notational simplicity.
Prediction of function values  * corresponding to new test inputs  * can be sampled from the joint posterior distribution by evaluating the mean and covariance matrix from the equation as below: where (, ) denotes the matrix of the covariances evaluated at all pairs of points in  and points in  [8], and N(, ) denotes the Gaussian density function with mean  and covariance .
Observe that the length-scale   can be varied.In general, such free parameter is called hyperparameter.The hyperparameter is set by optimizing the log marginal likelihood [29]: where  is the covariance matrix for .As the observations, namely, the electromagnetic response, are assumed to be noise-free, the covariance matrix for  is identical with the covariance matrix for the noise-free latent . is computationally impractical for large datasets [28].Second, it is difficult to specify priorly and perform learning in GP models if we require nonstationary covariance functions, multimodal output, or discontinuities [28][29][30][31][32].

Mixtures of GP Regression
We develop a GP mixture model to conquer the above limitations that may appear in the context of MTM design.The idea of combining GP with mixture modeling framework has appeared in the literature, see, for example, in [30][31][32], while the model we develop here is custom-tailored to the problem of MTM design and the inference mechanism is different.We summarize the operations of the proposed method in a flowchart as shown in Figure 1.We introduce each part of this algorithm as follows.

Input Training Data.
The training data represents a set of {  ,   }  =1 which is obtained from full-wave simulations, where  denotes the data size,  denotes the geometric parameter of the microstructures, and  denotes the parameter of the numerical model that is used to approximate the relevant electromagnetic response data.

Initialization of GP Mixture Model.
The GP mixture modeling and inference mechanism is initialized by a random partition operation.Denote c = ( 1 , . . .,   ) the configuration vector, where   is the discrete indicator variable assigning data points to clusters.Denote  the number of clusters.Initialize the value of   ∈ {1, . . ., } using roulette wheel selection method [33,34] to partition the data set  = {(  ,   )}  =1 into  clusters randomly and uniformly.

Parameter Optimization for GP Mixture Model.
For each cluster, a GP model is prescribed using the data points belonging to this cluster as the training set.The hyperparameter of the GP is set by optimizing the marginal likelihood by particle swarm optimization (PSO) [35].Such PSO algorithm produces a globally optimal estimate of the hyperparameter.

Dirichlet Process-Based Data
Clustering.The operation of data clustering is conducted based on Dirichlet Process (DP), and the purpose is to find an optimal allocation of the data points into different clusters.DP can be defined as the limit of a Dirichlet distribution when the number of clusters tends to infinity [36][37][38][39][40]. Let us begin with a symmetric Dirichlet distribution on proportions: where  is the (positive) concentration parameter.The conditional probability of a single indicator is derived to be for components, where  −, > 0, and for all other components combined, where c − denotes all indicators except number ,  −, = ∑   ̸ =  (   , ) denotes the occupation number of cluster  excluding observation , and the delta function selects data points assigned to class .It is shown that the probabilities are proportional to the occupation numbers.To bring in input dependence, we estimate this occupation number as follows [30]: where   is the kernel function parameterized by .We use a Gaussian kernel function:

Stopping Rule.
We record the clustering results of five last iterations.If they are identical, we determine that the Markov chain has adequately sampled the posterior and then stop the iterative process; otherwise, continue the iterations.

Experimental Verification
In this section, we present experimental results of a MTM prototype designed with the proposed approach.To begin with, we introduce the related background information on our experiment.

Background.
Our MTM sample prototype is fabricated on copper-clad printed circuit board with FR4 substrate (the substrate thickness is 0.2 mm, with a dielectric constant of 3.85 + 0.02).This prototype is shown to be a cube in physical appearance.The spatial distribution of the desired refractive index is predetermined as shown in Figure 2. The refractive index is designed to change along the  coordinate and keep unchanged along the  coordinate of the sample.The rate of change in the refractive index along the  coordinate is constant.The purpose of using such a spatial distribution of the refractive index is to achieve the effect of steering one polarized EM wave beam to a desired direction, at an ad hoc frequency point, 13 GHz in our experiment.The deflection angle of the incident wave in an inhomogeneous material mainly depends on the following two factors: (1) the rate of change in the refractive index along the  coordinate of the MTM and (2) the number of material layers along the  coordinate [3].Denoting  as the deflection angle, it satisfies where  is the number of material layers along the  coordinate and Δ is the difference in the refractive index for any two neighboring unit cells distributed along the  coordinate.We select the maximum and minimum values of the refractive index to be  max = 2.2,  min = 1.3, respectively, and then distribute 200 and 80 microstructures along the  and  coordinates, respectively.In another word, the  (11).See Figure 3 for the simulation result of the macroscopic property of this design.
As the spatial distribution of the refractive index has been determined, the remaining task is to find appropriate microstructures to fill the spatial area of the MTM sample.The microstructures are repeated periodically and fit in the planar slab's thickness.The periodicity along the -, -axes is  = 2.5 mm and  = 2.5 mm, respectively.
We use a microstructure which is similar as used in [6].The microstructure's geometric topology is characterized by parameter "".The overall topology and a cross section of the microstructure are shown in Figures 4 and 5, respectively.As is shown, the geometric parameter "" just controls the physical appearance of the microstructure.The value space of "" is specified to be [0.2 mm, 4.4 mm].In cases with  ≤ 2.3 mm, the cross section of the microstructure is shown to be a rectangle with height "" and fixed width 0.2 mm; otherwise if 2.3 mm <  ≤ 4.4 mm, it becomes an "I" shape.Although simple, this type of microstructure has been successfully used to design complicated MTMs, for example, in [6]; we adopt this topology here due to its simplicity in parameter representation and capability for designing complicated MTMs.
The relationship between the microstructure (as shown in Figures 4-5) and the macrostructure of the MTM prototype (corresponding to Figure 2) is conceptually explained by Figure 6.The finally fabricated MTM prototype consists  3 mm, the shape of a cross section of the microstructure is a rectangle with height "" and fixed width 0.2 mm; the right one shows that, in case of 2.3 mm <  ⩽ 4.4 mm, the cross section of the microstructure has an "I" shape.
of multiple pieces of MTM as conceptually illustrated in Figure 7.

The Modeling Process.
Recall that to build up a statistical model we first need to collect some data  as presented in Section 2. We call such data training data, which consists of a set of predetermined "" values and the corresponding EM response data.Given a specific value of "", the responsive data is obtained over a predetermined range of frequencies via full-wave simulations.To use a lower dimensional  in the process of building the regression function  :   → , we intend to find a closed form expression for the responsive data.
The unit cell size in our design is about one-tenth of the wavelength, and so the procedure of homogenization enables effective constitutive parameters-the electric permittivity and the magnetic permeability-to be defined and used to characterize the composite [41].However, the retrieved parameters always display anomalous and often nonintuitive behavior due to the spatial dispersion effect [41,42].The unfamiliar response leads to a significant discrepancy between the theoretical prediction, which is typically in the form of a Drude or Drude-Lorentz model [43][44][45], and the practical retrieval result.We get out of this obstacle using the spatial dispersion model developed in [42], which establishes the relationship between a theoretical local field average response and the retrieved macroscopic equivalent electromagnetic parameters.Then, a Drude/Lorentz-type model developed in [46] is utilized to fit the local field average effective parameters at the tested frequency points.This model assumes that the EM response parameter, permittivity, or permeability, over a predetermined frequency range, is expressed in the form as follows: where  denotes the frequency,  0 denotes the resonance frequency, { 0 ,  1 ,  2 ,   } constitute the parameter set of the model, and "data" denotes the local field average parameter values on the tested frequency points, which are obtained by modifying the full-wave simulation data via the spatial dispersion model [42].For details about the derivation of model (12), readers are referred to [34][35][36].There are also several papers in the literature, for example, Chapter 3 of [3] and [42,[47][48][49], which discussed the Drude/Lorentz models in detail.We selected model (12), other than the other similar Drude/Lorentz models in the literature, because model (12) gives a good numerical stability indicated by our experimental tests.An example fitting result with this model for a set of local field average parameters, namely, the electric permittivity , is depicted in Figure 8.As is shown, this model provides a very accurate fit to the resonant data.
We also give a very brief review on the spatial dispersion effect that causes unusual form of the constitutive parameters International Journal of Antennas and Propagation  obtained from the retrieval methods [42] and the closed form expressions we used for the constitutive parameters.

The Measurement Result.
We obtained the training data by first drawing a set of values of geometric parameter "" uniformly from its value space and then getting the corresponding EM responsive data.Given the training data, we constructed the GP mixture model and made predictions of the EM response parameters for massive different microstructures.Then we got a library containing the massive geometric designs as well as their corresponding EM characteristics.In Figure 9, we just plot the achievable scope of the refractive index for this type of microstructure with 0.2 mm ≤  ≤ 4.4 mm.Finally, from the library, we obtained the desired geometric parameter values of all the MTM particles which constitute the MTM sample.
We monitored the running speed of the proposed modeling approach in predicting EM responses for arbitrary microstructure designs and compared it with that of a standard CST simulation running on the same computer.In average, the time spent on one microstructure is of the order    of tens of milliseconds for the proposed approach, while it is of the order of minutes for CST.In summary we could predict at least one thousand times of speedup using our approach compared with standard simulation methods.The real test result for our MTM sample, including the mapping of the field, the magnitude, and the phase, is presented in Figure 10, which indicates an angle of deflection 19 ∘ .It is shown that the real result is very close to the desired value 21.1 ∘ .

Concluding Discussions
Motivated by vast needs to implement rapid design of complicate MTMs, which usually consist of very large number of inhomogeneous microstructures, we introduce an advanced Bayesian modeling scheme, namely, GP mixture model, into the context of MTM design.This model is used to generate the mapping function from the structure's geometric dimension International Journal of Antennas and Propagation to its corresponding EM response.Thanks to the predictive property of this model, lots of time-consuming full-wave simulations are avoided in the phase of determining appropriate geometric parameter values for the microstructures.
The experimental result of a MTM prototype demonstrates that the proposed method can facilitate the rapid design of MTMs to a large extent with high accuracy.
In this paper we just select a simple microstructure topology as an example to present our approach, while we argue that the proposed modeling scheme is adaptable to other topologies.To prove this claim conceptually in a clear way, we could separate our modeling scheme into two parts.The first part is represented by (12) which parameterizes the EM response data; the second part is the mixture of GP regression model, which connects the geometric dimension "" to parameters of (12).The first part, that is, (12), is only applicable to EM response data with no or one resonance pattern.If there are two or more resonance patterns in the EM response data, (12) will not work; then we need to find a new model (probably with more parameters) to parameterize the EM response.The second part of our approach, namely, the mixture of GP regression model, is itself a generic datadriven modeling approach, which means the structure of the model is adaptively determined by the data to be modeled.Such kind of data-driven modeling technique has been a hot research topic recently in communities of machine learning and Bayesian statistics.We introduce it to the new application area: MTM design, which also requires advanced statistical stuff as shown here.

.Figure 1 :
Figure 1: Flowchart of the mixture of GP regression algorithm.

Figure 2 :
Figure 2: Schematic diagram to show the spatial distribution of the refractive index and the corresponding desired effect on modulating EM waves.

Figure 3 :
Figure 3: Comsol simulation result corresponding to the spatial distribution of the refractive index shown in Figure 2.

Figure 4 :
Figure 4: Two examples showing the microstructures' topology: the left panel corresponds to a case with  = 1 mm, and the right panel with  = 4.4 mm.See explanation of parameter "" in Figure 5 and the text.

Figure 5 :
Figure 5: Parametric representation of the topology of the microstructures under use: the left hand figure shows that, if  ⩽ 2.3 mm, the shape of a cross section of the microstructure is a rectangle with height "" and fixed width 0.2 mm; the right one shows that, in case of 2.3 mm <  ⩽ 4.4 mm, the cross section of the microstructure has an "I" shape.

Figure 6 :
Figure 6: A conceptual show for the relationship between the microstructure and the macrostructure of the MTM prototype.

Figure 7 :
Figure 7: A schematic diagram of the MTM prototype.

Figure 8 :
Figure 8: An example showing the discrete real and imaginary local field average  values and the corresponding fitted curves yielded by the numerical model.

Figure 9 :
Figure 9: The range of the real part of the refractive index over frequency band 5∼15 GHz.

Figure 10 :
Figure 10: The upper plot shows the measured 2D field mapping (E-field); the middle and the downmost figures correspond to the magnitude and the phase, respectively, at the frequency point 13 GHz.The arrow shows the incident direction of the EM wave.