Robust Optimal Design of Quantum Electronic Devices

We consider the optimal design of a sequence of quantum barriers in order to manufacture an electronic device at the nanoscale such that the dependence of its transmission coefficient on the bias voltage is linear. The technique presented here is easily adaptable to other response characteristics. The transmission coefficient is computed using the Wentzel-Kramers-Brillouin (WKB) method, so we can explicitly compute the gradient of the objective function. In contrast with earlier treatments, manufacturing uncertainties are incorporated in the model through random variables and the optimal design problem is formulated in a probabilistic setting. As a measure of robustness, a weighted sum of the expectation and the variance of a least-squares performance metric is considered. Several simulations illustrate the proposed approach.


Introduction
Nanoelectronic devices operate with extremely low intensity currents. Under these circumstances, it is desirable to have at our disposal mechanisms to produce and control electronic currents with a high precision. Electronic beams are relatively easy to produce, but their filtering to obtain nanocurrents with specified properties is much more difficult. A widely used approach consists in directing the beam on a sequence of quantum barriers with an externally adjustable bias voltage applied throughout the device. One expects to be able to control the response of the device in the form of a current whose intensity depends, say, linearly on the applied bias. This setting naturally leads to an optimal design problem: What must be the width and height of the layers composing the barriers, supposed fixed in number, in order to achieve this linear response? (Of course, the problem is quite general, admitting a more complex relation between the external voltage and the current, but here we deal with the linear case just for simplicity). There should be no need to stress the importance of the solution to this problem from a practical point of view, but it must be noticed right from the start that a closed-form, analytic solution is impossible to obtain in most cases. The use of numerical computations at some stage is unavoidable, and this leads to the question of which method to use in order to obtain a good approximation to the solution. In [9] the non-constant potential energy profile is approximated by piecewise constant potentials. Then, the propagation matrix method [5,8] is applied to compute the transmission coefficient and, finally, the gradient of a least-squares-type objective function (which is required by the numerical solution method) is computed using the adjoint method. It is important to point out that different discretization processes, which are used to approximate objective functions and/or its gradients, may lead to very different results. Moreover, it has been observed in some optimization problems [7] that first approximating a cost functional, and then computing the gradient of the approximated one, in general differs from approximating the gradient of the exact cost functional. That is, the schemes 'first discretize, then optimize' and 'first optimize, then discretize', do not commute in general. Also, as it will be showed later on in this paper, optimizing for the same cost functional via its exact gradient gives different solutions than using an approximate one.
Another issue, which cannot be obviated in a realistic mathematical model, is the presence of uncertainties. There are several sources of uncertainty in the problem under consideration, one of the most important regarding the influence on the computed optimal design being the manufacturing uncertainties. Due to the smallness of the currents involved, and the narrow width of the quantum barriers needed, methods such as MBE (Molecular Beam Epitaxy) or CVD (Chemical Vapor Deposition) are used to growth thin layers (in many cases, monolayers) of some material to build the barriers, two of the preferred ones being M oS 2 and GaAs (see [2,6], for example). These methods allow the growth even of monolayers, but the difficulties inherent to the manufacturing process at a semi-commercial scale lead almost inevitably to inaccuracies that ultimately lead to a potential configuration that may be different from the numerically computed, optimal one [11]. For these reasons, the problem of computing an optimal quantum profile which, in addition, is robust against those uncertainties is an important one. If there is some statistical information about the uncertainties, then the machinery of probability theory gives a framework in which to we can accomodate uncertainties (by using random variables and/or random fields), and model objective functions (by means of expectation and variance operators, among others choices). In [14], this approach has been used for the case in which the cost functional only includes the averaging of a least-squares performance metric, and by using the standard Monte-Carlo method for its numerical resolution.
The present work addresses the problem of the optimal design of a quantum potential profile (modeling a nanoelectronic device) in order to obtain a transmission coefficient linearly depending on an externally applied bias voltage, in the presence of manufacturing uncertainties. The transmission coefficient is explicitly computed by using the WKB method. As a consequence, an explicit formula for the gradient of the cost functional is obtained. A weighted sum of expectation and variance of a random least-squares performance metric is considered as a measure of robustness. The inclusion of the second order statistical moment in the cost functional amounts to a reduction the dispersion of the random transmission coefficient and hence an increase in the robustness of the optimal design. Since the resulting integrand in the cost functional is smooth with respect to a random parameter, a sparse grid stochastic collocation method is used for the numerical approximation of the involved integrals in the random domain. This method preserves the parallelizable character of Monte-Carlo sampling. However, in contrast to Monte-Carlo (which is computationally very expensive, of order O M −1/2 , with M the number of random sampling points), the stochastic collocation method shows an exponential convergence with respect to the number of sampling points. Several simulations illustrate the proposed approach, which shows itself to be an improvement in accuracy over previous ones of about a 69.4%.

Setting of the Optimal Design Problems
Considered is a nanoscale semiconductor electronic device composed of N layers occupying positions x 0 = 0 < x 1 < · · · < x N = L. The local potential energy at the ith layer is denoted by U i , i = 1, 2, · · · , N . For x < x 0 , the potential energy is denoted by U 0 and for x > x N it is U N +1 . It is assumed that a single electron propagating from −∞ is incident at x 0 and that a voltage bias V bias is applied across the device. A linear approximation of the underlying Poisson's equation [9,13] leads to the following expression for the resulting potential energy profile where U = (U 1 , · · · , U N ) is the vector of local layer potentials in the device, and The transmission coefficient of the device T = T (V bias , U ) is defined as the ratio of current density transmitted from the device at x = x N and the incident one at x = x 0 . As explained in detail in [9], T may be expressed as where κ 0 = 2me (E − U 0 )/ and κ N +1 = 2me (E − U N +1 + V bias )/ , for values of the energy The cases E ≤ U 0 and E ≤ U N +1 − V bias may be treated in a similar way. Here m is the effective mass of the electron, e denotes the electron charge, is Planck's constant, E is the electron energy, and finally ψ (x) solves the following boundary-value problem for the Schrödinger equation Here i denotes the unit imaginary number and A 0 the amplitude of the transmitted wave at x 0 .

Deterministic Optimal Design
The (deterministic) optimal design problem considered in this paper is formulated as the following nonlinear data-fitting problem: Given a desired transmission coefficient T 0 (V bias ), which is defined for V min ≤ V bias ≤ V max , and lower, U L , and upper, U R , bounds for the local layer potentials, with 0 ≤ U L < U R < ∞, where T (V i , U ) is given by (2) with V bias = V i and V min ≤ V i ≤ V max , i = 1, · · · , M .

Optimal Design Under Manufacturing Uncertainties
As indicated in the introduction, it is very convenient to analyse the robustness of optimal designs with respect to manufacturing uncertainties. These may be modelled by adding a vector of random variables to the vector U of local layer potentials. Here ω represents a random event and thus X j (ω) is a small unknown error in manufacturing the local potential U j . Hence, the cost functional J considered in problem (4) becomes a random variable given by In order to obtain a design of the potential energy profile U less sensitive with respect to fabrication unknown fluctuations, the new cost functional is considered: with α ≥ 0 a weighting parameter. Here E and Var denote the expectation and variance operators, respectively. Then, the robust optimization problem is formulated as where J α (U ) is given by (7).

Solving the Optimal Design problems
The numerical resolution of the optimal design problems stated in the preceding section requires the computation of the transmission coefficient (2) and, therefore, the resolution of the boundaryvalue problem (3). This problem may be numerically approximated by standard numerical methods such as finite differences or finite elements. Another approach is proposed in [9] where, after approximating the potential V (x), as given by (1), by piecewise constant potentials, problem (3) is transformed into a two-dimensional linear non-autonomous difference equation. Here we propose a different approach based on the so-called WKB method [10]. From the point of view of optimization, WKB method is very appealing since, within its range of validity, it provides an explicit form for the solution to (3). From this, explicit expressions for the gradients of the cost functionals considered in problems (4) and (8) are derived. In addition, having an explicit expression for J (U, ω) allows us to prove its smoothness with respect to U and ω. From this, both existence of solutions to (4) and (8), as well as designing a computationally very efficient numerical resolution method, will be derived in this section. We begin by explicitly computing the transmission coefficient (2) and then describe the numerical resolution methods for problems (4) and (8).

Case of a single potential barrier
For the sake of clarity, consider first the case of a single potential barrier as illustrated in Figure  1.
Posistion, x An electron of mass m, charge e and energy E, incident from left, has wave vector k j in region j.

The WKB method proposes a solution of the Schrödinger equation in the form
with In the context of quantum electronic devices, solutions to (3) admit a smooth representative in their L 2 classes (of regularity class C 1 ). Hence, continuity of ψ and its first derivative at the interface point x 1 leads to where By writing each 2 × 2 matrix in (10) as the product of two matrices as follows and solving (12) for A 1 and B 1 , Proceeding in the same way at the point x 2 , one obtains where Substituting (14) into (13), we get where M 11 is the first entry of M . More precisely, the following explicit formula for that transmission coefficient, in the case E > V (x) for all x 1 ≤ x ≤ x 2 , is obtained: where C is given by (11) and The case E < V (x) for all x 1 ≤ x ≤ x 2 is completely analogous. Denoting by the transmission coefficient is given by

Range of validity for WKB method
The condition for the validity of WKB method and therefore for formulas (16) and (19) is that the change in the potential energy over the decay length be smaller than the magnitude of the kinetic energy (see, for instance, [13, p. 483]). For the case E > V (x), this condition can be expressed as and for In particular, (21) and (22) constraint the values of U and V bias for which the WKB methods applies. Assume that U > V bias : Hence, by introducing the function condition (21) is satisfied whenever F E>V (V bias , U ) > 0. Figure 2 displays the functions F E>V (V bias , U ) and its associated transmission coefficient T (V bias , U ), as given by (16) Analogously in the case E < V , by considering the function

Case of multiple potential barriers
Consider now the configuration described at the beginning of Subsection 2.1, which is composed of N potential barriers with energy given by (1). Let us denote by the potential energy of the jth barrier and by The linear relationship between the coefficients A 0 , B 0 of the electron wave function at x 0 and the ones A N , B N at x N is expressed as where the form of the transfer matrices M j depend on the relationship between E and V j . For the cases in which WKB method may be applied, M j has been computed in the preceding section. For the spatial regions for which WKB method does not apply, appropriate connection formulae must be used. For instance, as in [8,14], the linear potential V j (x) is approximated by piece-wise contant potential for which explicit expressions of M j are well-known [5]. Finally, as in the case of a single potential barrier, the transmission coefficient T = T (V bias , U ) is obtained from (27).

Numerical Resolution Method
Before describing the numerical methods proposed in this paper for solving (4) and (8), let us briefly consider the question concerning the existence of solution for such problems.
From the explicit expressions (16) and (19), and taking into the computations in Subsection 3.1.3, it is clear that the functionals J (U ) and J α (U ), which have been considered in problems (4) and (8), respectively, are smooth (of regularity class C ∞ ). In addition, the set of admissible designs As a consequence, the following existence result holds. The nonlinear mathematical programming problem (4) is standard and may be solved by several methods, typically by a gradient-based method. In this paper, a subspace trust region method, which is based on the interior-reflective Newton method, as proposed in [3,4], is used. This algorithm is implemented in the MatLab constrained optimization routine fmincon.
More challenging is the robust optimal design problem (8). The brute-force sampling Monte Carlo is the most commonly method used to solve this kind of problems. However, for smooth (with respect to the random parameter) functions, sparse grid, stochastic collocation methods are able to keep the same accuracy as Monte Carlo and, in addition, are computationally much more efficient [1]. In our case, again from (16) and (19) it is not hard to show that J (U, ω), as defined in (6), is analytic with respect to the random variable X (ω). For this reason, we propose an adaptive, isotropic, sparse grid, stochastic collocation method to approximate both, the cost functional J α (U ) and its gradient. Then, the robust optimal design problem (8) can be solved as in the deterministic case.
In order to explain the method for approximating integrals in the random domain used in this paper, let us first introduce some notation. By (Ω, F, P) we denote a complete probability space. Ω is the set of outcomes, F is the σ-algebra of events and P : F → [0, 1] is a probability measure. Γ j = X j (Ω), 1 ≤ j ≤ N , are the images spaces of the sample space Ω through the real-valued random variables X j considered in (5), and Γ = N j=1 Γ j is the product space. Assuming that the distribution measure of X (Ω) is absolutely continuous with respect to the Lebesgue measure, there exists a joint probability density function ρ : Γ → R + for X = (X 1 , · · · , X N ). Hence, (Ω, F, P) is mapped to (Γ, B, ρ (z) dz), where B is the σ-algebra of Borel sets on Γ, and dz is the Lebesgue measure. Finally, the expectation and variance of J (U, ω) take the form Following [1,12], the isotropic sparse grid of sampling quadrature nodes is defined as follows.
Starting from an integer (called the level), the index set is considered, with N + = {1, 2, 3, · · · }. The level determines the number of collocation points R in in the nth stochastic direction, which for the case of Smolyak rule is given by R in = 1, for i n = 1 2 in−1 + 1, for i n > 1.
Smolyak quadrature rule applied to a generic function F : Γ → R gives where ∆ in = Q in − Q in−1 , with Q 0 = 0, is a quadrature rule in which the coordinates z rn n of the nodes are the same as those for the 1D quadrature formula Q in and its associated weights w rn n are the difference between those for the i n and i n − 1 levels.
It remains to analyze the question on how to properly choose the quadrature level . For this: (1) A positive, large enough, integer , and a tolerance level 0 < ε 1 are fixed.
(2) The first and second order statistical moments of J (U, z), as given by (28) and the first term in the right-hand side of (29), are approximated by using (30), with level . These two approximations, which are denoted by M 1, (J (U )) and M 2, (J (U )), respectively, play the role of enriched or reference values for the exact values of the first two statistical moments of J (U, z), which, obviously, cannot be explicitly computed.
(3) Finally, the level is linearly increased from = 1 to opt < until the stopping criterion is satisfied. Here M 1, (J (U )) and M 2, (J (U )) denote, respectively, approximations, by using (30) with level , of the first two statistical moments of J (U, z). If (31) is not satisfied for the tolerance ε and the initial , then the reference level is increased.
For more details on this adaptive algorithm, including its convergence, we refer the reader to [1] and the references therein.

Numerical Simulations
In this section, numerical results for problems (4) and (8) are presented and discussed. In all experiments, a four layers device is considered of the same thickness (1nm), so that the total length is L = 4nm. The desired linear transmission coefficient is Quadratic and square root transmission coefficients may be treated analogously. The design is based on 10 equally spaced bias voltages. Hence, V i = i 0.25 10 , 1 ≤ i ≤ 10. The electron mass is m = 0.07 × m 0 , with m 0 = 9.10939 × 10 −31 Kg (which is appropriate for an electron in the conduction band of Al ξ Ga 1−ξ As), its energy is E = 0.026 eV, and its charge e = 1.602 × 10 −19 C. Planck's constant is = 1.05457 × 10 −34 J · s. The lower and upper bounds for the design variable are taken as U L = 0.7 eV and U H = 1.7 eV, respectively.
The goal of this section is twofold: On the one hand, it is aimed at analyzing the differences that may occur when using exact gradients or numerical gradients in the optimization algorithm. On the other hand, we want to analyze the influence of manufacturing uncertainties on the computed designs. We deal with these issues in the following subsections.

Exact versus numerical gradient
In this experiment, the deterministic problem (4) is solved, by using the MatLab routine fmincon, in the cases where: (a) The exact gradient is provided, as computed from the explicit expression for the cost functional J (U ) in Section 3, and (b) the gradient is numerically computed by using finite differences. In both cases, the algorithm is initiated with U 0 j = 0.7 eV, 1 ≤ j ≤ 4, and the stopping criterion, provided by the MatLab routine fmincon, is fixed to 10 −15 .
First column in Table 1 displays results for the values of the cost functional after convergence of the algorithm. The optimal energy potentials are showed in the remaining columns. Inspection of Table 1 reveals (as expected) that performance increases by using the exact gradient. Precisely, the value of the objective function obtained by using the optimal design as computed with the exact gradient improves in about 69.4% the corresponding one obtained via the numerical gradient.

Deterministic design versus design under uncertainty
Manufacturing uncertainties are modelled by random variables uniformly distributed in [−a, a], i.e., X j = U (−a, a) for all j = 1, 2, 3, 4. As an illustration, the cases a = 0.05 and a = 0.2 are considered. As in the preceding example, the algorithm is initiated with U 0 j = 0.7 eV, 1 ≤ j ≤ 4, and the stopping criterion is fixed to 10 −15 . Case 1: a = 0.05, which represents 5% of error in manufacturing each one of the potentials U j . The algorithm described in Subsection 3.2 has been implemented by using the Sparse Grids Matlab kit 15.8 (see http://csqi.epfl.ch and [1]). The stopping criterion (31), with ε = 10 −7 , is satisfied for = 20 and opt = 15, which corresponds to 895 collocation nodes in the random domain Γ = [−0.05, 0.05] 4 . Figure 4 displays the rates of convergence of the isotropic sparse grid algorithm. Table 2 shows the results, after convergence of the algorithm, for the expectation and the standard deviation of the cost functional J (U, ω) given by (6). As expected, the optimal design in mean (α = 0) provides a solution which reduces the impact of manufacturing errors in comparison with the deterministic approach (see first and second rows in the first column of Table 2). It is also observed that increasing the value of the weighing parameter α reduces the dispersion of the computed designs. These results are more significant when the level of uncertainty increases, as pointed out in Table 3. Case 2: a = 0.2, which corresponds to 20% of noise. The same procedure as in the preceding case has been applied. in this case, (31) is satisfied, with ε = 10 −2 , for = 20 and = 16, which corresponds to 1212 collocation nodes.
The same qualitative results as in the preceding case are observed in Table 3. However, since in this case the level of manufacturing uncertainties is higher, the differences of corresponding solutions are much more significant than in the preceding case. Indeed, for 20% of noise, the reduction in the mean value of J (U, ·) for the mean optimal value (α = 0), in comparison with the deterministic optimal design, is of the order of 39%. For the case of 5% of noise, this reduction   (6) for the optimal deterministic design (first row), the optimal design in mean (second row), and the optimal design for α = 10 12 (third row).  Table 3: Case a = 0.2. Mean (first column) and standard deviation (second column) of the cost functional (6) for the optimal deterministic design (first row), the optimal design in mean (second row), and the optimal design for α = 10 10 (third row). Columns 3 to 6 show the corresponding optimal potentials. is of the order of 4.2%. The decreasing if the standard deviation of the computed designs is also much more significant in the current case than in the case of 5% of manufacturing noise.

Conclusions
We have addressed the problem of determining optimal designs of nanoelectronic devices whose physical behavior is governed by the time-independent Schrödinger equation. Two situations are considered: A deterministic version of the problem, and the more realistic case where manufacturing uncertainties are accounted for. In both cases, the corresponding optimal design problems are formulated as the minimization of a least-squares performance metric. In the stochastic case, the variance of the random metric is incorporated in the cost functional as a measure of robustness. An explicit expression for that metric is computed using a semi-classical approximation. Having at our disposal an analytic expression for the cost functional has the following advantages: (a) At the theoretical level, it is easily deduced that the cost functional is smooth (of regularity class C ∞ ) with respect to the design variables. As a consequence, the existence of solutions for both optimization problems is proved.
(b) In the deterministic case, an explicit expression for the gradient of the cost functional is obtained. Numerical simulations in Subsection 4.1 show the relevance of this issue, at least in the specific problem considered in this work, when gradient-based minimization algorithms are used as the numerical resolution method.
(c) In the stochastic optimization problem, it is also deduced that integrands, which appear in the considered cost functional, are analytic with respect to the random vector parameter. Thus, stochastic collocation methods (which possess an exponential rate convergence with respect to the number of sampling points and are, computationally, much more efficient than the classical brute-force Monte-Carlo method) are preferred.
The results obtained show an improvement of the accuracy in the linear response characteristic of about 69.4% over previous, brute-force, approaches. Moreover, the robustness of the design is manifest even under weight values of α = 10 12 in the variance (physically, as seen in in Table 2, this would amount to pass from a design using Germanium, with a band gap of 0.7eV, to one using Silicon, whose band gap is 1.1eV, so that extreme case is still physically feasible).
As noticed along the paper, the semi-classical approach used in this work has the drawback that constraints the values of the applied bias voltage and those of the local energy potentials. Accordingly, the whole range of possible energies and potential profiles may be covered by using the approach proposed in this work in combination with appropriate connection formulas, for instance, the ones presented in [9].