The Impact of Statistical Leakage Models on Design Yield Estimation

Device mismatch and process variation models play a key role in determining the functionality and yield of sub-100 nm designs. Average characteristics are often of interest, such as the average leakage current or the average read delay. However, detecting rare functional fails is critical for memory design, and designers often seek techniques that enable accurate modeling of such events. Extremely leaky devices can inflict functionality fails, and the multiplicity of leaky devices on a bitline increases the dimensionality of the yield estimation problem. Simplified models are possible by adopting approximations to the underlying sum of lognormals. The implications of such approximations on tail probabilities may in turn bias the yield estimate. We review different closed-form approximations and compare them against the CDF matching method, which is shown to be the most effective method for accurate statistical leakage modeling.


Introduction
With technology scaling, memory designs are the first to suffer from process variation. Density requirements in memory cells, latches, and register files aggravate variability effects, and these designs often undergo performance and yield degradation. More recently, the trend is starting to become visible in peripheral logic as well [1], where random variations can lead to reduced noise margins. From a leakage perspective, general interest has been in modeling the statistical leakage distribution of the full chip [2]. However, under extreme device conditions, leakages may inflict delays or faulty behavior, for example false reads. In memory designs, bitline leakages can further aggravate the variability effects and impact the readability of the design. Often many devices are stacked on the bitlines, and it has become necessary to statistically account for and model those many independent and, in most cases, identical leaky sources. A typical array column can have around 16 to 64 cells per bitline. With the aim of capturing rare fail events accurately and modeling the variability space [3], it becomes important to emulate those independent variability sources by a single statistically equivalent device, the main goal being to reduce the dimensionality of the space. This is especially important for response-surface-modeling-based integration methods or variance reduction techniques.
The threshold voltage variation due to random dopant fluctuation [4] for the single device, however, is different from that of the equivalent device. In a typical linear model, the equivalent statistical model would be derived from linear relationships with respect to the individual devices and from what designers often refer to as the square-root law: as the device area A increases, the threshold voltage variation scales as ∼ 1/√A. This would let us model n small devices with one large device whose variation is ∼ 1/√n times the original variation. However, as was indicated in [5, 6], this model lacks accuracy when modeling the highly nonlinear leakage currents, where the dependency on the threshold voltage (the source of variation) is exponential. In fact, as we will explain in the following sections, the leakage current is distributed lognormally, and the problem of the equivalent device is known mathematically as the sum of lognormals. An exact closed form for the distribution of this sum does not exist, and historically there have been many approximations. There is a large literature on the sum of lognormals, and applications span a wide range of domains including economics, finance, actuarial sciences, and others, with engineering being one of the oldest. The most famous approximations to the sum of lognormals are the Fenton-Wilkinson (FW) method [7] and the Schwartz-Yeh (SY) method [8]. For circuit design, these models have been studied in [5, 6] to enable modeling the threshold voltage distribution of the equivalent device. The authors in [5] relied on the FW method to maintain a reasonable 3-sigma estimate (99th percentile). It is often indicated that when σ_dB > 4, the FW method may not result in a proper approximation of the lognormal sum.

Figure 1: A schematic of an 8T cell/register file. During Read, the read wordline "RWL" turns ON. If the cell storage node is held high, "1", the read bitline (RBL) is discharged.
On the other hand, the authors in [6] proposed to use the Schwartz-Yeh (SY) methodology to model the equivalent threshold voltage distribution of a 3-fin device. Again, the approximation is qualified, in this case for a small number of summands, based on a thousand samples, thereby ignoring the accuracy of the model in the tail regions, which is critical to the estimation of low fail probabilities. In the following sections, we review the modeling trends in the critical regions, focusing on the cumulative distribution function (CDF) matching method [9] as a viable solution for reliable statistical modeling when tail-region modeling is key to the accuracy of the yield estimate. The paper is organized as follows. Section 2 introduces a basic circuit example to describe the need for single equivalent device modeling. Section 3 reviews the mathematical background and assumptions of the most common sum-of-lognormals approximation methods; it also provides a summary chart of the different approaches under study. Section 4 evaluates the impact on fail probability estimations. Finally, conclusions are presented in Section 5. Figure 1 illustrates an 8T cell/register file. The circuit operates like a 6T SRAM cell during cell Write. During Read, the read wordline turns ON. If the cell is storing a "1" on the right-side node, in this example, the Read stack will turn ON, thereby discharging the read bitline (RBL).

Equivalent Device Modeling
Driven by density requirements, many cells share the same bitline and sense amplifier (or read logic). Only one cell per column is accessed at a given time; the other cells remain unaccessed. For the example of Figure 1, the RWL of the unaccessed cells is turned OFF. However, depending on the stored data, it is possible that the data stack device turns on, thereby rendering the stack leaky. Figure 2 illustrates a scenario where the accessed cell is OFF. RBL is expected to remain high. However, leakages through the accessed stack and the unaccessed stacks may falsely discharge the RBL node. This, together with noise, mismatch in the read logic, and other environmental effects, can degrade the yield of the system and increase the probability of a false read.
Note that this phenomenon also manifests in other memory and domino logic designs. For example, in the case of 6T SRAM, the devices sharing the WBL (see Figure 1) can be storing 1's, thereby leaking onto the WBL and degrading the Read "0" performance of the 6T cell. Hence there is a need to account for the statistical effects of the leaky devices in any design yield optimization and study. This can dramatically increase the dimensionality of the problem, thereby reducing the efficiency of statistical integration, response surface methods, or any methods that try to model the failure region and that suffer from sparsity and the curse of dimensionality. Consider for example a study that involves variability in 64 devices on the bitline along with variability in the sense-amp devices. A model for a statistically equivalent device that emulates the multiple leaky devices can thus significantly reduce the complexity. Figure 3 summarizes the problem.
Given: the threshold voltage variation δV_Ti of each device T_i is normally distributed with zero mean and standard deviation σ_VTi = σ_0; that is, δV_Ti ∼ N(0, σ_0).
Find: the δV_Teqv distribution for the equivalent device T_eqv such that n · I_Teqv is distributed according to Σ_i I_Ti.
Note that we assume here independent and identical devices; the problem can be generalized to nonidentical/correlated devices.
Leakage Current Model. Leakage current as a function of the threshold voltage variation can be modeled according to (1); it portrays an exponential dependency on the random variable δV_T:

I_leak = e^(a·δV_T + b),  (1)

where the coefficient a depends on q, the electron charge, k, the Boltzmann constant, and T, the temperature. For simplicity, T_eqv has the same device width as the other T_i, but it is followed by a current multiplier n to model the equivalent leaky device; we then solve for the new δV_Teqv distribution. The set of equations (2) represents how the problem can be reduced:

I_sum = I_leak,1 + · · · + I_leak,n = n · I_leak,eqv,
Σ_i e^(a·δV_Ti + b) = n · e^(a·δV_Teqv + b),
Y = Σ_i e^(Z_i) = n · e^(Z_eqv),  (2)

where the Z_i = a·δV_Ti + b are normal variables, the e^(Z_i) are lognormal variables (see the next section for definitions), and e^(Z_eqv) (and Y) are distributed according to a sum of lognormals. A sum of lognormals is not distributed lognormally, and hence Z_eqv is not normal. In the following section we review the characteristics of the sum of lognormals and possible methods of approximation that can be used to generate the statistical characteristics of Z_eqv and hence δV_Teqv.
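The reduction in (2) can be made concrete with a short Monte Carlo sketch. The leakage coefficients a, b and the threshold sigma σ_0 below are illustrative assumptions, not values from the paper; the point is only that the equivalent-device threshold shift δV_Teqv, obtained by inverting n·e^(a·δV_Teqv + b) = I_sum, is visibly non-Gaussian even though each δV_Ti is Gaussian.

```python
import numpy as np

# Illustrative constants (assumed, not from the paper):
# 16 cells/bitline, 50 mV threshold sigma, exponential slope a (1/V).
rng = np.random.default_rng(0)
n, sigma0 = 16, 0.05
a, b = -25.0, 0.0

dVt = rng.normal(0.0, sigma0, size=(200_000, n))   # dVt_i ~ N(0, sigma0)
I_sum = np.exp(a * dVt + b).sum(axis=1)            # sum of lognormals
dVt_eqv = (np.log(I_sum / n) - b) / a              # invert n*exp(a*dVt_eqv+b) = I_sum

def skew(x):
    """Sample skewness; zero for Gaussian data up to noise."""
    return float(np.mean(((x - x.mean()) / x.std()) ** 3))

# Z_eqv = ln(Y/n) is not normal, so dVt_eqv is visibly skewed,
# unlike the individual dVt_i.
print(skew(dVt[:, 0]), skew(dVt_eqv))
```

Note also that the standard deviation of dVt_eqv comes out well below σ_0, but not by the simple 1/√n square-root law criticized in the introduction.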

Sum of Lognormals
In this section we focus on methods to approximate the distribution of Y in (2) and compare their accuracy to the CDF matching approach. First we present some background information.
Lognormal Distribution. Let Z = ln(X). If Z is normally distributed, Z ∼ N(μ_z, σ_z), then X is lognormally distributed and its probability density function is [10]

f_X(x) = (1 / (x · σ_z · √(2π))) · e^(−(ln x − μ_z)² / (2σ_z²)),  x > 0.  (3)

Often the Gaussian variable V = 10 log10(X) = (10/ln 10) · Z is used instead, so that spreads can be expressed in decibels; a standard deviation σ_z corresponds to (10/ln 10) · σ_z ≈ 4.34 · σ_z in dB.
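The pdf in (3) and the dB convention can be checked numerically; the sketch below verifies the pdf at a known point and the scale factor 10/ln 10 ≈ 4.34 between σ_z and its dB equivalent.

```python
import math, random

# dB scale factor: V = 10*log10(X) = (10/ln 10) * ln(X), so
# sigma_dB = (10/ln 10) * sigma_z ~= 4.34 * sigma_z.
XI = 10.0 / math.log(10.0)

def lognormal_pdf(x, mu_z, sigma_z):
    """pdf of X = exp(Z) with Z ~ N(mu_z, sigma_z), as in (3)."""
    return math.exp(-(math.log(x) - mu_z) ** 2 / (2 * sigma_z ** 2)) / (
        x * sigma_z * math.sqrt(2 * math.pi))

rng = random.Random(1)
sigma_z = 1.0                                     # ~4.3 dB
v = [XI * rng.gauss(0.0, sigma_z) for _ in range(100_000)]
sigma_dB = (sum(x * x for x in v) / len(v)) ** 0.5  # zero-mean std in dB
print(sigma_dB)                                   # close to 4.34
```

This is why the experiment grid σ_z = 0.5 … 4 later in the paper is quoted as roughly 2 dB-16 dB.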

Sum of Lognormals.
A common way to compute the probability density function (pdf) of a sum of independent random variables is from the product of the characteristic functions (CF) of the summands [11]; the characteristic function of a random variable Z is the expected value E[e^(jwZ)]. However, for the sum of lognormals, Y = Σ_i e^(Z_i), a closed form of the CF does not exist, and approximations are used instead to represent the density function of the sum. Some approximations model the sum of lognormals as an "approximately" lognormal variable (see (4)),

Y ≈ e^(Z_m),  (4)

where Z_m is normally distributed (m refers to the method of approximation). Common techniques are the SY and FW methods mentioned earlier. In the following sections, we revisit those methods and compare them to CDF matching.

Figure 3: The problem of multiple device leakages can be statistically equivalent to that of a single device. The distribution of the threshold voltage of the equivalent device is to be solved for.

Figure 4 summarizes the four techniques that will be visited: (1) the FW approximation, which matches the first two moments of Y; (2) the SY approximation, which recursively solves for the mean and standard deviation of ln(Y); (3) the Log Moment Matching (LMM) approximation, which matches the moments of ln(Y); and (4) CDF matching, which is non-Gaussian and derived from a piecewise-linear distribution.
All these methods except the CDF matching one assume that Y is a lognormal variable; that is, that Z_m (and hence δV_T) is a Gaussian variable whose mean and standard deviation are to be found.
Fenton and Wilkinson. It is a well-known fact [10] that given a Gaussian variable x with mean μ_x and standard deviation σ_x,

E[e^x] = e^(μ_x + σ_x²/2),  (5)
Var[e^x] = (e^(σ_x²) − 1) · e^(2μ_x + σ_x²).  (6)

By relying on (5) and (6) to match the first two moments of Y, we can obtain a closed form for μ_Zfw and σ_Zfw according to (7) for the case of independent variables Z_i ∼ N(μ_i, σ_z):

σ_Zfw² = ln( (e^(σ_z²) − 1) · (Σ_i e^(2μ_i)) / (Σ_i e^(μ_i))² + 1 ),
μ_Zfw = ln( Σ_i e^(μ_i) ) + σ_z²/2 − σ_Zfw²/2.  (7)
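A minimal numeric check of (7) for the identically distributed case (μ_i = μ, so the sums over i collapse to factors of n): by construction the FW lognormal reproduces the first two moments of Y, which a Monte Carlo estimate confirms.

```python
import numpy as np

def fenton_wilkinson(n, mu, sigma_z):
    """FW parameters from (7), specialized to n iid summands Z_i ~ N(mu, sigma_z)."""
    sigma_fw2 = np.log((np.exp(sigma_z**2) - 1.0) / n + 1.0)
    mu_fw = np.log(n) + mu + sigma_z**2 / 2.0 - sigma_fw2 / 2.0
    return mu_fw, np.sqrt(sigma_fw2)

n, mu, sigma_z = 16, 0.0, 0.5            # small sigma (~2 dB), where FW behaves well
mu_fw, sigma_fw = fenton_wilkinson(n, mu, sigma_z)

rng = np.random.default_rng(2)
Y = np.exp(rng.normal(mu, sigma_z, size=(200_000, n))).sum(axis=1)

# Moments of the fitted lognormal, via (5) and (6).
mean_fw = np.exp(mu_fw + sigma_fw**2 / 2.0)
var_fw = (np.exp(sigma_fw**2) - 1.0) * np.exp(2 * mu_fw + sigma_fw**2)
print(Y.mean(), mean_fw, Y.var(), var_fw)
```

Matching the body moments of Y this way says nothing about the tails of ln(Y), which is where Section 4 shows FW breaking down as σ_z grows.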

Schwartz and Yeh.
Again the methodology starts with the assumption that the sum of lognormals is approximately lognormally distributed, where Z_sy is a Gaussian variable ∼ N(μ_Zsy, σ_Zsy). It approximates the mean and standard deviation of Z_sy = ln(Y) with numerical recursion, whereas FW estimates the first two moments of Y. The Schwartz and Yeh method relies on the ability to exactly compute the mean, μ_Zsy, and standard deviation, σ_Zsy, of Z_sy = ln(Y) for the case when the number of lognormal variables in the sum is n = 2. For n > 2, the method then relies on a recursive approach, adding one summand at a time. Hence, the mean and standard deviation of

Z_sy = ln( e^(Z_1) + · · · + e^(Z_n) )  (9)

can be derived from the following set of generalized equations (10); Z_sy(k−1) is assumed to be normally distributed, k = 2, . . . , n, and the Z_i are assumed to be uncorrelated in (10):

Z_sy(k) = ln( e^(Z_sy(k−1)) + e^(Z_k) ),
μ_Zsy(k) = μ_Zsy(k−1) + G_1,  (10)

(with a corresponding update of σ_Zsy(k) in terms of the G_i), where w_k and the G_i (i = 1, 2, 3) are defined in (11):

w_k = Z_k − Z_sy(k−1).  (11)

Finally, the set of equations (12)-(13) illustrates how the G_i's can be computed according to [12], for example

G_1 = A_0 + I_1;  (12)

this slightly modified implementation was intended to circumvent the round-off error of integration of the original Schwartz and Yeh implementation [13]. Thus, at each step of the recursion, we compute the mean and standard deviation of w_k. The integrals I_i are then numerically computed using the functions h defined in (13); their values are used to solve for the G_i's in order to evaluate (10). The final estimate for the Z_sy mean and standard deviation is reached at k = n.
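The analytical G_i integrals of (11)-(13) are too long to reproduce here, but the structure of the recursion can be sketched by substituting a sampling estimate for them: at each step k, the partial sum's log Z_sy(k−1) is replaced by a Gaussian with the moments found so far (exactly the SY structural assumption), one more summand is folded in via Z_sy(k) = ln(e^(Z_sy(k−1)) + e^(Z_k)), and the moments are re-estimated. This is a stand-in for SY, not the original numerical-integration implementation.

```python
import numpy as np

def sy_recursion_sampled(n, mu, sigma_z, n_samp=200_000, seed=3):
    """SY-style recursion with sampled moments in place of the G_i integrals."""
    rng = np.random.default_rng(seed)
    mu_k, sig_k = mu, sigma_z                      # start from Z_sy(1) = Z_1
    for _ in range(2, n + 1):
        z_prev = rng.normal(mu_k, sig_k, n_samp)   # Gaussian re-approximation
        z_new = rng.normal(mu, sigma_z, n_samp)    # next summand Z_k
        z_sy = np.log(np.exp(z_prev) + np.exp(z_new))
        mu_k, sig_k = z_sy.mean(), z_sy.std()      # moment update, as in (10)
    return mu_k, sig_k

mu_sy, sig_sy = sy_recursion_sampled(n=8, mu=0.0, sigma_z=1.0)

# Direct Monte Carlo moments of ln(Y) for comparison.
rng = np.random.default_rng(4)
lnY = np.log(np.exp(rng.normal(0.0, 1.0, (200_000, 8))).sum(axis=1))
print(mu_sy, sig_sy, lnY.mean(), lnY.std())
```

At this moderate σ_z the recursion tracks the direct moments closely; the discrepancies discussed later appear because the Gaussian assumption on Z_sy(k−1) is only approximate beyond the first step.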

Log Moment Matching Approximation.
While the previous two approaches find the matching moments using analytical or recursive analytical solutions, this approach relies on sampling instead to compute the moments [9]. Similar to the SY method, it estimates the first two moments of ln(Y). It does not provide a closed-form solution, but it maintains the lognormal assumption. Thus, the recursive analytical approach in SY and this sampling-based approximation do not always agree, as we will see in Section 3.5, especially since the SY normality assumption holds exactly only at the first step of the recursion.

CDF Matching. This is the closest possible model to the true sum. Unlike the previous approximations, it does not rely on the lognormal assumption. We build a piecewise-linear (pwl) probability density function from the Monte Carlo samples. Our goal is to demonstrate the difference between this approach and the lognormal approximations when it comes to tail-region modeling and the resultant probability-of-fail estimations. Key features of the approach are the following.

(i) The piecewise-linear fit for the density function of ln(Y) is non-Gaussian. The pwl function can be sparse in the center of the distribution and denser in the tails to adequately model the low-fail-probability region. In an extreme fashion, the tail probabilities can be recomputed from tail samples to avoid interpolation errors.

(ii) Generating the Y samples is cheap, and so is the I_sum sample once the function or even tables of (I, δV_T) are available. Interpolation, bootstrapping, and other techniques can reduce the number of real simulations needed and still enable good confidence in the density function. After all, the previous approaches do rely on the availability of a closed-form function for I(δV_T). The number of samples is inversely proportional to the tail probability of interest; for example, if we are looking for accurate probabilities in the range of 1e−4, then we need replications of samples larger than 1e4. Replications add to the confidence in the expected tail probability, and the interest is mainly in the CDF tails.

(iii) The model can accommodate any complex nonlinear function I = f(δV_T), even one different from the exponential approximation above.

(iv) Most importantly, once the distribution of I_sum is available, the V_T distribution is derived by reverse fitting the f^(−1)(I_sum) samples.
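Items (i) and (ii) above can be sketched compactly: build a piecewise-linear CDF of ln(Y) whose knots are quantiles placed on a log-spaced grid in each tail and a coarse linear grid in the body. The grid sizes and the σ_z = 2 (~8 dB) operating point are illustrative choices, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma_z = 16, 2.0                       # ~8 dB, where the lognormal fits break
lnY = np.log(np.exp(rng.normal(0.0, sigma_z, (500_000, n))).sum(axis=1))

# Tail-weighted knot grid: probabilities 1e-5..1e-2 log-spaced in each tail,
# sparse linear spacing in the body (feature (i)).
tail = np.logspace(-5, -2, 25)
q = np.concatenate([tail, np.linspace(0.02, 0.98, 49), 1.0 - tail[::-1]])
knots = np.quantile(lnY, q)                # pwl CDF is the polyline (knots, q)

def cdf(y):
    """Piecewise-linear CDF evaluated by interpolation between knots."""
    return np.interp(y, knots, q)

# Right-tail probability from the pwl model vs. raw sample counts.
y0 = np.quantile(lnY, 1.0 - 1e-3)
print(1.0 - cdf(y0), (lnY > y0).mean())
```

Because the knots are themselves empirical quantiles, the pwl tail probability agrees with the raw counts by construction; the model simply stores the distribution compactly with no Gaussian assumption.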
Note that for purposes of comparison, in this study all methods share the same exponential current model.

Theoretical Experiments.
In this section, we compare the different methods' ability to model Y (Figure 4). We study both the moment-matching abilities and the upper and lower CDF tails. The study is performed over different combinations of n and σ_z (the standard deviation of Z_i):

(σ_z = 0.5, 1, 2, 3, 4) × (n = 2, 8, 16, 64).  (15)

The σ_z range corresponds to the range 2 dB-16 dB. Recall that 4 dB is often considered a critical threshold for how accurately the FW method can model the distribution. Also, as demonstrated in [6], the standard deviation of the threshold voltage of scaled devices can exceed 4-8 dB (this is based on the leakage coefficient in (1)). For each (n, σ_z) combination, multiple replications of 1 million samples are studied; this enables good estimates of the low fail probabilities. Figure 5 plots the mean and standard deviation of ln(Y) obtained from the different approximations. The FW method falls behind for larger σ_z (>1, or 4 dB). Recall that this method was intended to match the moments of Y and not ln(Y). We note that the FW estimates underestimate the mean and overestimate the variance of ln(Y) for larger dB values. The SY and LMM methods match the moments of ln(Y) well, yet they differ from each other slightly for larger sigma values, given that the former is a recursive approximation. Figure 6 illustrates a histogram (pdf plot) of ln(Y); the FW method does not model the body of the Z (or δV_T) distribution well compared to the other methods; this was also indicated in [6] for small n. The trend becomes even more obvious as the number of summands n increases. However, to obtain a complete picture, there is a need to study the moment matching of Y itself and the tail regions, which we cover next. Figure 7 plots the mean and standard deviation of Y obtained from the different approximations. The SY and LMM methods, which match moments of ln(Y), fall behind for larger σ_z (>1, or 4 dB).
This is true for large n (>2), and we note that SY tends to particularly underestimate the variance of the sum of lognormals; this effect is most visible when the variables are identical and uncorrelated, as is the case in this study. To study the tail regions, we rely on a "tail log plot" as illustrated in Figure 8. Without loss of generality we set the x-axis in these plots to be in dB; a small shift can mean a large change in Y values. The plot is derived from the CDF, showing P(Y > y0) and its complement 1 − P. It is such tail probabilities that are linked to the fail probabilities of a design. Note that for the case of leakages, the right-side tail is critical for the fails (larger Y values correlate with larger leakage values in real applications). Figure 9 illustrates the tail probability plot as a function of σ_z for n = 64. We obtain a good match for all the methods at small σ_z. For critical σ_z values (4-8 dB), we note that SY and FW both miss the right-tail model. As σ_z increases further, FW recovers the right-tail model at the expense of missing the left tail. Figure 10 illustrates the tail probability plots as a function of n for σ_z = 2 (8 dB). The SY and LMM methods have larger errors in modeling the right tail, and the FW error increases with increasing n.
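A miniature version of the Figure 9/10 right-tail comparison can be run directly; here σ_z = 2 and n = 64 are taken from the grid in (15), and the FW fit is evaluated at an empirical 1e−4 tail point. The specific sample budget is an illustrative choice.

```python
import math
import numpy as np

def fw_params(n, sigma_z):
    """FW lognormal parameters for n iid summands Z_i ~ N(0, sigma_z), per (7)."""
    s2 = math.log((math.exp(sigma_z**2) - 1.0) / n + 1.0)
    return math.log(n) + sigma_z**2 / 2.0 - s2 / 2.0, math.sqrt(s2)

def norm_sf(x):
    """Gaussian survival function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

n, sigma_z = 64, 2.0                      # 8 dB point of the study grid
mu_fw, sig_fw = fw_params(n, sigma_z)

rng = np.random.default_rng(6)
N = 1_000_000
Y = np.zeros(N)
for _ in range(n):                        # accumulate the sum in chunks
    Y += np.exp(rng.normal(0.0, sigma_z, N))

y0 = np.quantile(Y, 1.0 - 1e-4)           # empirical 1e-4 right-tail point
p_mc = (Y > y0).mean()                    # ~1e-4 by construction
p_fw = norm_sf((math.log(y0) - mu_fw) / sig_fw)
print(p_mc, p_fw)
```

At this σ_z the FW tail probability at y0 comes out well below the empirical one, that is, the fitted lognormal's right tail is far too light, consistent with the right-tail errors reported for Figures 9 and 10.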

Case Study: Leak-Down Time Comparisons
In this section, we extend the analysis to study the impact of the different modeling schemes, compared to the CDF matching method, on the probability-of-fail estimations. Figure 11 illustrates a summary of the δV_Teqv distribution approximations based on the relation between δV_Teqv and Y in (2); except for the CDF matching method, δV_Teqv is modeled as a Gaussian random variable ∼ N(μ_eqv, σ_eqv). The distributions are then used to analyze the leak-down of RBL in the circuit of Figure 12. The time for the bitline to discharge to 50% of the rail under an extreme noisy corner voltage is then estimated. Figures 13, 14, and 15 illustrate the normalized time-to-leak distributions for the case of 16 cells/bitline. Here s_0 represents a lower limit on the threshold voltage standard deviation of a 45 nm cell device; its value is set to the equivalent of σ_z = 0.6 (or 2.5 dB). This is conservative, especially since as technology scales we expect more variability, and additional sources of variability like random telegraph noise can add to the V_T variation; cases for standard deviations of 1.3s_0 and 1.6s_0 are also plotted. The SY, FW, and LMM methods overestimate the time-to-leak by 10% for s_0 and by close to 100% for 1.6s_0 (see the horizontal arrow in the figures; this corresponds to a system yield around 4.5 sigma). More importantly, this leads to underestimating the probability of the number of elements failing at a given leak-time. Note that time-to-leak values can be critical relative to operating frequencies, and accurate prediction is needed for robust designs. Thus, we are interested in computing the ratio of the probability of fails (vertical arrows in the figures) for predicted time-to-leak values. Figure 16 summarizes the ratio of the true (CDF) probability of fail to that of the other methods (SY, FW, and LMM) at their 4.5-sigma yield leak-time.

Figure 14: Normalized time-to-leak probability plot for σ_VT ∼ 1.3s_0.
Each experiment is based on the average of 25 × 1 million replications. This is done for the cases of 16 and 64 cells/bitline and at increments of 0.1·s_0. We note that the SY, FW, and LMM methods underestimate the probability of fail by 10× to 147×.
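The mechanism behind these ratios can be sketched in miniature. Taking the normalized time-to-leak as t ∝ 1/I_sum (an illustrative normalization; the leakage constants a and σ_0 below are likewise assumed, not the paper's extracted 45 nm values), we compare the fail probability P(t < t0) at a rare leak-time point from raw Monte Carlo (the CDF-matching reference) against the FW lognormal stand-in for I_sum.

```python
import math
import numpy as np

# Illustrative constants (assumed): 16 cells/bitline, slope a, sigma0 such
# that sigma_z = |a|*sigma0 = 1.5 (~6.5 dB, within the critical range).
rng = np.random.default_rng(7)
n, a, sigma0 = 16, -25.0, 0.06
sigma_z = abs(a) * sigma0

N = 1_000_000
I_sum = np.zeros(N)
for _ in range(n):                                 # sum of lognormal leakages
    I_sum += np.exp(a * rng.normal(0.0, sigma0, N))
t = 1.0 / I_sum                                    # normalized time-to-leak

# FW stand-in: I_sum ~ lognormal(mu_fw, sig_fw) per (7).
s2 = math.log((math.exp(sigma_z**2) - 1.0) / n + 1.0)
mu_fw = math.log(n) + sigma_z**2 / 2.0 - s2 / 2.0
sig_fw = math.sqrt(s2)

t0 = np.quantile(t, 1e-4)                          # rare "fails by t0" point
p_mc = (t < t0).mean()                             # ~1e-4 by construction
# P(t < t0) = P(ln I_sum > -ln t0) under the FW fit.
p_fw = 0.5 * math.erfc((-math.log(t0) - mu_fw) / (sig_fw * math.sqrt(2.0)))
print(p_mc, p_fw, p_mc / p_fw)
```

The FW-based estimate of the fail probability comes out more than an order of magnitude below the Monte Carlo reference at this operating point, the same underestimation direction (if not the exact 10×-147× magnitudes) reported in Figure 16.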

Conclusions
We study the ability of different sum-of-lognormals approximations to emulate the leakage of multiple leaky devices by a single equivalent device. With the goal of rare event estimation, tail distributions are examined closely. Modeling the tail probability by CDF matching is found to be critical compared to the Fenton-Wilkinson and Schwartz-Yeh methods, which are found to underestimate the tail of the sum of lognormals and hence underestimate the fail probability by 10× to 147×; this trend is expected to worsen as variability increases with technology scaling.