Backward Propagated Capacitance Model for Register Transfer Level Power Estimation

We present a new approach to the power modeling of functional modules, referred to as
the backward propagated capacitance model, for estimating the power consumption of
VLSI systems that are described at the register transfer level (RTL). To construct the
proposed model, we investigate the effect of the module's internal capacitance on power
consumption at the gate level. Then, we store the effect in a library in terms of the
equivalent input capacitance of the module. The equivalent input capacitance is used to
compute the module's power during the power analysis of the RTL system, without
lower-level elaboration. In experiments with benchmark functional modules, the
proposed model showed an average absolute modeling error of 1.39%. For the
benchmark RTL systems, it exhibited an average absolute error of 3.04% in
power estimation. If the signal characteristics deviate from the modeling
conditions, the modeling error may increase. Experimental results show that the
modeling accuracy can be improved greatly by using a simple compensation method.


INTRODUCTION
Today's VLSI systems consume a large amount of electrical power, as exemplified by recent high-performance microprocessors that consume tens of watts [1][2][3]. As systems become faster and their functions more complex, power consumption tends to grow. Thus, low power has become one of the important design issues for current VLSI systems [4]. Power minimization is stressed at all levels of the design hierarchy, and power consumption needs to be minimized wherever possible [5][6][7][8][9]. However, because the system architecture can be explored easily at high levels such as the register transfer level (RTL), a design can be optimized more effectively there. Thus, it is very meaningful to develop accurate and efficient RTL power estimation techniques that can be used early in the design process.
* Corresponding author. Tel.: +82-54-279-2227, Fax: +82-54-279-5933, e-mail: youngk@postech.ac.kr. Other authors' e-mail: jychoi@cafri.postech.ac.kr, krcho@cbucc.chungbuk.ac.kr
For power estimation, RTL systems are divided into combinational and control logic circuits. The proposed approach is applicable to a variety of combinational logic circuits. For combinational logic circuits, most effort has gone into developing accurate and efficient techniques to estimate the power consumption of functional modules for various input signal characteristics [10]. Early techniques are based on the concept of the gate equivalent, which is the average number of reference gates required to implement a specific functional module [11,12].
The power consumption of the functional module is estimated as the product of the gate equivalent and the average power consumption of the reference gate. This approach is very efficient. However, Glaser et al. use the data of a single reference gate and do not take the module type into account [11]. In addition, they use fixed activity factors for the input signals, independent of their patterns. Thus, the method suffers from large analysis error. In an improved method [12], Svensson et al. adopt customized estimation techniques for functional modules of different types. However, the method still uses fixed activity factors.
In contrast, recent approaches construct a power model for each type of functional module [13]; i.e., they analyze the functional modules through simulation and store the information on power consumption in a library. Then, they refer to the library when analyzing RTL systems designed using the functional modules. Among them, the power factor approximation method [14] uses uniform white noise (UWN) as the input signal for simulation. Thus, it exhibits an estimation error of up to 80% in comparison with gate-level estimation results [15]. In the DBT power model [16, 17], Landman et al. divide the input data into two regions, the most significant bit (MSB) region and the least significant bit (LSB) region. Then, in the simulation to extract model parameters, they apply input signals with strong temporal dependency to the MSB region, while applying UWN to the LSB region. With this simple method, they improved the modeling accuracy greatly. The DBT power model has 10-15% error compared with the results of the switch-level simulator IRSIM-CAP [17]. However, the DBT power model is more suitable for DSP applications than for other applications, and users need to provide the power models of the functional modules as analytic expressions that depend on the module functionality. This is a big burden on users.
In [18][19][20], Nemani et al. proposed to estimate the power consumption using only the functional description of the system, such as Boolean equations. The advantage is that no library needs to be built in advance. However, these approaches may lead to inaccurate analysis results, because they need assumptions and approximations to find the average activity and the area of the functional module before implementation. Nemani et al. reported that their approach has 33.7% error compared with gate-level estimation results.
In this paper, we present a new RTL power model, referred to as BPCM (Backward Propagated Capacitance Model), which represents the power behavior of the functional module as the equivalent input capacitance for various input signal characteristics. The proposed model is constructed as follows. First, we expand the given RTL module into the gate level. Then, we visit the internal nodes of the module backward, beginning from the output. During the visit, we move the capacitance at each internal node in the direction of the input, while keeping the power consumption the same. In this process, we use the internal signal information obtained through gate-level simulation with the full delay model. This is referred to as the backward propagation of the internal capacitance. When the input nodes are reached, all internal capacitance is represented as the equivalent input capacitance of the module, which is stored in a library for future use.
Since the proposed model is constructed by investigating the effect of the internal capacitance on power, it can characterize the module's power consumption more accurately than existing library-based models [14,16,17,21] over a wide range of input signal characteristics. In addition, the model parameters of a multi-input module are extracted for each input node, one by one and independent of the other inputs. Thus, the module power can be characterized through simple procedures, and can be stored in a one-dimensional table using a small amount of memory. This good trade-off between complexity and accuracy makes the proposed model attractive for power analysis and optimization at a high level of abstraction.
This paper is organized as follows. In Section 2, we describe the principle and embodiment of the proposed power model, BPCM. In Section 3, we present the application of BPCM to power analysis and experimental results, and, in Section 4, we conclude the paper.
In Figure 1(a), C1 to C4 are the input capacitances, C5 and C6 are the internal capacitances, and C7 is the output capacitance of the module.
The principle of the proposed model, BPCM, is to represent the internal capacitances, C5 and C6, as input capacitance of the module that is equivalent in power consumption, as shown in Figure 1(b). In the figure, C5 and C6 have been removed. Instead, each input capacitance $C_i$ has been replaced by $C_i'$, which represents the sum of $C_i$ and the equivalent input capacitance, $\Delta C_i$, coming from C5 and C6. Note that, unlike the internal capacitance, the output capacitance C7 remains untouched, as the output node appears in the RTL description of the system. With the equivalent input capacitance, the power consumption of the module is computed as follows:
$$P = 0.5 \times V^2 \times f \times \left( \sum_{i=1}^{n} S_i C_i' + \sum_{o=1}^{m} S_o C_o \right) \qquad (1)$$
where V is the power supply voltage and f is the clock frequency synchronizing the RTL system. Here, n and m are the numbers of inputs and outputs of the module, $S_i$ is the switching activity at input i, and $S_o$ and $C_o$ are the switching activity and node capacitance at output o, respectively. The switching activity is the average number of 0→1 or 1→0 transitions that a logic signal makes per clock period [22].
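As a minimal sketch of how the power expression above could be evaluated, the following Python function sums the switched capacitance at the inputs (using the equivalent input capacitance) and at the outputs, and scales it by 0.5 V² f. The function name `module_power` and all numeric values are illustrative assumptions, not data from the experiments.

```python
def module_power(vdd, freq, inputs, outputs):
    """Evaluate the module power from equivalent input capacitances.

    inputs:  list of (switching_activity, equivalent_input_capacitance) pairs
    outputs: list of (switching_activity, output_capacitance) pairs
    Capacitances are in farads, vdd in volts, freq in hertz.
    """
    switched_cap = sum(s * c for s, c in inputs) + sum(s * c for s, c in outputs)
    return 0.5 * vdd ** 2 * freq * switched_cap


# Hypothetical module: two inputs, one output, 3.3 V supply, 100 MHz clock.
inputs = [(0.5, 30e-15), (0.5, 25e-15)]   # (S_i, C_i')
outputs = [(0.4, 50e-15)]                 # (S_o, C_o)
power = module_power(3.3, 100e6, inputs, outputs)
print(f"{power * 1e6:.2f} uW")
```

Because the internal capacitance is already folded into the input terms, no internal node appears in the computation.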

Derivation of the Proposed Model
The amount of power that a functional module consumes depends not only on the switching activity but also on the signal probability at the input nodes, where the signal probability is the probability that the logic state of the signal is '1'. Thus, the equivalent input capacitance at input i, $\Delta C_i$, coming from the internal capacitance is given as follows:
$$\Delta C_i = f(P_1, P_2, \ldots, P_n, S_1, S_2, \ldots, S_n), \quad i = 1, \ldots, n \qquad (2)$$
where $P_i$ and $S_i$ are the signal probability and switching activity at input node i of the functional module, respectively. From (2), ideally, we need to characterize the power behavior of the functional module for various combinations of signal probabilities and switching activities, preparing for all the cases that can happen during actual operation. However, in that case, the modeling equations or the look-up table will be (2 × n) dimensional, and extensive simulation will be required to obtain the necessary data on the functional module. Thus, the method will be too complex to be practical. Considering this, most approaches [14, 20, 21] assume that the input signals are UWN. However, these input vectors cannot represent all important input conditions that the functional module would see during actual operation.
To derive the proposed approach, we assume that there are no glitches and no temporal correlation in the input signals of the functional module. Under this assumption, the following relationship holds [22]:
$$S_i = 2 P_i (1 - P_i) \qquad (3)$$
where $P_i$ and $S_i$ are the signal probability and the switching activity at input node i. Substituting (3) into (2) yields
$$\Delta C_i = f(P_1, P_2, \ldots, P_n), \quad i = 1, \ldots, n \qquad (4)$$
Thus, the power characterization becomes n dimensional. The compensation for the cases that do not satisfy (3) will be described in Section 2.5.
Although the above method reduces the modeling dimension from 2n to n, (4) is still too complex for practical use. Therefore, when extracting $\Delta C_i$, we set $P_j = 0.5$, $j \neq i$, to reduce the modeling complexity further. Then, $\Delta C_i$ is represented in the following one-dimensional form:
$$\Delta C_i = f(P_i), \quad i = 1, \ldots, n, \quad \text{and } P_j = 0.5, \ j \neq i \qquad (5)$$
From (5), we use the following input vectors for the simulation to obtain the information for modeling: for input i, we use input vectors with various signal probabilities between 0 and 1, and for the other inputs, we apply input vectors with the signal probability of 0.5.
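The characterization stimulus described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`uncorrelated_stream`, `signal_statistics`, `characterization_vectors`), the stream length, and the seed are hypothetical, and bits are drawn independently so that there is no temporal correlation. For such streams the measured switching activity should be close to 2P(1 - P).

```python
import random


def uncorrelated_stream(p_one, length, rng):
    """Draw independent bits that are '1' with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(length)]


def signal_statistics(bits):
    """Return (signal probability, switching activity) of a bit stream."""
    prob = sum(bits) / len(bits)
    toggles = sum(a != b for a, b in zip(bits, bits[1:]))
    return prob, toggles / (len(bits) - 1)


def characterization_vectors(n_inputs, target, p_target, length, seed=0):
    """Input `target` gets probability p_target; all other inputs get 0.5."""
    rng = random.Random(seed)
    return [
        uncorrelated_stream(p_target if i == target else 0.5, length, rng)
        for i in range(n_inputs)
    ]


streams = characterization_vectors(3, target=0, p_target=0.3, length=100_000)
prob, activity = signal_statistics(streams[0])
print(f"P = {prob:.3f}, S = {activity:.3f}, 2P(1-P) = {2 * prob * (1 - prob):.3f}")
```

Sweeping `p_target` over a grid of probabilities, one input at a time, would give the one-dimensional data set that the model needs for each input.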

Backward Propagation of the Capacitance
The basic operation to obtain the equivalent input capacitance of the functional module from internal capacitance is the backward propagation of the capacitance, which converts the output capacitance of a gate to its equivalent input capacitance. This is illustrated in Figure 2, where both circuits show the same gate, before and after propagating C3 backward. Notice that, after the backward propagation, the capacitance C3 has been removed. Instead, the equivalent capacitance has been added at the gate inputs, shown as C1 and C2. To keep the power consumption of both circuits the same, we have to conserve the total switched-capacitance after the backward propagation. Let $S_i$ represent the switching activity of the signal at node i. Then, for the circuit of Figure 2(a), the switched-capacitance is given as follows:
$$SWC(a) = S_3 \times C_3 \qquad (6)$$
Assume that there are no glitches in the input, the input transition probability is evenly distributed in time, and the gate has zero delay. Then, for the 2-input AND gate, the following relationship holds:
$$S_3 = S_1 \times P_2 + S_2 \times P_1 \qquad (7)$$
where $P_i$ is the signal probability at node i. Thus, the switched-capacitance is given as follows:
$$SWC(a) = (S_1 \times P_2 \times C_3) + (S_2 \times P_1 \times C_3) \qquad (8)$$
On the other hand, the total switched-capacitance of the circuit of Figure 2(b) is given as follows:
$$SWC(b) = S_1 \times C_1 + S_2 \times C_2 \qquad (9)$$
Thus, the comparison of (8) and (9) leads to the following equivalent input capacitance:
$$C_1 = P_2 \times C_3 \qquad (10a)$$
$$C_2 = P_1 \times C_3 \qquad (10b)$$
The above equations show that the equivalent input capacitance depends on the input signal probabilities. In addition, the relationship of the signal characteristics, shown in (7), becomes more complex as the gate has more input nodes, and it is affected by the gate functionality. As this complicates the backward propagation of the capacitance, we approximate all input signal probabilities as being the same to improve practicality.
Then, from (10a) and (10b), we obtain $C_1 = C_2$. Let $M = C_1 = C_2$. The total switched-capacitance of the circuit of Figure 2(b) is given as follows:
$$SWC(b) = S_1 \times M + S_2 \times M = (S_1 + S_2) \times M \qquad (11)$$
From (6) and (11),
$$M = \frac{S_3 \times C_3}{S_1 + S_2} \qquad (12)$$
By extending the above idea, we model all inputs of a general n-input gate as having the same equivalent capacitance after the backward propagation of the output capacitance. Thus, the conservation of the switched-capacitance leads to the following formula:
$$M = \frac{S_o \times C_o}{\sum_{i=1}^{n} S_i} \qquad (13)$$
where M is the equivalent capacitance at each input node, $S_i$ is the switching activity at the ith input of the gate, and $S_o$ and $C_o$ are the switching activity and node capacitance at the gate output.
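The single-gate propagation step can be illustrated with a short Python sketch. It assumes the rule stated above: every gate input receives the same equivalent capacitance M, chosen so that the switched capacitance of the gate output is conserved. The function name `backward_propagate` and the numbers are hypothetical.

```python
def backward_propagate(output_activity, output_cap, input_activities):
    """Return the equivalent capacitance M added to every gate input.

    M is chosen so that sum(S_i) * M equals S_o * C_o, i.e. the switched
    capacitance of the gate output is conserved.
    """
    total_input_activity = sum(input_activities)
    if total_input_activity == 0:
        # Inputs that never switch cannot carry any switched capacitance.
        return 0.0
    return output_activity * output_cap / total_input_activity


# Hypothetical 2-input AND gate with P1 = P2 = 0.5 and S1 = S2 = 0.5.
p1 = p2 = 0.5
s1 = s2 = 0.5
s3 = s1 * p2 + s2 * p1          # output activity under the zero-delay assumption
c3 = 10e-15                     # output node capacitance in farads
m = backward_propagate(s3, c3, [s1, s2])
print(f"M = {m * 1e15:.1f} fF")
```

With equal input probabilities, M coincides with the per-input values P2 × C3 and P1 × C3 of the exact AND-gate result, which is why the approximation is exact in this particular case.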

Construction of the Proposed Model
For a given functional module, we construct the proposed model by obtaining the equivalent input capacitance at each input node, one by one. For the modeling, we need the switching activities at the internal nodes, which are obtained through the gate-level simulation of the functional module with the full delay model. For the simulation, we use the input vectors generated as described at the end of Section 2.2.
To obtain the equivalent capacitance at the input nodes of the functional module, we need to visit the internal nodes backward from the output nodes systematically, and propagate the internal capacitance backward until we reach the input nodes. For this, we define the levels of the internal nodes of the functional module as follows, and visit them in decreasing order of level, where level(i) is the level of node i:
$$\mathrm{level}(i) = \max_{j} \bigl( \mathrm{level}(j) + 1 \bigr), \quad \forall j \qquad (14)$$
In (14), j ranges over the fan-in nodes of i, and the level of all input nodes of the module is defined as 1.
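The level-ordered traversal can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the netlist is assumed to be a dictionary mapping each node to its fan-in nodes (empty for module inputs), each non-input node is assumed to be driven by a single gate, and the node names, capacitances, and switching activities are made-up values.

```python
def node_levels(fanin):
    """Level 1 for module inputs, otherwise 1 + the maximum fan-in level."""
    levels = {}

    def level(node):
        if node not in levels:
            preds = fanin[node]
            levels[node] = 1 if not preds else 1 + max(level(p) for p in preds)
        return levels[node]

    for node in fanin:
        level(node)
    return levels


def equivalent_input_caps(fanin, cap, activity, outputs):
    """Push internal node capacitance back to the module inputs.

    Returns the extra (equivalent) capacitance collected at each input.
    Module output nodes are left untouched.
    """
    levels = node_levels(fanin)
    work = dict(cap)
    for node in sorted(fanin, key=levels.get, reverse=True):
        preds = fanin[node]
        if not preds or node in outputs:
            continue
        total = sum(activity[p] for p in preds)
        share = activity[node] * work[node] / total if total else 0.0
        for p in preds:
            work[p] += share          # same share for every gate input
        work[node] = 0.0              # capacitance has been moved backward
    return {n: work[n] - cap[n] for n in fanin if not fanin[n]}


# Hypothetical module: n1 = g(a, b), n2 = g(n1, c), y = g(n2, d).
fanin = {"a": [], "b": [], "c": [], "d": [],
         "n1": ["a", "b"], "n2": ["n1", "c"], "y": ["n2", "d"]}
cap = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 2.0, "n1": 4.0, "n2": 5.0, "y": 6.0}  # fF
activity = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5, "n1": 0.4, "n2": 0.45, "y": 0.3}
delta_c = equivalent_input_caps(fanin, cap, activity, outputs={"y"})
print(delta_c)
```

Visiting nodes in decreasing order of level guarantees that a node has received all capacitance from its fan-out before its own capacitance is moved, so the switched capacitance of the internal nodes is preserved at the inputs.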
After obtaining the equivalent capacitance values at each input node for various signal probabilities, we store them in a library in the form of the following polynomial after regression [23]:
$$C_i = a_n P^n + a_{n-1} P^{n-1} + \cdots + a_1 P + a_0 \qquad (15)$$
where P is the signal probability at the input node and $a_j$ is the polynomial coefficient. In (15), n is determined considering the accuracy and efficiency required. An example is shown in Figure 3, where the thick line with lozenges is the equivalent input capacitance at "carry in" of a 1-bit full adder. In this example, we obtained the equivalent input capacitance at "carry in" for signal probabilities from 0 to 1 in steps of 0.1, and then represented the capacitance using a second-order polynomial. This BPCM polynomial is stored in a library, and is used to retrieve the equivalent input capacitance at "carry in" of the 1-bit full adder when analyzing the power consumption of the VLSI system.
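The regression and look-up step might be sketched as below. The sketch fits a polynomial by ordinary least squares (normal equations solved with Gaussian elimination) and evaluates it with Horner's rule; the regression method of [23] may differ. The sample capacitance values are synthetic, generated from a made-up quadratic, and are not taken from Figure 3.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [a0, a1, ..., a_degree]."""
    size = degree + 1
    # Normal equations (X^T X) a = X^T y with X[k][j] = xs[k] ** j.
    matrix = [[sum(x ** (r + c) for x in xs) for c in range(size)]
              for r in range(size)]
    rhs = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(matrix[r][col]))
        matrix[col], matrix[pivot] = matrix[pivot], matrix[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, size):
            factor = matrix[row][col] / matrix[col][col]
            for k in range(col, size):
                matrix[row][k] -= factor * matrix[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in reversed(range(size)):
        tail = sum(matrix[row][k] * coeffs[k] for k in range(row + 1, size))
        coeffs[row] = (rhs[row] - tail) / matrix[row][row]
    return coeffs


def evaluate(coeffs, p):
    """Evaluate a0 + a1*p + ... with Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * p + a
    return result


# Synthetic samples (fF) at P = 0.0, 0.1, ..., 1.0 from a made-up quadratic.
probs = [k / 10 for k in range(11)]
caps = [12.0 + 8.0 * p - 6.0 * p * p for p in probs]
coeffs = fit_polynomial(probs, caps, degree=2)
print([round(a, 6) for a in coeffs], round(evaluate(coeffs, 0.5), 6))
```

Only the degree + 1 coefficients need to be stored per input node, which is what keeps the library small.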

Effect of the Switching Activity on Power Consumption in BPCM
During the derivation of the proposed model, BPCM, we assumed that there are no glitches and no temporal correlation in the input signals of the functional module, i.e., that the following relationship holds:
$$S_i = 2 P_i (1 - P_i) \qquad (16)$$
where $P_i$ and $S_i$ are the signal probability and switching activity at input node i. Let $\Delta C_i$ denote the equivalent input capacitance of the functional module at input i, coming from the internal capacitance. Then, the power consumed by the internal capacitance at input i is given as follows:
$$PW_i = 0.5 \times V^2 \times f \times (\Delta C_i \times S_i) = V^2 \times f \times \Delta C_i \times \{P_i (1 - P_i)\} \qquad (17)$$
However, in general, the assumption of (16) does not hold in VLSI systems, and this may result in inaccurate power estimation. The power dependency on switching activity is illustrated in Figure 4, which compares the power consumed by the internal capacitance of a 4-bit × 4-bit multiplier for various signal probabilities, while changing the switching activity. The data was obtained through gate-level analysis with the full delay model for gates. In the figure, large bullets represent the points where the relationship of (16) holds. The figure shows that the power consumption increases almost linearly with the switching activity, and that the slopes are similar regardless of the value of the signal probability. This is a common tendency in most functional modules, although they exhibit different slopes. Thus, we compensate the power consumption by $\Delta C_i$ at input i for switching activity $S_i'$ as follows:
$$PW_i(S_i') = PW_i + \alpha \times (S_i' - S_i) \qquad (18)$$
where $\alpha$ is the slope of the power variation with the switching activity, and $PW_i$ is the power consumed at input i for the signal of the switching activity $S_i$, given in (17). Since different signal probabilities exhibit similar values of $\alpha$, we use the value of $\alpha$ at the signal probability of 0.5 to compute $PW_i(S_i')$ in (18). After obtaining the values of $\alpha$ for all input nodes of the functional module, we store them in a library along with the BPCM polynomials.
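The compensation can be sketched in Python under one explicit assumption: that it is linear, adding the slope times the deviation of the observed switching activity from the nominal value 2P(1 - P). This follows the near-linear trend described for Figure 4, but the exact published form of the formula should be checked against the original paper. The function names, the slope value, and the operating point are hypothetical.

```python
def nominal_activity(p):
    """Switching activity of a glitch-free, temporally uncorrelated signal."""
    return 2.0 * p * (1.0 - p)


def internal_power(vdd, freq, delta_c, p):
    """Power drawn by the equivalent input capacitance at nominal activity."""
    return 0.5 * vdd ** 2 * freq * delta_c * nominal_activity(p)


def compensated_power(vdd, freq, delta_c, p, actual_activity, slope):
    """Assumed linear correction: add slope * (observed - nominal activity)."""
    deviation = actual_activity - nominal_activity(p)
    return internal_power(vdd, freq, delta_c, p) + slope * deviation


# Hypothetical input: 20 fF equivalent capacitance, 3.3 V, 100 MHz, P = 0.5.
base = internal_power(3.3, 100e6, 20e-15, 0.5)
# Observed activity 0.7 instead of the nominal 0.5; slope in watts per unit activity.
corrected = compensated_power(3.3, 100e6, 20e-15, 0.5, 0.7, slope=1e-5)
print(f"{base * 1e6:.3f} uW -> {corrected * 1e6:.3f} uW")
```

When the observed activity equals the nominal value, the correction term vanishes and the uncompensated estimate is returned unchanged.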

APPLICATION TO POWER ANALYSIS AND EXPERIMENTAL RESULTS
To analyze the power consumption of the RTL system, we first simulate the given system using the user-specified input vectors, and obtain the signal information at the inputs of all functional modules in the system. Next, using the signal information, we obtain the power consumption of each module in the system by referring to the BPCM library. Then, the sum of the power consumption of all functional modules becomes the system's power consumption. This process is illustrated in Figure 5.
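The analysis flow could look like the following Python sketch. The library layout (a dictionary keyed by module type and input pin that holds polynomial coefficients), the module records, and every number are hypothetical placeholders; in practice the signal probabilities and switching activities would come from the RTL simulation.

```python
def poly(coeffs, p):
    """Evaluate a0 + a1*p + a2*p^2 + ... with Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * p + a
    return result


def system_power(vdd, freq, modules, library):
    """Sum the power of all modules using library look-ups only."""
    total = 0.0
    for module in modules:
        switched_cap = 0.0
        for pin, (prob, activity) in module["inputs"].items():
            # Equivalent input capacitance for this pin at its signal probability.
            c_eq = poly(library[(module["type"], pin)], prob)
            switched_cap += activity * c_eq
        for activity, c_out in module["outputs"]:
            switched_cap += activity * c_out
        total += 0.5 * vdd ** 2 * freq * switched_cap
    return total


# Hypothetical library: coefficients [a0, a1, a2] in farads per input pin.
library = {
    ("add", "a"): [10e-15, 8e-15, -8e-15],
    ("add", "b"): [10e-15, 8e-15, -8e-15],
}
adder = {
    "type": "add",
    "inputs": {"a": (0.5, 0.5), "b": (0.5, 0.5)},   # pin: (P, S) from RTL simulation
    "outputs": [(0.5, 16e-15)],                     # (S_o, C_o)
}
total = system_power(2.0, 100e6, [adder, adder], library)
print(f"{total * 1e6:.2f} uW")
```

No gate-level netlist is touched during the analysis; only the per-pin coefficients and the simulated signal statistics are needed.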
We evaluated the accuracy of BPCM using DesignPower, a gate-level power analysis program from Synopsys. As test circuits, we used benchmark functional modules, and RTL systems designed using the functional modules. To calculate the power of the test circuits using DesignPower, we used the signal information in the circuits, which we obtained through gate-level simulation with the full delay gate model. As the simulation input, we used a sequence of 10,000 input vectors with a data period of 10 ns.
FIGURE 5 The process of power analysis using the proposed model, BPCM (input vectors, RTL simulation, signal information, BPCM library).
Table I shows the modeling error of BPCM for various functional modules, found in the Synopsys DesignWare library and the ISCAS'85 combinational benchmark circuits. The table shows that the proposed model, BPCM, exhibits an absolute modeling error of less than 1.4% on average, and an error of about 1.6% for C3540, which is large enough to contain 1,788 gate cells.
Table II shows the analysis error of BPCM for the RTL systems designed using the functional modules presented in Table I. The table shows that the analysis error of BPCM for RTL systems is similar to its modeling error for the functional modules. In this case, however, the error increased slightly, because we obtained the necessary signal information of the system through RTL simulation, which uses the zero delay model for the functional modules, while DesignPower used the full delay model for logic gates.
The circuit for four rules of arithmetic, in Table II, contains 120 functional modules. Figure 6 compares the power estimates for each functional module by BPCM and DesignPower. In the figure, the X and Y axes represent the power estimates by DesignPower and BPCM, respectively, and each dot corresponds to a functional module. If the power estimates by BPCM were exactly the same as those by DesignPower, all dots would lie on the 45° dotted line running from the bottom left corner to the top right corner.
From Figure 6, it is observed that BPCM consistently provides accurate power estimates for all functional modules inside the circuit.
We characterize the power consumed at each input node of the given functional module, one by one. For the characterization, we fix all input signal probabilities at 0.5 except for the input under characterization. However, the input signals may deviate from this condition, and Figure 7 illustrates the modeling error of BPCM for this case. For the experiment, we chose a 4-bit CSA (Carry Select Adder) and a 16-bit SUB (Subtractor) as test circuits, and changed the signal probabilities of all input signals together between 0.1 and 0.9.
As expected, the error is smallest when the signal probabilities are 0.5, and it increases as the signal probabilities deviate from 0.5. In the worst case, the error becomes as large as 8% when all input signal probabilities are 0.1 or 0.9. However, this seldom happens during actual operation, and it has been reported that most of the input bits except the sign bit are very close to UWN in nature [24].
In Figure 8, we show the modeling error of BPCM for various signal probabilities of the sign bit, while fixing the other signal probabilities at 0.5. From the figure, it is observed that the modeling error of BPCM is less than 1.6% over a wide range of the signal probability.
In deriving BPCM, we assumed that the relationship of (3) holds between the signal probability and the switching activity of input signals. However, some input signals may not satisfy the relationship, and in this case, we compensate the power consumption for the switching activity using the formula of (18). Figure 9 compares the modeling error of BPCM before and after the compensation. For the experiment, we changed the switching activities of all input signals together from 0.1 to 1.2, while fixing their signal probabilities at 0.5. In the figure, the dotted lines and solid lines represent the modeling error of BPCM before and after the compensation, respectively. Similarly, Figure 10 shows the improvement of modeling accuracy by the compensation when only the sign bit does not satisfy the relationship of (3). In both cases, it is observed that the compensation of (18) reduces the modeling error of BPCM greatly.
FIGURE 9 Modeling error before and after compensating for switching activity when all input signals do not satisfy the relationship of (3).
FIGURE 10 Modeling error before and after compensating for switching activity when the sign bit does not satisfy the relationship of (3). (X axis: switching activity, 0.1 to 1.2; Y axis: error [%]; curves: 16-bit SUB before and after compensation.)

CONCLUSION
In this paper, we presented an accurate power model, called BPCM, that can be used to analyze the power consumption of an RTL system without low-level elaboration. The proposed model represents the power consumption of the internal capacitance of the functional module as its equivalent input capacitance. The advantage of the proposed model is that it can accurately characterize the power consumption of functional modules over a wide range of input signal characteristics. In addition, as the model parameters are extracted for each input node individually, independent of the other inputs, the power library can be constructed through simple characterization procedures, and the model parameters are stored in a one-dimensional table using a small amount of memory. Experimental results show that BPCM has an average absolute modeling error of 1.39% for the benchmark functional modules when compared with the gate-level power estimator, DesignPower. For the benchmark RTL systems, BPCM exhibited an average absolute analysis error of 3.04%. For the derivation of BPCM, we assumed that there are no glitches and no temporal correlation in the input signals of the functional module. However, the input signals may not satisfy these conditions. Experimental results showed that the compensation using (18) improves the modeling accuracy of BPCM greatly. From the experiments, it is concluded that the proposed model, BPCM, can be used to estimate the power consumption of RTL systems accurately.

TABLE I
Modeling error of BPCM for functional modules. As the input, we used UWN. In the table, Error is defined as follows:

TABLE II
Analysis error of BPCM for RTL systems (UWN input)
FIGURE 6 Power estimates comparison on the functional modules inside the circuit for four rules of arithmetic.
FIGURE 7 Modeling error for 4-bit CSA and 16-bit SUB when varying the signal probabilities of all input bits together.