A design scenario examined in this paper assumes that a circuit has been designed initially for high speed, and it is redesigned for low power by downsizing of the gates. In recent years, as power consumption has become a dominant issue, new optimizations of circuits are required for saving energy. This is done by trading off some speed in exchange for reduced power. For each feasible speed, an optimization problem is solved in this paper, finding new sizes for the gates such that the circuit satisfies the speed goal while dissipating minimal power. Energy/delay gain (EDG) is defined as a metric to quantify the most efficient tradeoff. The EDG of the circuit is evaluated for a range of reduced circuit speeds, and the power-optimal gate sizes are compared with the initial sizes. Most of the energy savings occur at the final stages of the circuits, while the largest relative downsizing occurs in middle stages. Typical tapering factors for power efficient circuits are larger than those for speed-optimal circuits. Signal activity and signal probability affect the optimal gate sizes in the combined optimization of speed and power.

Optimizing a digital circuit for both energy and performance involves a tradeoff, because any implementation of a given algorithm consumes more energy if it is executed faster. The tradeoff between power and speed is influenced by the circuit structure, the logic function, the manufacturing process, and other factors. Traditional design practices tend to overemphasize speed and waste power. In recent years power has become a dominant consideration, causing designers to downsize logic gates in order to reduce power, in exchange for increased delay. However, resizing of gates to save power is often performed in a nonoptimal way, such that for the same energy dissipation, a sizing that results in better performance could be achieved.

In this paper, we explore the energy-performance design space, evaluating the optimal tradeoff between performance and energy by tuning gate sizes in a given circuit. We describe a mathematical method that minimizes the total energy in a combinational CMOS circuit, for a given delay constraint. It is based on an extension of the Logical Effort [

In trading off delay for energy, we are interested only in a subset of all the possible downsized circuits: those implementations that are energy efficient. A design implementation is considered to be energy efficient when it has the highest performance among all possible configurations dissipating the same power [

Energy efficient curve. Although implementations 0 and 0′ of the given circuit have the same delay (

Zyuban and Strenski [

As shown in [

In [

The focus of this paper is on the conversion to low power of circuits that were optimized only for speed during their initial design process. Optimal downsizing is applied to each gate for each relaxed delay target, such that the whole energy efficient curve is generated for the circuit. Note that the gate sizes are allowed to vary in a continuous manner between a minimum and a maximum size. While the resultant gate sizes would be mapped into a finite cell library in a practical design, the continuous result for some basic circuits provides guidelines and observations about CMOS circuit design for low power.

The rest of this paper is organized as follows: The design scenario is described in Section

Typically, an initial circuit is given, where speed was the only design goal. In order to save energy, the delay constraint is relaxed, and the gate sizes are reduced. For example, consider Figure

To calculate the energy gain achievable by relaxing the delay by

For example, assuming that the initial design point in Figure

EDG and hardware intensity. Note that when (

Resizing the gates to trade off performance against active energy is the most practical approach available to the circuit engineer. Continuous gate sizes have been used for optimizing delay under area constraints and vice versa [

In the following sections, we set up an optimization framework that maximizes the energy saving for any assumed delay constraint in a given combinational CMOS circuit. It determines the appropriate sizing factor for each gate in the circuit. For the primary inputs and outputs of the circuit, we assume fixed capacitances. A given activity factor and signal probability are assumed at each node of the circuit. The result of this optimization process is equivalent to finding the energy-efficient curve for the given circuit.

The optimization problem we solve is defined as follows. Given a path in a circuit with initial delay (minimum or arbitrary)

For a given path (Figure

Example path. Each gate is assigned with logical effort notation, initial input capacitance (

The following properties are defined:

number of inputs to gate

activity factor (switching probability) of input

output activity factor of gate

logical effort of gate

parasitic delay of gate

initial capacitance of gate

off-path constant capacitance driven by gate

the average leakage power for gate

sizing factor for gate

The switching energy of a static CMOS gate

Without loss of generality, we assume that the first input of each gate resides on the investigated path. We assume that the inputs of the gates we deal with are symmetrical (input capacitance on each input pin is equal) and the gates are noncompound (i.e., gates implementing functions like

The output capacitance of a gate is defined to be its self-loading and is composed mainly of the drain diffusion capacitors connected to the output. The parasitic delay of gate

We can now rewrite (

Besides the gates in the path, we have to take into account the

Substituting input

By defining

The initial

The leakage energy of a static CMOS gate

By dividing the leakage energy by

The initial

By combining (

The energy decrease rate (

In order to estimate the upper bound of

By using (

When using the logical effort notation, the path delay (

And therefore, the delay increase rate (

Given a delay value that is

From (

To get a canonical constraint goal, in which the constraint is less than or equal to 1, we rearrange (

We now can use (

So the equivalent convex optimization problem (which can be solved using convex optimization tools) is:

The convexity of (

This result can be extended to handle circuit delay, instead of a single path delay. All paths must be enumerated, and the optimized delay should reflect the critical path delay. The critical path delay is calculated as the maximum delay of all enumerated paths. However, the MAX operator cannot be handled directly in geometric programming, since it produces a result that is not necessarily differentiable. Boyd et al. [

In the following sections, we employ this procedure to characterize the EDG and power reduction in typical logic circuits, and derive design guidelines.
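To make the kind of problem solved here concrete, the following is a minimal sketch for a plain inverter chain. It is not the geometric-programming formulation used in this paper: it swaps in a Lagrangian relaxation (exact coordinate updates for a fixed multiplier, then bisection on the multiplier until the delay target is met). The function names, the 6-stage chain, the load of 4096 times the input capacitance, the parasitic delay of 1, equal activity at all nodes, and the absence of leakage and of minimum or maximum size bounds are all assumptions made for illustration.

```python
import math

def chain_delay(caps, c_load, p=1.0):
    """Logical-effort delay of an inverter chain (g = 1 per stage), in units of tau."""
    loads = caps[1:] + [c_load]
    return sum(nxt / cur + p for cur, nxt in zip(caps, loads))

def switched_cap(caps, c_load):
    """Energy proxy: total switched capacitance, assuming equal activity at every node."""
    return sum(caps) + c_load

def size_for_multiplier(n, c_in, c_load, lam, sweeps=500):
    """Minimize sum(caps) + lam * delay for a fixed multiplier lam."""
    # Start from the minimum-delay (geometric) sizing; caps[0] stays fixed.
    caps = [c_in * (c_load / c_in) ** (i / n) for i in range(n)]
    for _ in range(sweeps):
        for i in range(1, n):
            prev = caps[i - 1]
            nxt = caps[i + 1] if i + 1 < n else c_load
            # Root of d/dc [c + lam * (c / prev + nxt / c)] = 0
            caps[i] = math.sqrt(lam * nxt * prev / (prev + lam))
    return caps

def min_energy_sizing(n, c_in, c_load, delay_increase, p=1.0):
    """Return (minimum-delay sizes, energy-optimal sizes, delay target)."""
    fastest = [c_in * (c_load / c_in) ** (i / n) for i in range(n)]
    target = (1.0 + delay_increase) * chain_delay(fastest, c_load, p)
    lo, hi = 1e-6, 1e9  # multiplier bracket: lo is too slow, hi meets the target
    for _ in range(80):
        lam = math.sqrt(lo * hi)  # bisection in log space
        caps = size_for_multiplier(n, c_in, c_load, lam)
        if chain_delay(caps, c_load, p) > target:
            lo = lam
        else:
            hi = lam
    return fastest, size_for_multiplier(n, c_in, c_load, hi), target

fastest, sized, target = min_energy_sizing(6, 1.0, 4096.0, 0.10)
saving = 1.0 - switched_cap(sized, 4096.0) / switched_cap(fastest, 4096.0)
# Per-stage sizing factor relative to the minimum-delay sizes, and the saving.
print([round(s / f, 3) for s, f in zip(sized, fastest)], round(saving, 3))
```

The printed list gives the sizing factor of each stage at a 10% delay increase, which can be compared qualitatively with the trends discussed for inverter chains in the experiments.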

We run numerical experiments that explore the EDG of some basic circuits. We use GGPLAB [

Consider a chain consisting of

Inverter chain—consists of

Figure

Inverter chain—various loads (

Energy delay gain, active dominant circuit

Figure

Inverter chain: sizing of the stages in an inverter chain. (a) Stage capacitance (chain of 6 inverters), for various delay increase rates (log scale). (b) Stage capacitance (chain of 16 inverters), for various delay increase rates (log scale). (c) Stage sizing factor (chain of 6 inverters): ratio of gate capacitance to minimum-delay capacitance needed to meet the given delay increase rate. (d) Stage sizing factor (chain of 16 inverters): ratio of gate capacitance to minimum-delay capacitance needed to meet the given delay increase rate. (e) Stage downsizing value: change in gate capacitance with respect to minimum-delay sizes needed to meet the given delay increase rate. (f) Stage electrical effort (h), for various delay increase rates.

The optimization process leads to increasing the electrical effort of the last stages and decreasing the electrical effort of the first stages, to meet the timing requirements (Figure

Both Figures

Uniform versus optimal downsizing. Uniform downsizing of an inverter chain in order to save energy by increasing the delay results in a nonoptimal design; in this case, 7% more energy could be saved by tuning the sizing correctly.

Most of the energy in the path is dissipated in the last stages of the chain, where the fanout factors are larger, in order to drive the large fixed output capacitance.
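A rough feel for this distribution can be had from the sketch below, which computes the share of the total switched capacitance at each node of a tapered chain. It assumes a constant tapering factor of 4, equal activity at every node, and no self-loading; the function name and the 6-stage example are hypothetical and chosen only for illustration.

```python
def stage_cap_shares(n_stages, taper, c_in=1.0):
    """Share of total switched capacitance at each node of a tapered inverter chain.

    Node i is the input of stage i; the final node is the fixed output load.
    """
    caps = [c_in * taper ** i for i in range(n_stages + 1)]
    total = sum(caps)
    return [c / total for c in caps]

shares = stage_cap_shares(6, 4.0)
last_two = shares[-1] + shares[-2]  # output load plus input of the last stage
print([round(s, 4) for s in shares], round(last_two, 4))
```

Under these assumptions the load and the last stage account for more than 90% of the switched capacitance, which is why the final stages dominate the absolute energy saving.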

Figure

Inverter chain, variable length: the chain length is varied in order to save a maximal amount of energy for each delay value.

Figure

Inverter chain—comparison between energy delay gain and

EDG value of various inverter chains at delay increase rate of 10%

The more active a gate is, the more energy it consumes. In order to trade off delay and energy better, active gates in the timing critical path can be downsized more than inactive gates in the critical path. For instance, consider the circuit in Figure

Activity effect on sizing. The path from a to end is timing critical, and the activity of input b is varied.

When the delay constraints of the circuit are relaxed, As _{nand}, we expect that the gates that are driven by the NAND gate will get downsized at the expense of the gates driving the NAND gate. Figure

Sizing factor to achieve 20% delay increase as

A similar observation holds for leakage dominant circuits, where the signal probability becomes the affecting parameter instead of the activity factor.

Sizing of a 6-stage inverter chain as a function of signal probability (SP). The sizing of the stages is sensitive to changes in signal probability, for both small and large delay increase values.

In order to validate the correctness of the EDG optimization algorithm, the results of Section

Figure

Comparison of run time: simulation-based versus analytical model optimization. The table compares the time taken to generate an EDG plot consisting of ten delay increase points.

Circuit | Sim-based optimization | Analytical model optimization |
---|---|---|
4-long Inverter Chain | 240 sec | 25 sec |
8-long Inverter Chain | 360 sec | 40 sec |
15-long Inverter Chain | 1100 sec | 70 sec |

Simulation optimization of inverter chain with comparison to theoretical computation.

The analytical model was calibrated by computing the parasitic delay of an inverter (

We have presented a design optimization framework that explores the power-performance space. The framework provides fast and accurate answers to the following questions.

How much power can be saved by slowing down the circuit by

How to determine gate sizes for optimal power under a given delay constraint?

We introduced the energy/delay gain (EDG) as a metric for the amount of energy that can be saved as a function of increased delay. The method was demonstrated on a variety of circuits, exhibiting good correlation with accurate simulation-based optimizations. We have shown that around 25% of the dynamic energy can be saved when the delay constraint is relaxed by 5% in an optimal way, for circuits in 32 nm technology that were initially designed for maximal operation speed. An upper bound on the power savings in a given circuit can be obtained without optimization, in order to quickly assess whether a downsizing effort may be justified for the circuit.
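As a worked illustration of these figures, assume that EDG is read as the ratio of the relative energy saving to the relative delay increase. The symbols below (initial energy and delay, and their values after resizing) are notation chosen for this illustration.

```latex
% Assumed reading of EDG: relative energy saving over relative delay increase.
% E_0, D_0: energy and delay of the initial (speed-optimized) design.
% E, D:     energy and delay after optimal downsizing.
\[
  \mathrm{EDG} \;=\; \frac{(E_0 - E)/E_0}{(D - D_0)/D_0},
  \qquad
  \text{e.g.}\quad \frac{0.25}{0.05} = 5 .
\]
```

Under this reading, a 25% energy saving for a 5% delay increase corresponds to an EDG of about 5.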

The method described in this work can be used by both circuit designers and EDA tools. Circuit designers can increase their intuition of the energy-delay tradeoff. The following rules of thumb can be derived from the experiments.

Minimum Delay Is Power Expensive. By relaxing the delay, a significant amount of dynamic energy can be saved. We have shown that, under given conditions, up to 40% of the dynamic energy of a 2-bit multiplexer can be saved when the delay constraint is relaxed by 10%.

A fixed Uniform Downsizing Factor for all Gates in the circuit would lead to an inefficient design in terms of energy. The optimal downsizing factor is not uniform.

Increase Delay by Downsizing the “Middle” Gates. In order to save energy with minimal impact on timing, the gates located in the middle (between the input and the load) are downsized the most. The downsizing factor increases as the delay constraint relaxes.

Increase Delay by Increasing the Electrical Effort towards the load. Minimum delay design requires a constant tapering factor. Typically, a “fanout of 4” is used [

Downsizing of the Gates Reduces Both Dynamic energy and Leakage Energy Dissipation. Both dynamic energy and leakage energy dissipation depend linearly on the size of the gates. By downsizing the gates, both dynamic and leakage energy are reduced.

The Power Optimization Has to Be Performed under a Given Workload. The activity factor and signal probability influence the sizing of the optimized circuit. Different tests may result in different sizing. Using random tests rather than typical tests to optimize the circuit may lead to a suboptimal design.

The authors would like to thank Yoad Yagil for his valuable inputs.