On Improving the Performance of Dynamic DCVSL Circuits



Introduction
Digital design space is filled with a variety of logic styles suitable for different applications [1][2][3][4][5][6]. Conventionally, static CMOS has predominated over the remaining styles owing to its low static power consumption. Differential Cascode Voltage Switch Logic (DCVSL) [6] is a static style that offers advantages in circuit delay, layout density, logic flexibility, and power consumption. DCVSL has been employed to develop various circuits for fault tolerance [7], ternary logic [8], micropipelining [9], delay cells [10], ring oscillators [11], capacitor neutralization [12], and so forth. It is well known that static logic styles suffer from high power consumption when the output switches its logic state, a situation that worsens with increasing clock frequencies. The performance can be improved by using the dynamic version of DCVSL, which is based on precharge-evaluation logic. Many clocked versions of the DCVSL style, named dynamic DCVSL (Dy-DCVSL) and enhanced DCVSL (EDCVSL) [13], are presented in the literature. As the speed of a dynamic circuit depends on logic tree depth and width [13], this paper proposes an architecture that reduces logic tree depth by employing transmission gates in logic function realization. Apart from speed, leakage current is another concern that becomes predominant in submicron technologies. Leakage loss occurs when the output is stable (i.e., at a low or a high level). A new architecture incorporating the leakage control technique [14] in dynamic DCVSL circuits is put forward to reduce leakage current. The features of these two architectures are combined to present a third architecture.
The paper is arranged in five sections including the present one. Section 2 briefly presents the existing dynamic DCVSL and EDCVSL circuits. Section 3 describes the proposed dynamic DCVSL circuits. The functional verification and performance of the proposal are placed in Section 4, and conclusions are drawn in Section 5.

Dynamic Differential Cascode Voltage Switch Logic Family
... transistors M1 and M2 OFF. One of the output nodes attains a low logic level depending on the value of the inputs. Cross-coupled transistors M3 and M4 help the output to switch, and transistors M5 and M6 accelerate this process [13]. The circuits of two- and three-input Dy-DCVSL XOR-XNOR gates are shown in Figures 1(b) and 1(c) as illustrations.

Conventional EDCVSL Architecture.
The generic architecture of the EDCVSL circuits is drawn in Figure 2(a), and two- and three-input EDCVSL XOR-XNOR gates are shown as examples in Figures 2(b) and 2(c). It is similar to Dy-DCVSL except for the PDN, where only one logic tree branch (OUT) is retained and the other is replaced by two stacked transistors M7 and M9. The gate of transistor M7 is connected to the CLK signal, so it is activated in the evaluation phase only. The gate of transistor M9 is connected to the output of the logic tree branch to achieve complementary operation. To elaborate the behavior, consider the schematic of Figure 2(b). In the precharge phase, both outputs are precharged and the NMOS cross-coupled transistors M5 and M6 are ON but have no effect on the outputs as M8 is OFF. In the evaluation phase, for low (high) values at both inputs, the transistors M12 and M13 (M10 and M11) are ON and therefore transistor M9 is OFF. So, OUTB remains high, M5 retains its ON state, and finally OUT becomes low. For the case when one input is high and the other is low, no path exists between OUT and the ground, so OUT remains high and OUTB goes low. The operation of the three-input EDCVSL XOR-XNOR gate is similar to that of the two-input gate and is omitted for the sake of brevity.
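The two-phase behavior described above can be summarized with a purely behavioral sketch. The Python model below is an illustration only: it ignores the individual transistors and treats every node as an ideal logic level, and the function name `edcvsl_xor2` is hypothetical rather than taken from the paper.

```python
from typing import Tuple


def edcvsl_xor2(clk: int, a: int, b: int) -> Tuple[int, int]:
    """Return ideal logic levels (OUT, OUTB) of a two-input XOR-XNOR gate."""
    if clk == 0:
        # Precharge phase: both output nodes are charged high,
        # whatever the inputs are.
        return 1, 1
    # Evaluation phase: equal inputs give the logic tree a path from
    # OUT to ground, so OUT is discharged and OUTB stays high.
    tree_conducts = (a == b)
    out = 0 if tree_conducts else 1
    outb = 1 - out  # exactly one of the two nodes is discharged
    return out, outb


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, edcvsl_xor2(0, a, b), edcvsl_xor2(1, a, b))
```

In the evaluation phase the model returns OUT = A XOR B and OUTB = A XNOR B, which matches the case analysis in the text.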

The Proposed Dynamic DCVSL Circuits
This section presents three new architectures to improve the performance of the existing Dy-DCVSL and EDCVSL circuits. The first architecture aims at speed improvement, the second works on leakage reduction, and the third combines features of the first two architectures to see their combined effect on performance.

Proposed Architecture-1 (PA-1).
Proposed architecture-1 is based on shifting the function realized by the PDN logic tree to a separate block and implementing it with transmission gate (TG) logic. The new Dy-DCVSL and EDCVSL circuits based on architecture-1 are named Dynamic TG based DCVSL (Dy-TG-DCVSL) and Dynamic TG based EDCVSL (TG-EDCVSL) circuits, respectively. A generic architecture of the Dy-TG-DCVSL circuit, along with two- and three-input XOR-XNOR realizations, is depicted in Figure 3. The PDN logic tree consists of two NMOS transistors that are controlled by the outputs of two separate blocks. The two blocks generate complementary outputs such that either M9 or M10 is ON during evaluation.
The working of the proposed Dy-TG-DCVSL XOR2 gate can be explained for the two phases. The operation in the precharge phase is the same as in conventional Dy-DCVSL. Any changes in the inputs A and B may update the gate potentials of M9 and M10 but will not affect the output, since M8 is OFF. Consequently, when CLK becomes high, the output is evaluated according to the gate potentials of M9 and M10. In comparison to the conventional Dy-DCVSL XOR-XNOR gate (Figure 1(b)), there is a speed advantage in evaluation time because the intermediate computation of the function is completed in the separate blocks just prior to the start of the evaluation phase. Also, a closer examination of the proposed architecture reveals a unique advantage: the evaluation time stays constant irrespective of the realized functionality. Similarly, placing the logic functionality of EDCVSL in a separate block leads to the proposed TG-EDCVSL architecture. The generic gate structure and the two- and three-input XOR-XNOR gates are depicted in Figure 4.
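The separation between function computation and evaluation can be illustrated with a small behavioral sketch. It is not a circuit-level model: nodes are ideal logic levels, the names `dy_tg_dcvsl`, `xor2`, and `xor3` are hypothetical, and the assignment of M9 to the OUT branch and M10 to the OUTB branch is an assumption made only for illustration.

```python
from typing import Callable, Sequence, Tuple


def dy_tg_dcvsl(clk: int, inputs: Sequence[int],
                func: Callable[..., int]) -> Tuple[int, int]:
    """Behavioral sketch of a PA-1 style gate; returns (OUT, OUTB)."""
    # The separate function blocks settle independently of CLK, so the
    # gate potentials of the two PDN transistors are ready beforehand.
    f = func(*inputs) & 1
    gate_m9 = 1 - f   # assumption: M9 discharges OUT, driven by NOT f
    gate_m10 = f      # assumption: M10 discharges OUTB, driven by f
    if clk == 0:
        # Precharge phase: input changes only update gate_m9/gate_m10.
        return 1, 1
    # Evaluation phase: the same two-transistor stage resolves the
    # outputs, whatever function the blocks implement.
    out = 0 if gate_m9 else 1
    outb = 0 if gate_m10 else 1
    return out, outb


def xor2(a: int, b: int) -> int:
    return a ^ b


def xor3(a: int, b: int, c: int) -> int:
    return a ^ b ^ c


if __name__ == "__main__":
    print(dy_tg_dcvsl(1, (1, 0), xor2))
    print(dy_tg_dcvsl(1, (1, 1, 1), xor3))
```

Swapping `xor2` for `xor3` changes only the function block, not the evaluation stage, which is the structural reason the text gives for the functionality-independent evaluation time.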

Proposed Architecture-2 (PA-2).
The differential nature of the DCVSL logic style has several advantages, but it needs attention in submicron regions. For all input combinations, one of the two logic tree branches in the PDN conducts while the other remains nonconducting. In submicron regions, the nonconducting branch carries some current through the OFF transistors in its path in both the precharge and evaluation phases. This current can be classified as leakage current. To improve the performance in the submicron region, these currents need to be minimized. Various leakage reduction techniques based on the use of sleep transistors [15] and high threshold voltage transistors [16] are available for static DCVSL circuits. These techniques require either routing of a sleep signal [15] or a complex algorithm for selecting the high threshold voltage transistors. A self-controlled technique named LECTOR [14], presented for CMOS circuits, reduces both types of currents; it is adapted here for dynamic DCVSL circuits, and the resulting topology is referred to as proposed architecture-2 (PA-2). The LECTOR technique introduces two leakage control transistors (one PMOS and one NMOS) between the PUN and the PDN of the logic gate, with the gate terminal of each leakage control transistor (LCT) controlled by the source of the other. This arrangement ensures that one of the LCTs is always in the "near-cut-off region" for any possible input combination. This increases the resistance of the path from the power supply to the ground, leading to a substantial drop in leakage currents through the path [14]. A further modification for achieving more leakage control is to use high-Vth LCTs. The architectures incorporating the LECTOR technique in Dy-DCVSL and EDCVSL circuits are shown in Figure 5. The proposed architectures add four high-Vth LCTs (LCT1-LCT4) to the basic architectures of the Dy-DCVSL (Figure 1(a)) and EDCVSL (Figure 2(a)) circuits and are called Dy-DCVSL-LCT and EDCVSL-LCT, respectively.
To understand the leakage control mechanism in Dy-DCVSL-LCT and EDCVSL-LCT circuits, the operating regions of the LCTs during the precharge and evaluation phases are examined. In the precharge phase, both OUT and OUTB are at VDD. Under this condition, it can be observed that transistor LCT2 is ON and LCT1 is in the near-cut-off state. Thus, LCT1 offers more impedance along the path and reduces the leakage current. Similar behavior can be observed when a high voltage is obtained at the output in the evaluation phase.

Simulation Results
This section presents the simulation results for the new Dy-DCVSL circuits based on the proposed architectures. The simulations are performed using the Symica tool and the PTM technology parameters for the 90 nm, 65 nm, and 45 nm nodes. The frequencies of the inputs CLK, A, B, and C are 50 MHz, 25 MHz, 12.5 MHz, and 6.25 MHz, respectively. A fan-out of four (FO4) inverters is maintained as the load in all the gates. The results are categorized into three sections according to the proposed architectures. The leakage power is computed on the basis of the leakage current and the power supply.
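Halving the frequency from CLK to A to B to C means that successive evaluation phases step through every input combination. The sketch below checks this with ideal square waves. The periods follow from the stated frequencies (20 ns, 40 ns, 80 ns, and 160 ns), but the phase alignment (each signal low in the first half of its period) and the sampling instant are assumptions, as are the names `level` and `evaluation_samples`.

```python
# Periods in ns derived from 50, 25, 12.5, and 6.25 MHz.
PERIOD_NS = {"CLK": 20, "A": 40, "B": 80, "C": 160}


def level(signal: str, t_ns: int) -> int:
    """Ideal square wave: low in the first half-period, high in the second
    (assumed phase alignment, not stated in the paper)."""
    period = PERIOD_NS[signal]
    return 1 if (t_ns % period) >= period // 2 else 0


def evaluation_samples():
    """Sample (A, B, C) in the middle of each evaluation phase (CLK high)."""
    samples = []
    cycles = PERIOD_NS["C"] // PERIOD_NS["CLK"]
    for cycle in range(cycles):
        t = cycle * PERIOD_NS["CLK"] + 15  # middle of the high half of CLK
        samples.append((level("A", t), level("B", t), level("C", t)))
    return samples


if __name__ == "__main__":
    for sample in evaluation_samples():
        print(sample)
```

Under these assumptions, one period of C contains eight evaluation phases, and each sees a different (A, B, C) combination, so a single 160 ns window exercises the full truth table of a three-input gate.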

Simulation Results with PA-1.
Dy-TG-DCVSL and TG-EDCVSL based two-input AND-NAND (AND-NAND2), three-input AND-NAND (AND-NAND3), two-input exclusive-OR (XOR-XNOR2), and three-input exclusive-OR (XOR-XNOR3) circuits are simulated using 90 nm CMOS technology parameters. The simulation waveforms of the Dy-TG-DCVSL and TG-EDCVSL XOR-XNOR2 and XOR-XNOR3 gates are shown in Figure 7. For all the gates, it can be observed that, for a low value of the CLK signal, both output nodes are precharged to VDD (= 1.8 V). The voltage changes in the input signals A and B during this phase do not affect the potential of the output nodes. In the evaluation phase, for the same value of the inputs A and B (Figure 7(a)), the output node OUT remains low. Similarly, when the inputs differ, a high voltage level is obtained at the output node OUT. Thus, the XOR functionality is achieved in the proposed Dy-TG-DCVSL and TG-EDCVSL XOR2 gates. The simulation waveform for the Dy-TG-DCVSL and TG-EDCVSL XOR-XNOR3 gates is shown in Figure 7(b). Similar waveforms were obtained for the other gates and are omitted for the sake of brevity.
The gates are also designed in the Dy-DCVSL and EDCVSL styles to analyze the speed advantage. The corresponding delay results are listed in Table 1. The results clearly indicate the speed advantage of the PA-1 based gates over their conventional counterparts. Also, the TG-EDCVSL gates are the fastest among all the logic styles. Lastly, the PA-1 based gates show almost equal delay values irrespective of the implemented functionality.

Simulation Results with PA-2.
In this category, the leakage current reduction through incorporation of the LECTOR technique in dynamic DCVSL circuits is demonstrated. An XOR-XNOR2 gate is chosen as the test bench due to its wide range of applications. The conventional Dy-DCVSL, EDCVSL, Dy-DCVSL-LCT, and EDCVSL-LCT XOR-XNOR2 gate circuits are simulated at the 90 nm, 65 nm, and 45 nm submicron technology nodes. Table 2 lists the leakage power for the conventional Dy-DCVSL, EDCVSL, Dy-DCVSL-LCT, and EDCVSL-LCT XOR-XNOR2 gates with VDD = 1.2 V.
The following observations are made from Table 2: (1) The percentage reduction in the leakage power ranges over 30.4%-56.6% for 90 nm, 32.2%-61% for 65 nm, and 33.8%-74.3% for 45 nm. (2) Leakage power tends to follow an increasing trend with the scaling down of the technology. (3) The percentage reduction increases as the technology node scales down.
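The quantities behind these observations can be sketched as two small helper functions. The sketch assumes that leakage power is the product of leakage current and supply voltage and that percentage reduction is measured relative to the conventional gate. The function names and the numbers in the usage lines are hypothetical placeholders, not values from Table 2.

```python
def leakage_power(i_leak_a: float, vdd_v: float) -> float:
    """Leakage power in watts, assumed to be I_leak * VDD."""
    return i_leak_a * vdd_v


def percent_reduction(p_conventional: float, p_proposed: float) -> float:
    """Reduction of the proposed gate relative to the conventional gate."""
    return 100.0 * (p_conventional - p_proposed) / p_conventional


if __name__ == "__main__":
    # Hypothetical leakage currents, for illustration only.
    p_conv = leakage_power(10e-9, 1.2)
    p_lct = leakage_power(4e-9, 1.2)
    print(percent_reduction(p_conv, p_lct))
```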

Simulation Results with PA-3.
Dy-TG-DCVSL, TG-EDCVSL, Dy-TG-DCVSL-LCT, and TG-EDCVSL-LCT two-input XOR2 and three-input XOR3 gates are simulated at various technology nodes. Out of the two dynamic styles, the results pertaining to EDCVSL circuits are listed in Tables 3-5 for the leakage power and delay measurements. The delays reported in Table 5 are for the 45 nm technology node. The findings can be summarized as follows: (1) A percentage reduction range of 27%-64% for 90 nm, 30%-66% for 65 nm, and 38%-76% for 45 nm in leakage power is observed. (2) The TG-EDCVSL-LCT XOR-XNOR2 gate shows less leakage power than its Dy-TG-DCVSL counterpart. (3) Leakage power tends to follow an increasing trend with the scaling down of the technology. (4) The percentage reduction in the leakage power increases at the lower technology nodes. (5) The delay of the TG-EDCVSL-LCT XOR-XNOR2 gate is greater than that of the Dy-TG-DCVSL gate due to the presence of the high resistance path for leakage current reduction, thus exhibiting a trade-off between speed and leakage power reduction.

Conclusion
In this paper, three new architectures to enhance the performance of Dy-DCVSL and EDCVSL are proposed. The first ... irrespective of the implemented functionality. A maximum leakage power reduction of 78.43% is achieved with the second architecture based DCVSL gates. An increasing trend in the leakage power with the scaling down of the technology is observed in the proposed circuits. Lastly, the third architecture achieves the same leakage power reduction values as the second one but is not able to exhibit the same speed advantage as achieved with the first architecture.
Figure panel titles: Output of the three-input Dy-TG-DCVSL based XOR gate; output of the three-input TG-EDCVSL based XOR gate.

Table 5: Delay measurement for the PA-1 based and the PA-3 based EDCVSL XOR/XNOR gates.