Ultra-Low Leakage Arithmetic Circuits Using Symmetric and Asymmetric

We examine different configurations and circuit topologies for arithmetic components, such as adder and compressor circuits, using both symmetric and asymmetric work-function FinFETs. Based on extensive characterization data, for the carry generation of a mirror full adder using symmetric devices, leakage current and delay are decreased by 25% and 50%, respectively, compared to results in the literature. For the 14-transistor (14T) full adder topology, leakage and delay are decreased by 23% and 29%, respectively, compared to the mirror topology. The 14T adder topology, using asymmetric devices without any additional power supply, achieves a reduction in leakage current of 85% with a small degradation of 7% in delay. The compressor circuits, using asymmetric devices for one of the proposed configurations, achieve reductions in leakage current and delay of 86% and 4%, respectively. All simulations are based on a 25 nm FinFET technology using the University of Florida UFDG model.


Introduction
The demand for smaller and faster portable electronic equipment has forced semiconductor technology to sharply reduce the minimum feature size from the micro- to the nanometer regime [1]. Advances in process technology have paved the way to the realization of highly complex systems on a single device, targeting real-time high-speed applications such as wireless communication and computing. With the extremely high level of integration in the nanodomain of planar CMOS technology, the subthreshold leakage current is becoming a major concern for system designers because of the reduction in the threshold voltage of devices, as well as reductions in other device parameters needed to maintain device scaling rules. This situation becomes significant in sub-22 nm bulk CMOS technology because of its very poor control of the channel electrostatic potential, which leads to degraded short-channel behaviour and high leakage current [2]. FinFETs overcome these problems with stronger control of the channel potential by using two gates wrapped around the fin [2]. Several studies of FinFET logic circuits [2][3][4] have applied back gate biasing techniques to reduce leakage current. However, far fewer studies and analyses have been conducted on FinFET-based arithmetic functions. In [4], a full adder based on the mirror architecture, using double-gate FinFETs, was proposed and analyzed. The results from that work are used later in this paper as the basis for comparison with our proposed architectures.
The goal of this paper is to develop circuit topologies and configurations that lead to high-performance, low-leakage arithmetic components using symmetric four-terminal FinFETs. We have also developed a novel approach that applies back gate biasing techniques to asymmetric FinFETs without any extra power supply to achieve ultra-low leakage current while maintaining high performance. Device and circuit characterization was performed in a SPICE simulation environment using the University of Florida double-gate (UFDG) device models [5], with typical 25 nm FinFET parameters. The rest of this paper is organized as follows. A brief review of four-terminal FinFETs and the mechanisms used to control leakage current is presented in Section 2. In Section 3, different circuit topologies of symmetric and asymmetric full adder circuits are discussed. Various circuit topologies of 3 : 2 and 4 : 2 compressors utilizing symmetric and asymmetric devices are examined in Section 4. Section 5 concludes the paper.

Four-Terminal Devices and Leakage Current Control
Four-terminal FinFETs were extensively studied and analyzed in [4,7]. The front and back gates of the four-terminal FinFET (4T FinFET) can be connected in various configurations. One of these configurations shorts both gates (SG FinFET). Alternatively, a 4T FinFET can be considered as two parallel transistors, and the two gates can be driven independently, as shown in Figure 1. One gate, normally called the back gate, influences the vertical field of the other transistor in the channel area, hence altering its threshold voltage. It also affects the diffusion current in the subthreshold regime of operation, hence controlling the leakage current. In addition, the two parallel transistors in the 4T FinFET can be tied together to improve drivability, or they can form a single transistor with its gates driven independently. This is beneficial in reducing area and power dissipation in digital circuits [6]. For the device shown in Figure 1, the effective channel length and width are equal to  FIN and h_FIN, respectively. The device parameters used in this paper are listed in Table 1.

Impact of Back Gate Biasing on Subthreshold Leakage Current.
To demonstrate the effect of back gate biasing on the ON state current (I_ON) and on the OFF state current (I_OFF), simulations were conducted for five different back gate bias voltages V_BG for N and PFinFETs. The results for the two devices are shown in Tables 2 and 3, respectively. The first important point to note from the simulation results is that the NFinFET has a better driving capability than its P counterpart by a factor of 8 for back gate bias voltages of −0.2 V and 1.4 V for the N and PFinFETs, respectively. However, the leakage current of the P device is significantly lower than that of the N device for the same back gate bias voltages. Table 2 indicates that for the NFinFET I_OFF drops by a factor of 1.97 × 10^5 when V_BG varies from 0.4 V to −0.4 V. This drop is significantly higher than when V_BG is altered from [...] For the PFinFET, Table 3 indicates that I_OFF improves by a factor of 177 when V_BG varies from 0.8 V to 1.6 V, which is much higher than the 3.5 times reduction in I_OFF achieved when V_BG varies from 1.2 V to 1.6 V. However, due to high leakage, gate voltages less than the drain voltage of 1.2 V are not practical. On the other hand, the ON current drops by a factor of 3 when V_BG changes from 1.2 V to 1.6 V.
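The reduction factors quoted above can be put on a common footing by expressing them as the back gate voltage swing needed per tenfold drop in OFF current. The short Python sketch below does this arithmetic using only the factors and voltage ranges stated in the text; the function name is illustrative, and the result is an average over each swing, which is an assumption because the text indicates that the dependence is not uniform across the range.

```python
import math

def volts_per_decade(delta_v, reduction_factor):
    """Average back gate voltage swing (V) per tenfold reduction in OFF current."""
    return abs(delta_v) / math.log10(reduction_factor)

# Reduction factors and voltage ranges as quoted in the text (Tables 2 and 3).
n_sens = volts_per_decade(0.4 - (-0.4), 1.97e5)   # NFinFET, 0.4 V -> -0.4 V
p_sens_wide = volts_per_decade(1.6 - 0.8, 177)    # PFinFET, 0.8 V -> 1.6 V
p_sens_narrow = volts_per_decade(1.6 - 1.2, 3.5)  # PFinFET, 1.2 V -> 1.6 V

if __name__ == "__main__":
    print(f"NFinFET, full range:   {n_sens * 1e3:.0f} mV/decade")
    print(f"PFinFET, 0.8 to 1.6 V: {p_sens_wide * 1e3:.0f} mV/decade")
    print(f"PFinFET, 1.2 to 1.6 V: {p_sens_narrow * 1e3:.0f} mV/decade")
```

On these figures the NFinFET needs roughly 150 mV of back gate swing per decade of leakage reduction, against more than 700 mV per decade for the PFinFET in its practical range, which is consistent with the observation that back gate biasing pays off mainly on the N devices.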
The back gate biasing technique is more beneficial for NFinFETs because they dominate the total leakage current in FinFET-based digital circuits. PFinFETs, on the other hand, are better used in the SG configuration to achieve high driving capability and performance, since they have lower subthreshold leakage current than their N counterparts.

Impact of Asymmetric Work Functions on Subthreshold Leakage Current.
The work function difference between the gate and the channel of a FinFET dictates the threshold voltage of the device. It is a function of the gate material and the doping concentration [7]. For a double-gate FinFET, the use of an asymmetric work function has been explored as an effective means of controlling the leakage current [2]. Achieving this property is nontrivial and requires a very well-controlled fabrication process, which has a direct impact on cost. However, because of its effectiveness in controlling the leakage current, we decided to explore this avenue.
In order to demonstrate the effect of asymmetric work functions on leakage current, the N device was characterized by changing the work function value of the back gate in the model file in steps of 0.1 eV, as shown in Table 4. The range of variation was limited to 0.2 eV because of a model limitation.
In addition, we have characterized both N(P) FinFETs by increasing (decreasing) the work function of the back gate by 0.2 eV with respect to their symmetric values (0 N = 4.8 eV and 0 P = 4.4 eV). Figures 2 and 3 show the I_ON/I_OFF ratio as a function of V_BG for symmetric and asymmetric N and PFinFETs, respectively. The results show that the I_ON/I_OFF ratio for the NFinFET improves by a factor of 100 for V_BG between −0.4 V and −0.2 V compared to symmetric devices at the same back gate voltage. However, for V_BG greater than 0 V, this improvement drops to a factor of about 16. For the PFinFET, the I_ON/I_OFF ratio improves by a factor of 100 for V_BG between 0.8 V and 1.0 V. However, for V_BG greater than 1.2 V, this improvement drops to a factor of about 60. These device-level simulation results suggest the following design strategies for developing optimized circuit topologies.
(1) Back gate biasing is more beneficial for NFinFETs due to their leakage current dominance.
(2) The back gates of PFinFETs are short gated to their front gates to improve drivability with negligible impact on the circuit leakage current.
(3) Applying asymmetric devices is an alternative design strategy to achieve significantly lower leakage current and to avoid the use of an extra power supply for gate biasing.

Symmetric and Asymmetric FinFETs for Full Adder Circuit
In this section, we present four possible configurations of full adders: mirror, 14T, transmission gate based, and PTL based. The mirror architecture is treated in two subsections, the carry generation circuit and the complete full adder. The carry generation circuit is addressed separately so that it can be compared with similar work proposed in the literature. This is followed by a comparison of the circuit metrics extracted from simulation for all topologies, with emphasis on leakage current. Finally, we discuss the impact of process variations on the extracted metrics.

Carry Generation Circuit in Mirror Full Adder.
In [4], the authors presented a circuit topology for the FinFET-based carry generation unit. In this section, we propose an improved version using symmetric and asymmetric work function FinFETs. We then proceed to present an optimized structure for the full adder. The left part of Figure 4 shows the proposed mirror carry circuit. This circuit makes use of both the stacking effect and back gate biasing techniques to reduce subthreshold leakage current. The modifications to the topology described in [4] include reducing the number of devices by merging parallel transistors into single transistors in the pull-up and pull-down networks. This also improves the stacking effect for these branches. The back gates of the other, nonparallel transistors in the pull-down network are biased to V_bbn = −0.2 V, while the transistors in the pull-up network are short gated. The other modification improves circuit drivability by not using the stacked topology for the output driver proposed in [4].
The circuit, with transistor widths of W_N = W_P = 25 nm, was simulated for a fan-out of four (FO4) using the UFDG device model. One fan-out is represented by an inverter with short-gated transistors; that is, the front and back gates are tied together. Following the methodology of [4], the delay is the worst-case value, while the leakage current I_OFF (and hence the static power dissipation) is averaged over all possible input combinations applied to the carry logic section of the mirror full adder, including the leakage current of the input and output inverters. The voltage V_bbn is defined as the back gate voltage of the transistors in the pull-down network. The configuration used in [4] was modified, as shown in Figure 4, in two ways. First, the output inverter is changed so that no stacking is applied at the output, which reduces the number of transistors and improves driving capability. Second, parallel transistors in the pull-up and pull-down networks are merged into single transistors, which reduces the device count and improves the stacking effect for these branches. The front and back gates of the other, nonparallel transistors in the pull-up network are shorted together, while the back gates of those in the pull-down network are biased to V_bbn = −0.2 V. We call this structure the IG/LP mode. The convention follows [3]: IG (independent gate) means merging each pair of parallel transistors into a single transistor whose gates are driven by independent signals, and LP (low power) means tying the back gate of a transistor to a static supply voltage. Our simulation results show that this configuration decreases leakage by 25% and improves speed of operation by 50% compared to the best configuration using mixed-terminal (MT) FinFETs found in [4], based on our simulation methodology and device model, as shown in Table 5. Our approach also reduces the number of transistors from 14 to 10, which saves area and power dissipation.
Our work also covers the asymmetric IG/LP carry logic circuit with V_bbn = 0 V. The results are shown in Table 5. This technique reduces the subthreshold leakage current by 83% with a delay penalty of 12% compared to the symmetric circuit with V_bbn = −0.2 V. Using V_bbn = 0 V has the advantage of not requiring an additional power supply, which reduces area, power dissipation, and cost.

Complete Mirror Full Adder.
In this section, we present characterization data for the complete full adder circuit shown in Figure 4, using the same scenarios as for the carry circuit. The results in Table 6 show that applying asymmetric techniques with V_bbn = 0 V reduces the subthreshold leakage current by a factor of 8 with a delay penalty of 6% compared to the symmetric circuit with V_bbn = −0.2 V.
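For readers unfamiliar with the mirror topology, the logic it realizes can be checked exhaustively over the same eight input combinations that the leakage figures are averaged over. The sketch below uses the textbook mirror-adder equations, in which the inverted carry is generated first and then reused to form the inverted sum; this is an assumption about the logic function only and says nothing about the transistor-level details of Figure 4.

```python
from itertools import product

def mirror_full_adder(a, b, cin):
    """Textbook mirror-adder logic on 0/1 inputs; returns (sum, carry_out)."""
    # First stage: inverted carry, NOT(a*b + cin*(a + b)).
    carry_bar = 1 - ((a & b) | (cin & (a | b)))
    # Second stage reuses carry_bar: NOT(a*b*cin + carry_bar*(a + b + cin)).
    sum_bar = 1 - ((a & b & cin) | (carry_bar & (a | b | cin)))
    # Output inverters restore the true polarity of both signals.
    return 1 - sum_bar, 1 - carry_bar

def check_all_inputs():
    """Verify a + b + cin == sum + 2*carry for every input combination."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = mirror_full_adder(a, b, cin)
        assert a + b + cin == s + 2 * cout
    return True

if __name__ == "__main__":
    print("mirror adder logic verified:", check_all_inputs())
```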

Low Transistor Count Full Adder (14T Full Adder).
A low transistor count adder was proposed for traditional CMOS technology in [8]. In this paper, this topology is implemented using the optimal configurations of four-terminal FinFET devices. The 14T adder shown in Figure 5 is based on pass transistors with swing-restoring pull-up and pull-down transistors. The feedback PFinFET and NFinFET provide full-swing voltages, eliminating the low and high voltage degradation through the pass transistors.
Simulations were conducted for both symmetric and asymmetric FinFETs based on the optimal configuration of the 14T full adder circuit. The results in Table 7 indicate that with asymmetric devices the circuit achieves a significant reduction in leakage current, by a factor of 7, with a delay penalty of 16% compared to the symmetric ones.
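The text reports some leakage savings as percentages (for example, 83% for the carry circuit) and others as reduction factors (for example, a factor of 7 for the 14T adder). The helper functions below, whose names are illustrative, convert between the two forms so that the results can be compared directly; they only restate the quoted numbers and add no new data.

```python
def factor_to_percent_reduction(factor):
    """A reduction 'by a factor of k' leaves 1/k of the original value."""
    return (1.0 - 1.0 / factor) * 100.0

def percent_reduction_to_factor(percent):
    """A reduction of p percent leaves (1 - p/100) of the original value."""
    return 1.0 / (1.0 - percent / 100.0)

if __name__ == "__main__":
    # Reduction factors quoted in the text for the asymmetric adder circuits.
    for name, factor in [("mirror full adder", 8), ("14T full adder", 7)]:
        print(f"{name}: factor {factor} = "
              f"{factor_to_percent_reduction(factor):.1f}% reduction")
    # Percentage quoted in the text for the asymmetric carry circuit.
    print(f"carry circuit: 83% reduction = factor "
          f"{percent_reduction_to_factor(83):.1f}")
```

A factor of 7 corresponds to a reduction of about 86%, which agrees with the roughly 85% saving quoted for the asymmetric 14T adder in the abstract.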

Transmission Gates Full Adder (TG).
An optimal configuration of the full adder based on transmission gates is proposed, as shown in Figure 6. This configuration implements the following set of equations: [equations not recoverable from the source]. Results shown in Table 8 indicate that using asymmetric work functions reduces leakage current by a factor of 6 with a 7% improvement in delay compared to the symmetric counterpart.
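A transmission gate adder is commonly described in terms of a propagate signal that steers two multiplexers. The sketch below uses that textbook formulation, which is an assumption and may differ in form from the equations implemented in Figure 6; the function names are illustrative. It checks the formulation against ordinary binary addition for all input combinations.

```python
from itertools import product

def mux(sel, when0, when1):
    """2:1 multiplexer, the function a pair of transmission gates realizes."""
    return when1 if sel else when0

def tg_full_adder(a, b, cin):
    """Multiplexer-style full adder on 0/1 inputs; returns (sum, carry_out)."""
    p = a ^ b                  # propagate signal
    s = mux(p, cin, 1 - cin)   # sum = p XOR cin
    cout = mux(p, a, cin)      # carry = a when a == b, otherwise cin
    return s, cout

def check_all_inputs():
    """Verify a + b + cin == sum + 2*carry for every input combination."""
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = tg_full_adder(a, b, cin)
        assert a + b + cin == s + 2 * cout
    return True

if __name__ == "__main__":
    print("TG-style adder logic verified:", check_all_inputs())
```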

Pass Transistor Logic Full Adder (PTL).
The optimal configuration of a pass transistor logic (PTL) full adder cell based on four-terminal FinFETs is shown in Figure 7. This approach uses a single NFinFET tree for each output. The adder implements the following equations: [equations not recoverable from the source]. The main drawback of this topology is signal degradation at the output. To fix this problem, the threshold voltage of the PFinFET of the inverter is increased by using a back gate bias voltage of V_bbp = 1.4 V. Simulations were conducted for symmetric and asymmetric FinFETs based on the pass transistor logic full adder cell. The results in Table 9 illustrate that using asymmetric devices reduces leakage current by a factor of 17 with a delay penalty of 17% compared to the symmetric case.

Comparison of Full Adder Topologies.
In Table 10, the various full adder topologies are compared in terms of leakage current, delay, dynamic energy, and number of transistors. The data show that the 14T topology outperforms the other full adder topologies in all aspects, except that the PTL topology is 6% faster. The subthreshold leakage current of the 14T topology is lower by 23%, 49%, and 80% compared to the mirror, TG, and PTL topologies, respectively. In addition, the 14T topology uses 42%, 46%, and 30% fewer transistors than the mirror, TG, and PTL topologies, respectively. It is also the best performer in terms of dynamic energy.

Impact of Variations of Fin Geometrical Parameters and Supply Voltage.
In this section, we demonstrate the impact of variations in the fin geometrical parameters, namely the fin height (h_fin) and the fin thickness (t_si), on the leakage current and performance of the 14T full adder circuit. We have also examined the impact of supply voltage variations on the same metrics. The fin geometrical variations were selected to be ±5% and ±10% with respect to their nominal values. The simulation results in Table 11 reveal that increasing/decreasing h_fin by 5% and 10% with respect to its typical value (h_fin = 25 nm) increases/decreases the leakage current by 5% and 9%, respectively, compared to its nominal value. The variations in delay, on the other hand, are significantly smaller than the variations in leakage. Increasing/decreasing t_si by 5% and 10% with respect to its nominal value (t_si = 14 nm) increases/decreases the leakage current by 52% and 77%, respectively, compared to its nominal value, while the variation in delay is the same as in the case of h_fin. These results indicate that the impact of fin thickness variation on subthreshold leakage current is significantly higher than that of fin height variation.
We also swept the supply voltage V_DD from 0.8 V to 1.4 V to examine the impact of voltage variations on the leakage current and delay. From the simulation results presented in Figures 8 and 9, we found that the trend is similar to that of traditional planar CMOS technology. To quantify the variations for our targeted FinFET process, changing the supply voltage to 0.8 V and to 1.4 V changes the leakage current by −55% and +31%, respectively, from the nominal value. The same changes in voltage change the delay by +77% and −15% relative to the nominal value.
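The contrast between the two geometric parameters is easier to see as a normalized sensitivity, that is, the percentage change in leakage per one percent change in the parameter. The sketch below computes it from the percentages quoted in the text for Table 11; the dictionary layout and function names are illustrative, and no values beyond those quoted are used.

```python
# Percentage changes quoted in the text for the 14T full adder (Table 11):
# {parameter: {parameter change in %: leakage change in %}}
QUOTED_LEAKAGE_CHANGE = {
    "h_fin": {5: 5, 10: 9},
    "t_si": {5: 52, 10: 77},
}

def normalized_sensitivity(parameter_change_pct, leakage_change_pct):
    """Percent change in leakage per percent change in the parameter."""
    return leakage_change_pct / parameter_change_pct

def sensitivity_table(quoted=QUOTED_LEAKAGE_CHANGE):
    """Normalized sensitivity for every quoted parameter and variation step."""
    return {
        name: {step: normalized_sensitivity(step, change)
               for step, change in rows.items()}
        for name, rows in quoted.items()
    }

if __name__ == "__main__":
    for name, rows in sensitivity_table().items():
        for step, value in rows.items():
            print(f"{name}, {step}% variation: "
                  f"{value:.1f}% leakage change per 1% parameter change")
```

On the quoted figures, leakage is roughly 8 to 10 times more sensitive to fin thickness than to fin height, which suggests that fin thickness control is the more critical process parameter for leakage.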

Compressor Circuits
The multioperand addition involved in the summation of partial products is one of the most expensive operations in a multiplier [9]. Hence, reducing the number of these products before the final summation is critical to achieving high computing speed. This can be achieved by cascading carry ripple adders, but this approach is not efficient in terms of area and performance. A better alternative is to reduce the partial products by using circuits known as compressors [10].
The 3 : 2 Compressor Circuits.
A 3 : 2 compressor takes three inputs and generates two outputs, the sum bit and the carry bit, as shown in Figure 10. The compressor is governed by the following equation [9]: [equation not recoverable from the source]. The 3 : 2 compressor can also be employed as a full adder cell when the third input is taken as the carry input from the previous compressor block.
The logic decomposition of the 3 : 2 compressor shown in Figure 11 employs two XOR blocks in the critical path. The equations supporting this architecture are given in [11]: [equations not recoverable from the source]. It can be seen that the sum output is in the critical path, which has an overall delay equal to the sum of the delays of two XOR blocks.
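The XOR and MUX decomposition described above can be modelled in a few lines. The sketch below assumes the usual form of this decomposition, with two cascaded XORs for the sum and one multiplexer for the carry; the input names x1, x2, and x3 are illustrative and are not taken from the original equations. It checks that the weighted outputs always equal the number of ones at the inputs.

```python
from itertools import product

def compressor_3_2(x1, x2, x3):
    """3:2 compressor built from two XORs and one MUX; returns (sum, carry)."""
    p = x1 ^ x2                # first XOR block
    s = p ^ x3                 # second XOR block, on the critical path
    carry = x3 if p else x1    # MUX selected by the first XOR output
    return s, carry

def check_all_inputs():
    """Verify x1 + x2 + x3 == sum + 2*carry for every input combination."""
    for x1, x2, x3 in product((0, 1), repeat=3):
        s, carry = compressor_3_2(x1, x2, x3)
        assert x1 + x2 + x3 == s + 2 * carry
    return True

if __name__ == "__main__":
    print("3:2 compressor logic verified:", check_all_inputs())
```

In this form the carry output settles after one XOR and one MUX delay, whereas the sum must wait for both XOR blocks, which is why the sum output sets the critical path.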

MUX and XOR Cells Used in Compressors.
Three different topologies of XOR and MUX blocks are chosen from the CMOS literature. In this paper, they are implemented with symmetric and asymmetric 4T FinFETs, as described in the following scenarios.
Scenario 2. The SG/LP transmission gate-based MUX block is still used. However, since the critical path, which dictates the overall delay, consists only of XOR blocks, and since the overall area of the compressor is dominated by the XOR blocks, it is more beneficial to use XOR configurations that are faster and smaller than the skew-free topology. Hence, in this scenario, the SG/LP transmission gate XOR shown in Figure 14 replaces its skew-free counterpart.
Scenario 3. To achieve further improvement in speed and area over the XOR topology used in Scenario 2, the pass transistor-based XOR cell [11] shown in Figure 15 replaces the TG topology. The MUX cell keeps the same topology as in Scenarios 1 and 2.
Simulations.
Simulations were conducted for the 3 : 2 FinFET compressor based on symmetric 4T FinFETs with a bias voltage of V_bbn = −0.2 V for all three scenarios. The results listed in Table 12 show that Scenario 1 has 58% and 44% lower leakage current than Scenarios 2 and 3, respectively. In terms of delay, however, Scenario 1 is 14% and 26% slower than Scenarios 2 and 3, respectively. Concerning overall area, Scenario 1 uses 27% and 40% more transistors than Scenarios 2 and 3, respectively.
As discussed earlier, asymmetric devices achieved a significant reduction in leakage current with a small delay penalty for most configurations, except for a few where a slight improvement in delay was obtained. In addition, these devices do not require additional power supplies for back gate biasing. Simulations were therefore also conducted with asymmetric devices for the same scenarios. The results in Table 13 lead to the same conclusions as for symmetric devices. Moreover, asymmetric devices in Scenario 1 reduced the subthreshold leakage current by a factor of 7 at the same delay compared to the symmetric ones.

The 4 : 2 Compressor Circuit.
A 4 : 2 compressor has five inputs and three outputs, as shown in Figure 16. The four primary inputs and the sum output have the same weight, and the carry output is weighted one binary bit order higher. The 4 : 2 compressor receives a carry input C_in from the previous module, which is one binary bit order lower in significance, and produces a carry output C_out to the next compressor module [10].
The 4 : 2 compressor is governed by the following equation [11]: [equation not recoverable from the source]. The traditional implementation of a 4 : 2 compressor is composed of two serially connected full adders, as shown in Figure 17. When the individual full adders are divided into their building blocks, the overall delay is equal to four times the delay of an XOR block [11]. The logic decomposition of the 4 : 2 compressor shown in Figure 18, on the other hand, is faster, since the overall delay is reduced to the delay of three XOR blocks [9].
The following equations govern the outputs of this structure: [equations not recoverable from the source].
Simulations were conducted for the 4 : 2 FinFET compressor based on symmetric 4T FinFETs with a bias voltage of V_bbn = −0.2 V for the three scenarios introduced in the last section. The results listed in Table 14 show that, for the 4 : 2 FinFET compressor, Scenario 3 has superior performance in leakage current and delay. It has 2% and 36% less leakage current than Scenarios 1 and 2, respectively. Its critical path delay is 32% and 11% lower than that of Scenarios 1 and 2, respectively. In terms of transistor count, Scenario 3 saves 40% and 18% compared to Scenarios 1 and 2, respectively.
Simulations were also conducted for asymmetric devices with a bias voltage of V_bbn = 0 V for the same scenarios. The results in Table 15 indicate that Scenario 3 still has superior performance for all metrics with asymmetric devices. Moreover, asymmetric devices in Scenario 3 reduced the subthreshold leakage current by a factor of 6, with a negligible reduction in speed of 4%, compared to the symmetric ones in the same scenario.
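The two 4 : 2 compressor organizations discussed in this section, the cascade of two full adders and the XOR and MUX decomposition, can be compared with a small behavioural model. The sketch below assumes the usual textbook form of both structures; the signal names (x1 to x4, cin, carry, cout) are illustrative and are not taken from the original equations. It checks that the two versions agree on every input and that the outputs carry the expected weights.

```python
from itertools import product

def full_adder(a, b, c):
    """Plain full adder on 0/1 inputs; returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))

def compressor_4_2_cascaded(x1, x2, x3, x4, cin):
    """Two serially connected full adders: four XOR delays to the sum."""
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout

def compressor_4_2_decomposed(x1, x2, x3, x4, cin):
    """XOR/MUX decomposition: three XOR delays to the sum."""
    p12 = x1 ^ x2              # XOR level 1
    p34 = x3 ^ x4              # XOR level 1, in parallel with p12
    p = p12 ^ p34              # XOR level 2
    s = p ^ cin                # XOR level 3
    cout = x3 if p12 else x1   # MUX; does not depend on cin
    carry = cin if p else x4   # MUX
    return s, carry, cout

def check_all_inputs():
    """Both versions must agree and preserve the weighted input count."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2_decomposed(*bits)
        assert (s, carry, cout) == compressor_4_2_cascaded(*bits)
        assert sum(bits) == s + 2 * (carry + cout)
    return True

if __name__ == "__main__":
    print("4:2 compressor models agree:", check_all_inputs())
```

Because cout is formed without cin in both versions, a row of such compressors has no carry ripple along the row, and pairing the inputs in the first XOR level is what shortens the sum path from four XOR delays to three.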

Conclusion
In this paper, four-terminal FinFETs have been extensively analyzed with the goal of reducing subthreshold leakage current. We applied both back gate biasing and asymmetric work functions, which are two effective methods for achieving ultra-low subthreshold leakage current in FinFETs. We used these techniques to design optimized circuits for arithmetic components, namely full adder and compressor circuits, in different configurations. Our simulation results show that by applying asymmetric work functions, the subthreshold leakage current can be reduced significantly with a low delay penalty, and the use of an additional power supply can be avoided. However, one must also consider that asymmetric circuits are more costly to fabricate, since careful adjustment of the doping profiles is required for both sides of the same FinFET.

Figure 4: (a) The IG/LP mode of the symmetric mirror full adder, (b) the IG/LP mode of the asymmetric mirror full adder.

Figure 5: (a) Optimal configuration of the symmetric 14T full adder circuit, (b) optimal configuration of the asymmetric 14T full adder circuit.

Figure 6: (a) Optimal configuration for symmetric TG full adder, (b) optimal configuration for asymmetric TG full adder.

Figure 8: Impact of voltage variation on leakage current for the 14T full adder.

Figure 9: Impact of voltage variation on delay for the 14T full adder.

Table 4: Impact of asymmetric work functions on leakage current for the NFinFET.

I_ON/I_OFF ratio for the 4T NFinFET.

Table 5: Results for the carry circuit.

Table 6: Results for the complete mirror full adder.

Table 7: Results for the 14T full adder.

Table 8: Results for the TG full adder.

Table 9: Results for the PTL full adder.

Table 10: Comparison of full adder circuits.

Table 11: Impact of the fin height and fin thickness on delay and leakage current in the 14T full adder.