Near-Threshold Computing and Minimum Supply Voltage of Single-Rail MCML Circuits



Introduction
High-speed circuits are now required in a wide range of applications, such as high-speed processors and Gbps multiplexers for optical transceivers [1,2]. In MOS current mode logic (MCML) techniques, the output swing of the circuits is much smaller than that of conventional CMOS circuits, and thus circuits realized with MCML techniques can operate at high speed [3,4].
As CMOS process technology scales, the demand for more processing results in large power dissipation. It can be shown that the power dissipation of integrated circuits will increase over time unless significant changes are made to circuit architectures. Scaling the supply voltage is an efficient technique for achieving a low power-delay product (PDP) [5]. The power dissipation of MCML cells is proportional to the product of their supply voltage and biasing current, and because the biasing current is constant, it is independent of the operating frequency. Therefore, the supply voltage of MCML circuits should be reduced as much as possible [6]. However, almost all current MCML circuits are realized with a dual-rail scheme [7][8][9][10]. The NMOS series configuration in dual-rail logic circuits limits their minimum supply voltage. Moreover, dual-rail logic circuits increase transistor counts, resulting in extra area overhead [11]. A single-rail structure of MCML circuits has been reported [11]. Single-rail logic circuits reduce the area overhead, and thus low delay can be expected. Moreover, the logic evaluation tree of single-rail MCML (SRMCML) cells such as AND and OR gates can be realized using only MOS transistors in parallel. This can further reduce power dissipation because of their low source voltage [12].
In this paper, an analysis model for calculating the minimum supply voltage of SRMCML circuits is addressed, so that the minimum supply voltage of SRMCML circuits can be estimated from the model parameters of the MOS transistors. A dynamic flip-flop based on SRMCML is also proposed. A performance optimization algorithm for near-threshold sequential circuits is presented to improve the speed of SRMCML circuits. An SRMCML mode-10 counter is verified. This paper is organized as follows. In Section 2, MCML circuits with dual-rail and single-rail structures are described, and the design methods of the basic single-rail MCML combinational logic cells are presented. In Section 3, the analysis model for calculating the minimum supply voltage of MCML circuits is addressed, and the relationship between the minimum supply voltage and the model parameters of MOS transistors is derived. In Section 4, a dynamic flip-flop based on SRMCML is proposed, and its power dissipation, delay, and power-delay product are given. The performance metrics of SRMCML circuits and the performance optimization algorithm are addressed in Sections 5 and 6, respectively. In Section 7, a near-threshold SRMCML mode-10 counter is introduced, and the power dissipation, delay, and power-delay product of the mode-10 counter using the performance optimization algorithm are compared with those of the basic SRMCML one. Finally, the work of this paper is summarized in the last section.

SRMCML Circuits
The basic dual-rail MOS current mode logic (DRMCML) buffer/inverter with its biasing circuit is shown in Figure 1. It is composed of three main parts: the PMOS transistors $P_1$ and $P_2$ that are used as load resistors, the evaluation tree with a fully differential pull-down switch network consisting of $N_1$ and $N_2$, and the biasing current source transistor $N_s$. The load transistors are controlled by the voltage $V_{rfp}$ [3]. The NMOS transistor $N_s$ provides the biasing current, which is mirrored from the current source in the bias circuit. In the DRMCML, the bias circuit generates two signals, $V_{rfp}$ and $V_{rfn}$, to ensure the proper output voltage swing and biasing current.
In DRMCML circuits, the pull-down network switches the biasing current between two branches, and the loads (PMOS transistors) then convert the constant current to output voltage swings. The high and low digital logic levels are $V_{OH} = V_{DD}$ and $V_{OL} = V_{DD} - I_B R_P$, respectively, where $I_B$ is the biasing current and $R_P$ is the PMOS load resistance. The logic swing is
$$\Delta V = V_{OH} - V_{OL} = I_B R_P$$
DRMCML is a differential logic with dual-terminal inputs and dual-terminal outputs. The two-input and three-input AND/NAND and OR/NOR gates based on DRMCML are shown in Figure 3.
In almost all designs, the logic swing of DRMCML circuits is taken as $\Delta V < V_{th}$, so that the NMOS transistor $N_1$ can operate in the saturation region. For two-level DRMCML circuits, as shown in Figures 3(a) and 3(b), the logic transistor $N_1$ operates in the saturation region, while the logic transistor $N_2$ operates in the linear region. Therefore, the minimum supply voltage of two-level DRMCML circuits can be written as
$$V_{DD,min,two\text{-}level} = V_{1,gs} + V_{2,ds} + V_{s,sat} \tag{1}$$
where $V_{1,gs}$ is the gate-source voltage of the transistor $N_1$ in the saturation region, $V_{2,ds}$ is the drain-source voltage of the transistor $N_2$ in the linear region, and $V_{s,sat}$ is the drain-source voltage of the transistor $N_s$ at the velocity saturation point. For three-level DRMCML circuits, as shown in Figures 3(c) and 3(d), the transistor $N_1$ operates in the saturation region, while the transistors $N_2$ and $N_3$ operate in the linear region. Therefore, the minimum operating supply voltage of three-level DRMCML circuits can be written as
$$V_{DD,min,three\text{-}level} = V_{1,gs} + V_{2,ds} + V_{3,ds} + V_{s,sat} \tag{2}$$
where $V_{1,gs}$ is the gate-source voltage of the transistor $N_1$ in the saturation region, $V_{2,ds}$ and $V_{3,ds}$ are the drain-source voltages of the transistors $N_2$ and $N_3$ in the linear region, and $V_{s,sat}$ is the drain-source voltage of the transistor $N_s$ at the velocity saturation point.
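The stacking rule behind the two-level and three-level expressions can be illustrated with a minimal Python sketch. The function name and the operating-point voltages below are hypothetical and serve only to show that each additional series level adds one linear-region drain-source drop to the minimum supply voltage.

```python
def vdd_min_drmcml(v1_gs, v_ds_lin, v_s_sat):
    """Minimum supply voltage of a DRMCML stack, in volts.

    v1_gs    : gate-source voltage of the top logic transistor N1 (saturation)
    v_ds_lin : drain-source voltages of the lower logic transistors (linear region)
    v_s_sat  : drain-source voltage of the current source Ns at velocity saturation
    """
    return v1_gs + sum(v_ds_lin) + v_s_sat


# Hypothetical voltages, chosen only to illustrate the trend (not from the paper).
two_level = vdd_min_drmcml(0.40, [0.10], 0.15)
three_level = vdd_min_drmcml(0.40, [0.10, 0.10], 0.15)
print(f"two-level: {two_level:.2f} V, three-level: {three_level:.2f} V")
```

With these assumed values, the third level raises the minimum supply voltage by exactly one extra drain-source drop, which is the limitation the series configuration imposes.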
Obviously, the NMOS series configuration of the logic tree in dual-rail MCML circuits limits the reduction of the minimum supply voltage. The dual-rail structure also adds area overhead, because a complex evaluation tree must be used. Moreover, the dual-rail structure increases the complexity of layout placement and routing.
A solution to the above problems is to realize MCML circuits with a single-rail structure, as shown in Figure 2 [12]. SRMCML circuits use only an NMOS pull-down network to perform the required logic operation. The output OUTb of the SRMCML circuit is fed back to the gate of the NMOS transistor $N_2$, which is different from DRMCML circuits. The basic SRMCML gates, such as the buffer/inverter, two-input XOR/XNOR, and two-input and three-input OR/NOR and AND/NAND, are shown in Figure 4.
Similar to DRMCML, the evaluation of SRMCML circuits is performed in the current domain. The pull-down network switches the biasing current between two branches, and the loads (PMOS transistors) then convert the constant current to output voltage swings.
As shown in Figure 2, the structure of single-rail MCML circuits is simpler than that of dual-rail ones, because only a single pull-down network is required. Therefore, single-rail logic circuits reduce area overhead. Moreover, from Figure 4, the multi-input OR/NOR and AND/NAND cells based on SRMCML can be realized using only MOS transistors in parallel, and thus avoid the series configuration of the logic evaluation block in the multi-input DRMCML OR/NOR and AND/NAND cells. This structure can reduce the power consumption of MCML circuits because of the low source voltage.

Minimum Supply Voltage of SRMCML Circuits
Scaling down the supply voltage of SRMCML circuits can effectively reduce their power consumption, because their power dissipation is directly proportional to the supply voltage. However, the supply voltage of SRMCML circuits has a lower limit, at which the biasing source transistor must still operate in the velocity saturation region and the NMOS transistors of the pull-down network must still be turned on. If the relationship between the minimum supply voltage and the model parameters of the MOS transistors is derived, the minimum supply voltage of SRMCML circuits can be estimated before circuit design. As shown in Figure 4, almost all basic SRMCML gates use a single-level configuration, except for the two-input XOR/XNOR. For the two-level SRMCML circuits shown in Figure 5, the minimum supply voltage can be expressed as (1), which is the same as for two-level DRMCML circuits.
When the NMOS transistor operates at the velocity saturation point, its drain current $I_{d,sat}$ and drain-source voltage $V_{ds,sat}$ can be expressed in terms of the gate-source voltage $V_{gs}$ according to the BSIM3 MOSFET model, as in (3), where $W_{eff}$ is the effective channel width of the MOS device, $C_{ox}$ is the gate capacitance per unit area, $v_{sat}$ is the carrier saturation velocity of the MOS device, $E_{sat}$ is the critical electric field, and $L_{eff}$ is the effective channel length of the MOS device. $A_{bulk}$ accounts for the bulk charge effect and can be estimated from the simulation model card. If the channel length is small, $A_{bulk}$ is about unity, and it rises as the channel length increases. From (3), eliminating $V_{gs}$ gives $V_{ds,sat}$ in terms of $I_{d,sat}$, which is (4). Since a part of (4) satisfies the condition (5), (4) can be simplified to (6). Again from (3), eliminating $V_{ds,sat}$ gives $V_{gs}$ in terms of $I_{d,sat}$, which is (7), and (7) can be simplified to (8). For hand calculation, (6) and (8) are simpler and more convenient than (4) and (7).
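The display equations of this derivation did not survive in this copy of the text. As a hedged sketch, the elimination can be retraced from the standard intrinsic BSIM3 saturation expressions (source/drain resistance neglected), which use exactly the parameters listed above. The shorthands $K$ and $V_{gt}$ are introduced here only for compactness and are not the authors' notation. The authors' simplified forms (6) and (8), and the condition (5) they rely on, cannot be recovered from the text and are not reproduced.

```latex
% Assumed starting point: standard intrinsic BSIM3 saturation expressions,
% with the shorthands K = W_eff C_ox v_sat and V_gt = V_gs - V_th.
\begin{align*}
I_{d,sat} &= K\,\bigl(V_{gt} - A_{bulk} V_{ds,sat}\bigr), \qquad
V_{ds,sat} = \frac{E_{sat} L_{eff}\, V_{gt}}{A_{bulk} E_{sat} L_{eff} + V_{gt}} \\[4pt]
% Substitute V_gt = I_{d,sat}/K + A_bulk V_{ds,sat} into the second expression:
0 &= A_{bulk} V_{ds,sat}^{2} + \frac{I_{d,sat}}{K}\, V_{ds,sat}
     - E_{sat} L_{eff}\, \frac{I_{d,sat}}{K} \\[4pt]
% Positive root: V_{ds,sat} as a function of I_{d,sat}
V_{ds,sat} &= \frac{1}{2 A_{bulk}} \left[ -\frac{I_{d,sat}}{K}
     + \sqrt{\left(\frac{I_{d,sat}}{K}\right)^{2}
     + 4 A_{bulk} E_{sat} L_{eff}\, \frac{I_{d,sat}}{K}} \;\right] \\[4pt]
% Back-substitute to obtain V_gs as a function of I_{d,sat}
V_{gs} &= V_{th} + \frac{I_{d,sat}}{2K}
     + \frac{1}{2} \sqrt{\left(\frac{I_{d,sat}}{K}\right)^{2}
     + 4 A_{bulk} E_{sat} L_{eff}\, \frac{I_{d,sat}}{K}}
\end{align*}
```

The last two lines play the role of the unsimplified expressions for $V_{ds,sat}$ and $V_{gs}$ in terms of $I_{d,sat}$; any hand-calculation simplification would drop or approximate terms under the square root.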
According to the BSIM3 MOSFET model, when the NMOS transistor operates in the linear region, its drain current $I_{d,lin}$ is expressed as (9), where $\mu_{eff}$ is the effective mobility and $V_{ds,lin}$ is the drain-source voltage. For convenience of hand calculation, a part of the original equation has been omitted, within an acceptable error range. From (9), $V_{ds,lin}$ can be expressed in terms of $I_{d,lin}$, which gives (10). Substituting (10) into (1), applying the parameters of the corresponding transistors in Figure 5, replacing $V_{2,gs}$ with $V_{DD,min} - V_{s,sat}$, and rearranging, we arrive at the final equation (11) for the minimum supply voltage of the two-level SRMCML logic circuits. When $V_{s,sat}$ and $V_{1,gs}$ in (11) are estimated using (6) and (8), $I_{d,sat}$ should be replaced by the constant bias current $I_B$, and the other model parameters should be substituted with the BSIM3 MOSFET model parameters of the corresponding transistors.

Figure 3: Basic gates based on DRMCML.

According to (11), the minimum supply voltage can be estimated. When the corresponding model parameters in (11) are substituted with actual values from the model card, the relationship between the minimum supply voltage $V_{DD,min,two\text{-}level}$ and the bias current $I_B$ can be obtained, as shown in Figure 6.
If SRMCML circuits operate at low speed, only a small $I_B$ is required. From Figure 6, a smaller $I_B$ allows a lower supply voltage, so more power can be saved in low-speed applications.

Dynamic Flip-Flop Based on SRMCML
A common approach for realizing a D flip-flop (DFF) is to use a master-slave configuration. The DFF can be realized by cascading a negative latch (master stage) with a positive one (slave stage). The structure of the DFF based on SRMCML is shown in Figure 7; it is a dynamic positive edge-triggered flip-flop based on the master-slave configuration.
As shown in Figure 7, when Clk = 0, the input data is sampled on node A for storage. During this period, the slave stage of the SRMCML flip-flop is in hold mode, while node B of the slave stage is in a high-impedance state. On the rising edge of the clock, the transmission gate ($T_2$) of the slave stage is turned on, so that the value of node A, which was sampled right before the rising edge, propagates to the output Q. Node B stores the value of node A.
This implementation of an edge-triggered flip-flop is very efficient because it requires very few transistors. The reduced transistor count is very attractive for low-power and high-speed digital applications. In order to investigate the performance of the SRMCML dynamic DFF, it has been simulated using HSPICE in a 130 nm CMOS process. The power dissipation and delay of the SRMCML dynamic DFF are shown in Figure 8.
For reference, the power dissipation and delay of the basic SRMCML gate cells are also shown in Figure 8. In these simulations, the PMOS load transistors and the biasing current source NMOS transistor in the SRMCML circuits are sized with W/L = 8λ/10λ and 16λ/4λ, respectively, with λ = 65 nm. The NMOS transistors of the differential pair are sized with W/L = 4λ/2λ. The threshold voltage $V_{th}$ of the NMOS transistors is 0.282 V. The bias current of all the circuits is 8 μA. Since the power dissipation of MCML circuits is almost independent of their frequency, the operating frequency is taken as 1 GHz.
From Figure 8, the power dissipation and delay of all the basic SRMCML gate cells are almost the same. The power dissipation and delay of the proposed SRMCML dynamic DFF are only slightly larger than those of the basic SRMCML gate cells.
In order to investigate the performance of the SRMCML dynamic DFF in the near-threshold region, it has been simulated using HSPICE in the 130 nm CMOS process with the supply voltage varied from 0.7 V to 1.3 V in 0.1 V steps. The power-delay products of the SRMCML dynamic DFF are shown in Figure 9. From Figure 9, the power-delay product of the SRMCML dynamic DFF can be effectively reduced by lowering its supply voltage.

Performance Parameters of SRMCML Circuits
This section focuses on the performance of SRMCML gates as a function of their design parameters. In order to optimize the performance of SRMCML gates, performance metrics must first be defined. These metrics consist of hard constraints and optimization goals. The hard constraints, including gain, voltage swing ratio (VSR), and signal slope ratio (SSR), must not be violated. The optimization goals, including power dissipation and power-delay product, should be minimized. The performance parameters for a typical SRMCML circuit are voltage gain $A_V$, voltage swing ratio (VSR), signal slope ratio (SSR), current matching ratio (CMR), noise margin (NM), voltage swing ($\Delta V$), power dissipation $P$, delay time $T_d$, and power-delay product (PDP) [9,10,13].

Voltage Gain ($A_V$).
The voltage gain $A_V$ is defined as the maximum voltage gain of the MCML circuit. It is a key parameter for signal regeneration and stability of the MCML circuit. For SRMCML, $A_V$ is expressed as (12), where $g_m$ is the transconductance, $C_{OX}$ is the oxide capacitance of the transistors, $\mu_n$ is the electron mobility, and $W_{eff}$ and $L_{eff}$ are the effective width and length of the transistors, respectively. The voltage gain $A_V$ must be greater than 1 under all process and voltage deviations. A 40% margin is sufficient to cover variations in process, voltage, and matching conditions. Therefore, the lower limit of the voltage gain $A_V$ in our work is set to 1.4.

Voltage Swing Ratio (VSR).
The ideal operation of MCML circuits is that of a perfect current switch, where all the current flows down one branch or the other. In reality, a small amount of the biasing current flows in the "off" path, resulting in a reduction in the output voltage swing. We therefore constrain the output voltage swing to be at least 95% of the applied input voltage.

Signal Slope Ratio (SSR).
Since the speed of the MCML gate depends not only on the propagation delay but also on the output waveform shape of the previous gate, reasonable rise and fall times must be ensured. The signal slope ratio (SSR) used in this work is defined as the ratio of the rise and fall time ($t_{rf}$) to the propagation delay ($T_d$):
$$SSR = \frac{t_{rf}}{T_d} \tag{13}$$
This metric should be kept as low as possible. We set an absolute upper limit of 5 for this constraint.

Current Matching Ratio (CMR).
This constraint is the ratio of the current flowing through the actual current source to the reference biasing current. In order to achieve design predictability, the actual current should be close to the reference biasing current. The main parameters that affect this ratio are the output impedance of the biasing current source (the transistor $N_s$) and its drain-source voltage.

Noise Margin (NM).
A sufficiently large noise margin (NM) should be achieved in SRMCML circuits because of their reduced voltage swings. NM is given by (14). Because of their high noise immunity, SRMCML circuits can tolerate small NM values. In practice, an NM of 40% of the swing voltage ($\Delta V$) is sufficient to ensure proper operation of SRMCML circuits without degrading performance.

Voltage Swing (ΔV).
For SRMCML circuits, the logic swing $\Delta V$ should be correctly selected. For Figure 4(a), the logic low voltage must be high enough that the NMOS transistors $N_1$ and $N_2$ work in the saturation state, which gives the condition (15), where $V_{TH,N_1}$ is the threshold voltage of the NMOS transistors $N_1$ and $N_2$. At the same time, the logic low voltage ($V_{DD} - \Delta V$) must be low enough that the input NMOS transistor of the next SRMCML circuit can be turned off reliably, which gives the condition (16), where $V_{gs,sat}$ is the gate-source voltage of the turned-on transistor $N_1$. A similar analysis can be carried out for the other SRMCML gates shown in Figure 4, and the same conclusions as (15) and (16) are obtained.

Power Dissipation ($P$) and Delay Time ($T_d$).
Similar to DRMCML circuits, the important performance metrics of SRMCML gates include power consumption, propagation delay, and power-delay product. Due to the constant biasing current, for given $V_{DD}$ and $I_B$, the power consumption of an SRMCML gate is almost independent of the switching frequency, logic function, and fanout. It can be written as
$$P = V_{DD}\, I_B \tag{17}$$
Assuming that the whole $I_B$ ideally flows through one branch of the differential pair and charges the load capacitance $C$ of the SRMCML gate, its delay time is given by
$$T_d = \frac{C\, \Delta V}{I_B} \tag{18}$$
where $C$ is the load capacitance on the output node. From (18), the delay of the SRMCML gate decreases linearly as the signal swing decreases.
The power-delay product can be calculated as
$$PDP = P \cdot T_d = V_{DD}\, C\, \Delta V \tag{19}$$
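A short numerical sketch of these three metrics follows, assuming the first-order forms $P = V_{DD} I_B$, $T_d = C \Delta V / I_B$, and $PDP = P \cdot T_d$. The 8 μA bias current is the value used in the simulations of this paper; the logic swing and load capacitance are hypothetical placeholders, and the function name is illustrative only.

```python
def srmcml_metrics(v_dd, i_b, delta_v, c_load):
    """First-order power (W), delay (s) and power-delay product (J) of one gate."""
    power = v_dd * i_b                # static power set by the constant bias current
    delay = c_load * delta_v / i_b    # bias current charging the load through the swing
    return power, delay, power * delay


I_B = 8e-6       # bias current used in the paper's simulations (8 uA)
DELTA_V = 0.25   # hypothetical logic swing in volts (assumption)
C_LOAD = 2e-15   # hypothetical load capacitance in farads (assumption)

for v_dd in (0.7, 1.0, 1.3):
    p, t, pdp = srmcml_metrics(v_dd, I_B, DELTA_V, C_LOAD)
    print(f"VDD={v_dd:.1f} V  P={p * 1e6:.2f} uW  Td={t * 1e12:.1f} ps  PDP={pdp * 1e18:.1f} aJ")
```

In this first-order model the delay does not change with the supply voltage, so both the power and the power-delay product scale linearly with $V_{DD}$.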

Performance Optimization Algorithm for SRMCML Circuits
The relationships between the performance metrics and the design parameters involve several aspects.
(a) We should construct the mathematical model relating delay time, power dissipation, and device dimensions.
(b) For a given $T_d$, we need to find the operating current of the SRMCML gate at which the power dissipation is optimal.
(c) We need to determine the device dimensions from the biasing current.
The first step in the optimization algorithm is to initialize $V_{DD}$ = 1.3 V. For a number of discrete values of $I_B$ ranging from 0.5 μA to 100 μA, we try to find the channel widths of the PMOS load transistors, of the fully differential pull-down network consisting of $N_1$ and $N_2$, and of the biasing current source $N_s$. In the next loop of the optimization algorithm, we try to find $\Delta V$: for each $I_B$, we choose and fix a $\Delta V$. In the third loop, we choose and fix the length of the PMOS loads. Finally, within each iteration, we choose and fix the best width of the differential pull-down switch network.
This optimization procedure is illustrated in Algorithm 1. In Algorithm 1, LRFP is the length of the PMOS transistors $P_1$ and $P_2$, and $W_1$ and $W_2$ are the widths of the NMOS transistors $N_1$ and $N_2$. By carrying out the performance optimization algorithm, we can obtain the optimal values of $I_B$, $\Delta V$, the length LRFP of the PMOS transistors, and the widths of the NMOS transistors $N_1$ and $N_2$.
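Algorithm 1 itself is not reproduced in this text. The nested-loop structure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `evaluate` callback (standing in for a circuit simulation), the dictionary keys, and the choice of power-delay product as the single objective are all hypothetical; the constraint limits (gain of at least 1.4, VSR of at least 0.95, SSR of at most 5) are the ones given in Section 5.

```python
def optimize_srmcml(evaluate, i_b_values, delta_v_values, lrfp_values, width_values,
                    min_gain=1.4, min_vsr=0.95, max_ssr=5.0):
    """Nested search over I_B, delta V, LRFP and pull-down width.

    evaluate(i_b, delta_v, lrfp, width) is a caller-supplied (hypothetical)
    function, e.g. a wrapper around a circuit simulation, returning a dict with
    the keys "gain", "vsr", "ssr", "power" and "delay".
    Returns the feasible point with the lowest power-delay product, or None.
    """
    best = None
    for i_b in i_b_values:                  # outer loop: bias current
        for delta_v in delta_v_values:      # second loop: logic swing
            for lrfp in lrfp_values:        # third loop: PMOS load length
                for width in width_values:  # inner loop: pull-down width
                    m = evaluate(i_b, delta_v, lrfp, width)
                    # Hard constraints must not be violated.
                    if m["gain"] < min_gain or m["vsr"] < min_vsr or m["ssr"] > max_ssr:
                        continue
                    pdp = m["power"] * m["delay"]
                    if best is None or pdp < best["pdp"]:
                        best = {"i_b": i_b, "delta_v": delta_v, "lrfp": lrfp,
                                "width": width, "pdp": pdp}
    return best


def toy_evaluate(i_b, delta_v, lrfp, width, v_dd=1.3, c_load=2e-15):
    """Made-up closed-form stand-in for a simulator; not a device model.

    lrfp is accepted but ignored in this toy model.
    """
    return {
        "gain": 4.0 * delta_v * width,  # arbitrary trend: more swing or width, more gain
        "vsr": 0.99,
        "ssr": 2.0,
        "power": v_dd * i_b,
        "delay": c_load * delta_v / i_b,
    }


best = optimize_srmcml(toy_evaluate,
                       i_b_values=[0.5e-6, 8e-6, 100e-6],
                       delta_v_values=[0.1, 0.2, 0.3],
                       lrfp_values=[1.0],
                       width_values=[1.0, 2.0])
print(best)
```

With the toy evaluator, the search settles on the smallest swing that still meets the gain constraint, which mirrors the trade-off between delay and regeneration described in Section 5. In practice `evaluate` would be replaced by simulation results for each candidate sizing.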

Simulations and Analyses
In this section, near-threshold mode-10 counters are realized using the basic SRMCML and using the optimization algorithm. The influence of process and temperature variations on the SRMCML circuits is analyzed. A conventional static CMOS mode-10 counter is used for comparison with the SRMCML ones in terms of power, delay, and power-delay product.
The structure of the decimal counter based on SRMCML is shown in Figure 10. The decimal counter consists of NAND2, NAND3, and dynamic DFF cells. The PMOS load transistors and the biasing current source NMOS transistor are sized with W/L = 8λ/10λ and 16λ/4λ, respectively. The NMOS transistors of the SRMCML are sized with W/L = 4λ/2λ, and λ = 65 nm.
The SRMCML dynamic DFF and the decimal counter based on the performance optimization are simulated using HSPICE in the 130 nm CMOS process. The simulation frequency of all the MCML circuits is 1 GHz.
The performance of the circuits is sensitive to process and temperature variations, especially at low supply voltages. As mentioned in Section 5, in the design of SRMCML circuits, a sufficient margin on the hard constraints, including gain, voltage swing ratio (VSR), and signal slope ratio (SSR), must be set to cover variations in process, voltage, and matching conditions.
Worst-case corner simulations of the SRMCML mode-10 counters have been carried out to take process variations into account. To account for temperature variations, the circuits have also been simulated at operating temperatures ranging from 4 °C to 60 °C. The results show that the SRMCML mode-10 counters function correctly under worst-case corners and temperature variations for supply voltages ranging from 0.7 V to 1.3 V.
According to (11), the minimum supply voltage of SRMCML circuits depends on the model parameters of the corresponding MOS transistors. Therefore, process and temperature variations affect the minimum supply voltage through the variation of these transistor parameters. The worst-case corner simulations show that process variations result in about a 5.7% reduction of the minimum supply voltage of the SRMCML mode-10 counters. Over the operating temperature range from 4 °C to 60 °C, the minimum supply voltage of the SRMCML mode-10 counters deviates by about 7.6% from the value estimated according to (11).
The power dissipation and delay of the SRMCML decimal counters are compared in Figure 11. From Figure 11(a), the power dissipation of the SRMCML mode-10 counter based on the basic SRMCML and that of the counter using the optimization algorithm are almost the same. As shown in (17), the power dissipation of the SRMCML decimal counters decreases linearly as the supply voltage is scaled down.
Figure 11(b) shows that the SRMCML mode-10 counter using the performance optimization algorithm attains lower delay than the basic SRMCML one. According to (18), the delay of SRMCML circuits is independent of the supply voltage. However, since the delay of the two transmission gates ($T_1$ and $T_2$) in the dynamic flip-flops of the SRMCML counters increases as the supply voltage is scaled down, the total delay of the SRMCML mode-10 counters rises slightly as the supply voltage is scaled down.
For comparison, a mode-10 counter based on conventional static CMOS, using the transmission-gate flip-flop with master-slave construction, has also been simulated in the same CMOS technology. Its power dissipation and delay are compared with those of the SRMCML counters in Figure 11. As expected, the delay of the SRMCML counters is much smaller than that of the conventional static CMOS one, and thus SRMCML can operate at higher speed than CMOS.
From Figure 11(a), the power consumption of the SRMCML mode-10 counters is higher than that of the static CMOS one. Although the static CMOS mode-10 counter consumes much less power than the SRMCML ones at low supply voltages, its delay rises dramatically as the supply voltage is scaled down.
The power-delay product metric provides a good tradeoff between power and delay. The power-delay products (PDP) of the SRMCML and static CMOS decimal counters are compared in Figure 12. From Figure 12, the power-delay product of the SRMCML counter using the performance optimization algorithm is smaller than that of the basic SRMCML counter because of the reduced circuit delay. The SRMCML mode-10 counters attain lower PDP than conventional static CMOS, especially at low supply voltages.

Conclusions
Scaling down the supply voltage of single-rail MOS current mode logic (SRMCML) circuits can effectively reduce their power consumption, because their power dissipation is directly proportional to the supply voltage. However, the supply voltage of SRMCML circuits has a lower limit for ensuring proper operation. In this work, an analysis model for calculating the minimum supply voltage of SRMCML circuits is addressed. The relationship between the minimum supply voltage and the model parameters of the MOS transistors has been derived, so that the minimum supply voltage of SRMCML circuits can be estimated before circuit design. A dynamic flip-flop based on the SRMCML structure is also addressed in this work. An optimization algorithm for near-threshold computing with SRMCML circuits is proposed. Scaling down the supply voltage of SRMCML circuits is investigated, and the power dissipation, delay, and power-delay product of these circuits are compared. The results show that near-threshold SRMCML circuits achieve lower delay and a smaller power-delay product than the conventional static CMOS one.

Figure 6: Minimum supply voltage of the two-level SRMCML logic circuits.

Figure 8: Power dissipation and delay of SRMCML cells. (a) Power dissipation and (b) delay.

Figure 9: Power-delay products of the dynamic DFF based on SRMCML.

Figure 11: Power dissipation and delay of the mode-10 counter based on SRMCML. (a) Power dissipation and (b) delay.

Figure 12: Power-delay products of the mode-10 counter.