Reconfigurable Shift Switching Parallel Comparators

This Article is brought to you for free and open access by the Computer Science at ODU Digital Commons. It has been accepted for inclusion in Computer Science Faculty Publications by an authorized administrator of ODU Digital Commons. For more information, please contact digitalcommons@odu.edu. Repository Citation Lin, R. and Olariu, S., "Reconfigurable Shift Switching Parallel Comparators" (1999). Computer Science Faculty Publications. 38. https://digitalcommons.odu.edu/computerscience_fac_pubs/38


INTRODUCTION
Recently, shift switch logic (defined by state signals and shift switches; refer to [6-9] and see below) has been proposed. It has been shown that the new logic is an efficient alternative to traditional switching logic (i.e., binary signals and logic gates) for the design of a number of arithmetic devices, including parallel counters, multipliers, and fast adders [5-10].
In this paper we propose novel VLSI asynchronous comparator schemes based on shift switch logic and traditional (precharged) CMOS domino logic [2, 4, 13, 14], which we call shift switching with domino logic. The technique allows state signals to be broadcast (on short reconfigurable buses [3, 7, 10] containing cascaded shift switches) as a chain of pull-downs and/or pull-ups of the precharged switches. It possesses the following attractive features: (1) The comparators produce a two-bit output xy which represents the comparison result for two inputs a and b, such that xy = 01, 10, 11 indicates a > b, a < b, a = b respectively, and xy = 00 if the result is not ready. The comparators also produce (in parallel) a semaphore bit to indicate the end of each domino propagation, so the circuits can run asynchronously without high-frequency clocking schemes; (2) The comparison algorithms consist of two phases, precharge and evaluation (with a single control signal called P/E). The precharge starts when P/E = 0 and is done in parallel (in less than a full adder delay). The evaluation begins when P/E = 1 and consists of several domino discharging processes (two for the 32-bit comparator); (3) It is faster than traditional designs [11-14], since the full inherent speed of the computation can be utilized; and (4) Because the comparator delay is determined by the individual input data rather than by the worst-case input, the computation is significantly faster than the worst-case delay for a large percentage of inputs; in addition, the comparator is not delay sensitive. Moreover, the proposed method is applicable to other asynchronous arithmetic circuits, such as fast adders [8].

*Corresponding author, e-mail: olariu@cs.odu.edu
The remainder of the paper is organized as follows: Section 2 defines state signals and GP switches. Section 3 presents a 7-bit comparator. Section 4 presents a 32-bit comparator.

STATE SIGNALS AND SHIFT SWITCHES
Shift switch logic is defined by state signals, which represent small integers, and shift switches, which are the basic logic units that operate on state signals (refer to [6-9]). To be self-contained, we give the relevant definitions below.

DEFINITION 1 A state signal with value I (0 ≤ I ≤ w-1; w ≥ 2) is represented by the bit sequence b_0, b_1, ..., b_{w-1} (here we assume the order is either right to left or top-down) with the unique bit u (either 0 or 1) in the I-th position (see Fig. 1). A state signal is said to be of n- (p-) type if u = 0 (1). A state signal may be denoted by I(w) regardless of type, or by I (value only).

DEFINITION 2 A degenerate state signal with value I (0 ≤ I ≤ w-1) is state signal I with the 0-th bit removed, thus represented by b_1, b_2, ..., b_{w-1} (Fig. 2). Its value is 0 if there is no unique bit in the sequence; otherwise it is equal to the value of the corresponding state signal (with bit 0 added). It can be seen as a special state signal, and is said to be of n- (p-) type if u = 0 (1).

DEFINITION 3 (shift switch in general) Given a function F(X, Y) = (U, V), or F(X, Y) = U, where X, Y, U and/or V are (small) state signals, a pass-transistor (or transmission gate) based digital filter which implements F by shifting (in some way) the input state signals to obtain the outputs is called a shift switch of function F.

In this paper we consider only a specific type of shift switch called the GP switch.

DEFINITION 4 ... propagate) shift switch. Figure 3 shows an example of such a switch.

DEFINITION 5 If a GP shift switch has an additional input called P/E (precharge/evaluation), such that P/E, X, Y and U are defined by Table I ... The examples of GP1 and GP2 are depicted in Fig. 4.

The significance of using state signals and shift switches lies in the fact that some basic arithmetic and logic operations (such as adding two small integers, and the evaluation of logic functions like GP) can be done by shifting one state signal X in a way controlled by the other state signal Y. The operations can be performed efficiently by shift switches, essentially by letting X pass through a small number of pass-transistors (or transmission gates) that are preset by Y. Also, since both the shifted and the control signals are state signals, we can use the shift-out of one switch to control another switch; thus we may organize an efficient network of shift switches (or enhanced reconfigurable buses) for a desired arithmetic computation.
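The definitions above can be illustrated with a small software model. The following is only a sketch under stated assumptions: the function names are hypothetical, bits are stored as a list [b_0, ..., b_{w-1}], and "shifting X under control of Y" is modelled as a cyclic shift, which adds the two values modulo w. The paper's switches are pass-transistor circuits, not software, and may handle overflow differently.

```python
# Illustrative model only; all names and the cyclic-shift convention are assumptions.

def encode(value, w, u=1):
    """Bits [b_0, ..., b_{w-1}] of a state signal; u=1 is p-type, u=0 is n-type."""
    if w < 2 or not 0 <= value <= w - 1:
        raise ValueError("need w >= 2 and 0 <= value <= w-1")
    return [u if i == value else 1 - u for i in range(w)]

def decode(bits, u=1):
    """Position of the unique bit u, i.e. the value of the state signal."""
    if bits.count(u) != 1:
        raise ValueError("not a valid state signal")
    return bits.index(u)

def degenerate(bits):
    """Remove bit 0, leaving b_1, ..., b_{w-1}."""
    return bits[1:]

def decode_degenerate(bits, u=1):
    """Value 0 if no unique bit is present, else its position counting the removed bit 0."""
    return bits.index(u) + 1 if u in bits else 0

def shift(x_bits, y_bits, u=1):
    """Shift X cyclically by the value of Y; the result encodes (X + Y) mod w."""
    k = decode(y_bits, u)
    w = len(x_bits)
    return [x_bits[(i - k) % w] for i in range(w)]
```

For example, shifting the width-4 signal with value 1 by the width-4 signal with value 2 yields the signal with value 3, and the degenerate n-type signal derived from value 0 is all ones, which decodes back to 0.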

THE 7-BIT COMPARATOR: A PRECHARGED CMOS SCHEME
A linear array of the proposed GP switches, each associated with a CMOS domino logic unit called a comp unit, can implement a VLSI-efficient small-size comparator, such as a 7-bit comparator. Figures 5 and 6 show the evaluation phases of a comp unit for bit j (for 1 ≤ j ≤ 6) and of the comp unit for bit 0, respectively. Precharge and evaluation are the two phases of the computation on each of these logic units. During the precharge phase the control signal P/E (precharge/evaluation, which may be received from a previous asynchronous device) is set to 0. It is easy to verify that a comp unit is precharged as follows: g1_j (generate-1), g0_j (generate-0), and p_j (propagate) are all low (for 1 ≤ j ≤ 6); the two output bits (g1_0, g0_0) of comp unit 0 are both high. When P/E is set to 1, the evaluation phase begins. The logic units are discharged as follows. For comp units j (1 ≤ j ≤ 6): g1_j (or g0_j) is discharged to high if a_j > b_j (or b_j > a_j); p_j is discharged to high if a_j = b_j. The state signal cb(j) = (g1_j, g0_j, p_j) is called the (j-th) comp-bit state signal. For comp unit 0, g1_0 (or g0_0) is discharged to low if a_0 < b_0 (or b_0 < a_0).
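As a logical illustration of the comp-bit state signal for bits 1 to 6, the sketch below computes (g1_j, g0_j, p_j) from one pair of input bits. It models only the settled logic values after evaluation, not the precharge/discharge electrical behaviour, and it deliberately leaves out comp unit 0, whose outputs use the opposite polarity in the text. The function names are hypothetical.

```python
# Logical view only; function names are illustrative assumptions.

def comp_bit(aj, bj):
    """Return (g1, g0, p) for one bit pair after evaluation."""
    g1 = int(aj > bj)   # generate-1: a_j > b_j
    g0 = int(bj > aj)   # generate-0: b_j > a_j
    p = int(aj == bj)   # propagate:  a_j = b_j
    return (g1, g0, p)

def comp_bits(a, b):
    """Comp-bit signals for bits j = 1..6 of two 7-bit inputs, keyed by j."""
    return {j: comp_bit((a >> j) & 1, (b >> j) & 1) for j in range(1, 7)}
```

Exactly one of the three bits is 1 for any input pair, so each tuple is a valid width-3 state signal with the unique bit 1.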
Figure 7 shows the 7-bit comparator scheme. It consists of six GP1 switches connected with seven comp units (including comp unit 0). When signal P/E is set to 0 (i.e., its complement is set to 1), all GP1 switches and comp units are precharged as described above. The precharge phase is as follows: each propagation gate in GP1 is off and will be kept off if the corresponding comp-bit evaluation signal cb(j) = (g1_j, g0_j, p_j) ≠ (0, 0, 1) (1 ≤ j ≤ 6).
This ensures a stable discharge during the evaluation phase, i.e., the domino discharging can ...

FIGURE 4 Precharged CMOS schematics of GP1 and GP2. Note that P/E is the enable signal, with values 0 and 1 for precharge and evaluation respectively.

The evaluation phase is as follows: First, the discharge of each comp unit j generates the evaluation output signal cb(j) = (g1_j, g0_j, p_j) (see Fig. 5), which then discharges the corresponding GP1 switch if cb(j) ≠ 0; otherwise it turns on the corresponding propagation gates of the GP1 so that the comparison result of the lower bits can propagate.
The bus (with GP1s) is then partitioned (by the propagation gates), and in the worst case the comp evaluation signal propagation (or discharging) starts from gate A and/or B and passes through all six GP1s to produce cg0. The worst-case delay of the comparator is T_c (time to discharge a comp unit) + 7T_GP (time to discharge seven cascaded pass-transistors including A or B) + T_inv (an inverter delay, low to high). The best-case delay of the comparator is T_c + T_GP + T_inv.
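The data-dependent delay can be sketched with a simple behavioural model. The text fixes only the two endpoints: one pass-transistor delay in the best case and seven in the worst case. The linear rule used below for the intermediate cases (7 - j pass-transistors when bit j is the highest differing bit) is an assumption made for illustration, as are the function names and the unit delay values.

```python
# Behavioural sketch; the 7 - j rule and the unit delays are assumptions.

def compare7(a, b):
    """Return (result, n_pass): result is '>', '<' or '=', and n_pass is the
    assumed number of cascaded pass-transistors the deciding signal traverses."""
    for j in range(6, 0, -1):            # bits 6..1, most significant first
        aj, bj = (a >> j) & 1, (b >> j) & 1
        if aj != bj:                      # this comp unit generates the result
            return ('>' if aj > bj else '<'), 7 - j
    a0, b0 = a & 1, b & 1                # bits 6..1 all propagate
    result = '>' if a0 > b0 else '<' if a0 < b0 else '='
    return result, 7                     # signal starts at gate A or B

def delay(n_pass, t_c=1.0, t_gp=1.0, t_inv=1.0):
    """Comparator delay T_c + n_pass * T_GP + T_inv in arbitrary units."""
    return t_c + n_pass * t_gp + t_inv
```

Under these assumptions, inputs that differ in bit 6 give `n_pass = 1` (the best case, T_c + T_GP + T_inv), while inputs whose bits 6 to 1 are all equal give `n_pass = 7` (the worst case, T_c + 7T_GP + T_inv).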
4. THE 32-BIT SCHEME: SHIFT SWITCH WITH DOMINO LOGIC
A 32-bit comparator can be constructed from several GP switch blocks organized in two levels (see Fig. 8). The first level consists of six blocks: blockA(0), i.e., the 7-bit comparator, and blockA(i) (for 1 ≤ i ≤ 5). The second level is a single block termed blockB.
Figure 9 shows blockA(i) (for 1 ≤ i ≤ 5). It contains five cascaded GP2 shift switches and five comp units. It produces the state signal cg(i) = (gg1, gg0, pg), which represents the group's comparison result, i.e., values 2, 1, 0 represent the relationships >, < and = between the corresponding bit segments of a and b; it is called the (i-th) comp-group output (state) signal. In other words, gg1 = 1 (or gg0 = 1, or pg = 1) if the group's bit segment of a is greater than (or less than, or equal to) the bit segment of b. The worst-case and best-case delays of the first level are dominated by the longest block, blockA(0), i.e., T_c + 7T_GP + T_inv and T_c + T_GP + T_inv respectively.

Figure 10 shows the second-level block, i.e., blockB. It is similar to blockA(0), except for the following: (1) it has five cascaded GP1 switches, its five vertical inputs are the comp-group output signals from blockA(i) (for 1 ≤ i ≤ 5), and its horizontal input (for evaluation) is the degenerate comp-group output cg0 of blockA(0); (2) its two outputs are a 2-bit degenerate comp-all-groups output signal cg_all and a semaphore. The 2-bit signal cg_all, which is the final comparison result, has a value of 0, 1, or 2, i.e., 11, 10 or 01, which represents a = b, a < b and a > b respectively. If cg_all is 00 the semaphore is 0, meaning the result is not ready; otherwise the semaphore is 1, meaning the result is ready.

[Figure: the 32-bit comparator, (a) precharge phase (the values of the inputs are "don't care"). blockA(0) receives a0-6, b0-6; blockA(1) to blockA(5) receive the segments 7-11, 12-16, 17-21, 22-26 and 27-31 of a and b; blockB produces the output and the semaphore.]

The output bits and the semaphore are produced at about the same time. This is the unique feature of our scheme. More significantly, the best-case delay of the comparator can be specified as three cascaded pass-transistor delays plus two inverter delays (3T_GP + 2T_inv), while the worst-case and average-case delays can be specified as 13T_GP + 2T_inv and 6T_GP + 2T_inv respectively. This implies that the asynchronous scheme can be adapted to significantly reduce the average-case delay of a comparator (by automatically choosing single or double instruction cycles for the computation). According to a simulation program, the comparator is expected to be about 60% faster for 50% of the inputs and about 30% faster for the other 50%. Note that if a synchronous comparator is preferred, we can modify the first level of GP switches to implement only g1 and p (i.e., removing the g0 circuit), which saves eight transistors per input bit. However, such a synchronous comparator can provide only one output bit, indicating either a > b or a ≤ b (or alternatively a ≥ b or a < b). To provide all three comparison results, two such comparators may be required.
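The two-level organization and the output encoding can be summarized in a short behavioural sketch. It assumes the segment boundaries implied by the text (bits 0-6 for blockA(0) and five-bit segments up to bit 31 for blockA(1) through blockA(5)) and models only the logical result, not timing or circuit structure. All function names are hypothetical.

```python
# Behavioural sketch of the two-level scheme; names are illustrative assumptions.

# (low bit, width) for blockA(0) .. blockA(5)
SEGMENTS = [(0, 7)] + [(7 + 5 * (i - 1), 5) for i in range(1, 6)]

def comp_group(a, b, low, width):
    """First level: value 2, 1 or 0 for >, < and = on one bit segment."""
    mask = (1 << width) - 1
    sa, sb = (a >> low) & mask, (b >> low) & mask
    return 2 if sa > sb else 1 if sa < sb else 0

def encode_output(value):
    """Map a result value to (xy, semaphore); None models the precharged state."""
    xy = "00" if value is None else {0: "11", 1: "10", 2: "01"}[value]
    return xy, int(xy != "00")

def compare32(a, b):
    """Second level: the most significant non-equal group decides the result."""
    groups = [comp_group(a, b, low, w) for low, w in SEGMENTS]
    value = next((g for g in reversed(groups) if g != 0), 0)
    return encode_output(value)
```

For instance, `compare32(5, 5)` gives `("11", 1)` for a = b, and `encode_output(None)` gives `("00", 0)`, the not-ready state in which the semaphore is 0.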
We can count the number of devices (in terms of pass-transistors) for both the asynchronous and the synchronous comparators, as listed in Table III. It is clear that the synchronous schemes save about 30% in area; however, the asynchronous schemes are more compact in terms of area per output bit.

TABLE Shift ... evaluation (P/E)

TABLE III Area comparisons of comparators