The design of addition arithmetic plays a crucial role in high-performance digital systems. This paper proposes a systematic method to formalize and verify adders in the Coq proof assistant. The proposed approach succeeds in formalizing the gate-level implementations and verifying the functional correctness of the adders of greatest interest to industry, in a faithful, scalable, and modular way. The methodology can be extended to other adder architectures as well.
1. Introduction
Demonstrating the functional correctness of an arithmetic implementation has been a challenging problem for several decades. Testing and simulation, the traditional methods, have earned a good reputation and are employed extensively in industry. When dealing with large-scale designs, however, these methods may find counterexamples but cannot establish that a design is correct, because exhaustive coverage is impractical.
As an alternative, formal methods have been increasingly adopted to validate arithmetic implementations. A main branch of formal methods is model checking, which is known for its automation and has succeeded in numerous industrial applications. However, the inherent state explosion problem prevents it from scaling to large designs.
Another branch of verification is theorem proving, which is not limited by scale in the way that model checking, testing, and simulation are. The main obstacle to its widespread use is that it requires a strong background in logic and heavy user interaction. Nevertheless, there have been quite a few successful applications with different theorem provers. With Boyer-Moore, a microprocessor is verified in [1], and an N-bit comparator as well as mean-value circuits are verified in [2]. With HOL, a ripple carry adder and a sequential device are verified in [3], and an ATM switch fabric is verified in [4]. With Coq, a sequential multiplier is verified in [5], and an asynchronous transfer mode switch fabric is verified in [6].
The main effort of this work is to propose a holistic methodology to formalize and verify adders in Coq [7]. Adders are chosen because they are the most fundamental arithmetic units widely employed in various advanced digital systems, such as IBM POWER6, whose correctness depends significantly on the correctness of its addition subcomponents. This methodology provides a uniform way to formalize and verify various implementations of arithmetic addition, and it is applied in this work to formalize and verify primary and high speed adders of interest in industry, including Carry Look-ahead Adder (CLA), Ling Adder (LA), and Parallel Prefix Adder (PPA).
Benefiting from the techniques of Coq, the methodology offers the following features.
Scalability: the formalization of an adder is parameterized by a natural number (named length) and the correctness proof applies to any length.
Modularization: the verified adders are encapsulated as instances of an abstract module, which provides a uniform way to reuse them in advanced arithmetic units. The formalization and verification of an advanced arithmetic unit can be built up from verified units without regard to their detailed implementations.
Fidelity: the adders are formalized by (recursive) functions, which correspond clearly to the gate-level implementations of circuits. The addends and sum of an adder are formalized as vectors, which are a faithful model of arrays and also provide additional type checking that prevents potential misuse of inputs.
The rest of the paper is organized as follows. Related work is introduced in Section 2. To our knowledge, we verify not only most adders appearing in the literature, but also some for the first time by theorem proving. Section 3 explains our methodology in detail by the example of ripple carry adder. Preliminaries are introduced as needed. Some definitions and most proofs are not presented in this paper, but they are available on the author’s webpage (http://superwalter.github.io/dev/veriadder.zip). Sections 4 and 5 are devoted to LA and PPA, respectively.
2. Related Work
Although primary adders are used extensively, their verification by theorem proving is not readily available. In particular, to our knowledge, the formalization and verification of the Ling adder has not appeared in the literature. Reference [8] proves the correctness of RCA by formalizing adders with dependent types in Coq. Reference [9] proves the correctness of RCA in higher-order logic with a reusable library for formalizing circuits. Reference [2] verifies RCA written in VHDL, as well as other circuits, in higher-order logic. Reference [10] develops a semiformal correctness proof of CLA or PPA. Reference [11] shows a pencil-and-paper proof of the general prefix adders, as well as the proof of the related RCA. Furthermore, [12] formalizes and verifies these adders in Coq. By rewriting and induction, [13] provides the verification of PPA using powerlists. An algebraic formalization of PPA and its correctness proof are presented in [14]. Besides being applied to formalize and verify most primary adders, our methodology also provides features that appear only partially elsewhere in the literature and, to our knowledge, have never been integrated in any previous work.
3. A Holistic Methodology
Various kinds of adders are designed to provide relatively good performances for different circumstances, while they implement the same addition functionality. A holistic methodology is proposed in this work in order to capture all the different adders and provide desired good features.
3.1. A Unified Proof Structure
Basically, the methodology answers four questions:
how to formalize the related data types;
which method is used to formalize an adder;
what should be proved;
how to organize formalizations and verifications for different adders.
These questions are answered by a uniform specification, utilizing the module system of Coq.
Definition mbadder (n: nat) :=
  data (S n) -> data (S n) -> bit -> hyb (S n).
Definition mbadder_correct n (f: mbadder n) :=
  forall (X Y: data (S n)) c,
    |[X]| + |[Y]| + |c| = |(f X Y c)|.
Module Type GenAdder.
  Parameter adder: forall n, mbadder n.
  Axiom adder_correct: forall (n: nat),
    mbadder_correct (@adder n).
End GenAdder.
Lines 1 and 2 answer the first two questions. n, in line 1, is a parameter (named length) indicating the inherent nature of an adder: how many bits it can process. The input carry-in and output carry-out are formalized by Booleans (bit). The input addends and the returned sum are formalized by vectors of Booleans (data m), which are dependent types depending on the length m. hyb m is another dependent type standing for a tuple of a bit and an m-bit vector, which is used in line 2 for combining the carry-out and the sum. Thus, an adder is formalized as a function taking two addends and a carry-in as inputs and returning a tuple of carry-out and sum. This function is normally defined recursively, as shown later.
Lines 3, 4, and 5 answer the third question. The correctness of an adder is ensured by proving that the sum of the natural number denotations of the inputs equals the natural number denotation of the output. In line 5, |b| is the natural number denotation of a bit b. |[v]| and |(t)| are the natural number denotations of the vector v and the result tuple t. Big endian is chosen to implement these two functions.
Lines 6–10 answer the last question. A general adder is formalized as an abstract module. The specification is assigned and the correctness is required. A verified adder, such as a Ripple Carry Adder (RCA), should be an instance of it.
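The three denotations can be illustrated on a simplified model. The following is a minimal sketch, not the development's own code: plain lists stand in for the dependent vector type, and the names bit_val, vec_val, and hyb_val are assumptions introduced here for |b|, |[v]|, and |(t)|.

```coq
Require Import List.
Import ListNotations.

(* |b| : denotation of a single bit. *)
Definition bit_val (b : bool) : nat := if b then 1 else 0.

(* |[v]| : big-endian denotation, most significant bit first.
   A plain list replaces the paper's dependent vector (assumption). *)
Definition vec_val (v : list bool) : nat :=
  fold_left (fun acc b => 2 * acc + bit_val b) v 0.

(* |(t)| : denotation of a (carry-out, sum) tuple; the carry-out
   has weight 2^(length of the sum). *)
Definition hyb_val (t : bool * list bool) : nat :=
  bit_val (fst t) * Nat.pow 2 (length (snd t)) + vec_val (snd t).

(* 1011 + 0110 + 1 = 11 + 6 + 1 = 18 = (carry 1, sum 0010). *)
Example spec_instance :
  vec_val [true; false; true; true]
    + vec_val [false; true; true; false]
    + bit_val true
  = hyb_val (true, [false; false; true; false]).
Proof. reflexivity. Qed.
```

The example is one instance of the equation required by the correctness predicate, with the output tuple written out by hand rather than produced by an adder.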
3.2. An Example Explaining the Methodology
Carry Look-ahead Adder (CLA) improves RCA by computing all the carries in advance in order to reduce the significant delay. This is represented, in the formalization, by extending the general module with abstract functions P, G, and carries which are supposed to compute all the propagated carries, generated carries, and carries, respectively, according to the inputs.
Module Type LookAheadAdder <: GenAdder.
  Parameter P: forall n, data n -> data n -> data n.
  Parameter G: forall n, data n -> data n -> data n.
  Parameter carries: forall n, data (S n) -> data (S n) -> bit -> hyb (S n).
  Parameter adder: forall n, mbadder n.
  Parameter adder_correct: forall n, mbadder_correct (@adder n).
End LookAheadAdder.
<: symbol in line 1 stands for the fact that this module should be an instance of the general verified adder. RCA is formalized according to the following equations:
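$$c_{i+1} = (x_i \wedge y_i) \vee ((x_i \oplus y_i) \wedge c_i) = g_i \vee (p_i \wedge c_i), \tag{1}$$
$$s_i = x_i \oplus y_i \oplus c_i = p_i \oplus c_i. \tag{2}$$
In CLA, the carry $c_{i+1}$ into each bit is computed by repeatedly unfolding $c_i$ in (1) until only $c_0$, the overall carry-in, remains, as shown by the following example:
$$\begin{aligned} c_3 &= g_2 \vee (p_2 \wedge c_2) \\ &= g_2 \vee (p_2 \wedge (g_1 \vee (p_1 \wedge c_1))) \\ &= g_2 \vee (p_2 \wedge g_1) \vee (p_2 \wedge p_1 \wedge c_1) \\ &= g_2 \vee (p_2 \wedge g_1) \vee (p_2 \wedge p_1 \wedge (g_0 \vee (p_0 \wedge c_0))) \\ &= g_2 \vee (p_2 \wedge g_1) \vee (p_2 \wedge p_1 \wedge g_0) \vee (p_2 \wedge p_1 \wedge p_0 \wedge c_0). \end{aligned} \tag{3}$$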
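The unfolding in (3) can be checked mechanically on plain Booleans. The sketch below is independent of the vector-based development; carry_step is a name assumed here for one application of (1).

```coq
Require Import Bool.
Open Scope bool_scope.

(* One ripple step, equation (1): c_{i+1} = g_i \/ (p_i /\ c_i). *)
Definition carry_step (g p c : bool) : bool := g || (p && c).

(* Three ripple steps agree with the flattened last line of (3). *)
Lemma c3_unfolded : forall g0 p0 g1 p1 g2 p2 c0 : bool,
  carry_step g2 p2 (carry_step g1 p1 (carry_step g0 p0 c0))
  = g2 || (p2 && g1) || (p2 && p1 && g0) || (p2 && p1 && p0 && c0).
Proof. intros [] [] [] [] [] [] []; reflexivity. Qed.
```

The proof is a brute-force case split over the seven Boolean inputs, which is feasible only because the width is fixed; the development itself handles arbitrary length by induction.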
This process as well as definitions of P and G are formalized as follows:
Definition P n (X Y: data n):= X ⊕ Y.
Definition G n (X Y: data n):= X ∧ Y.
Definition carries n (X Y: data (S n))
(cin: bit): hyb (S n).
induction n as [|n rec].
+ exact (bandor (Y ⊳) (X ⊳) cin, [cin]).
+ set (recs:= rec (X ⊲) (Y ⊲)).
exact (bandor (Y ⊳) (X ⊳) (recs1),
(recs1)⋈(recs2)).
Defined.
⊕ and ∧ in lines 1 and 2, and ∨ used later, lift the Boolean operations ⊕, ∧, and ∨ to vectors by applying them to the elements at the same position of the two vectors. The + symbols in lines 6 and 7 mark the start of the two branches of the induction, where n=0 and n=m+1. The ⊳ operator in line 6 returns the leftmost element of a vector. Correspondingly, the ⊲ operator in line 7 returns the rightmost n elements of an (n+1)-bit vector. [b] is a vector with a single bit b. A trailing 1 or 2, as in recs1 and recs2, selects the first or second component of a tuple. The ⋈ operator in line 9 joins a bit and an n-bit vector to form an (n+1)-bit vector.
The adder is defined as follows and its correctness is proved by induction on the length and reusing the correctness result of the full adder:
Definition adder: forall n, mbadder n.
intros n X Y cin.
set (cc:= carries (P X Y) (G X Y) cin).
exact (cc1, (P X Y) ⊕ (cc2)).
Defined.
Theorem adder_correct: forall n,
mbadder_correct (@adder n).
Proof. induction n as [|n rec].
… Qed.
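The full adder result that the induction reuses can be stated on its own. The following is a minimal sketch with assumed names (bit_val, full_adder), written on plain Booleans rather than on the development's types.

```coq
Require Import Bool.
Open Scope bool_scope.

(* Natural number denotation of a bit. *)
Definition bit_val (b : bool) : nat := if b then 1 else 0.

(* Equations (1) and (2) for one bit position:
   returns (carry-out, sum). *)
Definition full_adder (x y c : bool) : bool * bool :=
  let p := xorb x y in
  let g := x && y in
  (g || (p && c), xorb p c).

(* The carry-out has weight 2, the sum bit weight 1. *)
Lemma full_adder_correct : forall x y c : bool,
  bit_val x + bit_val y + bit_val c
  = 2 * bit_val (fst (full_adder x y c))
    + bit_val (snd (full_adder x y c)).
Proof. intros [] [] []; reflexivity. Qed.
```

This is the one-bit case of the specification; the inductive step of the length-indexed proof combines it with the induction hypothesis for the remaining bits.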
3.3. Features Provided by the Methodology
There are several benefits to the use of this methodology for the verification of adders.
3.3.1. Scalability
The formalization and verification of an adder scale to any data-width, because the length parameter can be instantiated with an arbitrary natural number. A 4-bit CLA can be obtained as follows:
Definition CLA4 := CLA 3.
Corollary CLA4_correct: forall
  (X Y: data 4) c,
  |[X]| + |[Y]| + |c| = |(CLA4 X Y c)|.
Proof. intros; apply CLA_correct. Qed.
Notice that a 4-bit CLA is CLA 3, because we require that the addends of the adders have at least one bit. The correctness proof of a CLA with a specified length follows directly from the proof of CLA with arbitrary length.
3.3.2. Modularization
Some high-speed adders divide the input addends into groups. Each group is calculated independently by a Carry Select Adder (CSA), and the groups are then concatenated in order. Since a CSA needs its input carry-in only in its very last steps, such designs have a shorter propagation time and thus higher performance. We formalize an abstract architecture for this kind of design, which illustrates the modularization of our method and may also contribute to the verification of complex adders in the future.
CSA takes an abstract verified adder as parameter and is also an instance of the general verified adder.
Module CSA (M: GenAdder) <: GenAdder.
  Definition adder n: mbadder n.
    intros X Y c.
    set (a1 := M.adder X Y true).
    set (a0 := M.adder X Y false).
    set (sum := (dmap (band c) a12) ∨ (dmap (band (¬c)) a02)).
    set (c' := (a11 ∧ c) ∨ (a01 ∧ (¬c))).
    exact (c', sum).
  Defined.
  Theorem adder_correct: forall n, mbadder_correct (@adder n).
  Proof. … rewrite M.adder_correct. … Qed.
End CSA.
Module CSA_CLA := CSA CLA.
Lines 2 to 10 define CSA. Two adders compute the sum and the carry-out for carry-in true and false in lines 4 and 5, respectively. The multiplexer then chooses the real sum and carry-out according to the actual carry-in in lines 6 and 7, since only then is the input carry required. dmap in line 6 applies a function to each element of a vector. The correctness of CSA holds because the addition units are correct; thus, CSA is an instance of the general adder. The parameterized module can be instantiated by any verified adder. Line 13 defines a CSA whose addition units are instantiated with CLA.
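The multiplexing step can be isolated as a small Boolean fact. The sketch below uses an assumed name, mux, for the gate-level selection applied per bit in lines 6 and 7; it is an illustration on plain Booleans, not part of the development.

```coq
Require Import Bool.
Open Scope bool_scope.

(* Gate-level 2-to-1 multiplexer: r1 is the result precomputed for
   carry-in true, r0 the result precomputed for carry-in false. *)
Definition mux (c r1 r0 : bool) : bool := (r1 && c) || (r0 && negb c).

(* The gate-level form selects exactly the result that matches
   the actual carry-in. *)
Lemma mux_select : forall c r1 r0 : bool,
  mux c r1 r0 = if c then r1 else r0.
Proof. intros [] [] []; reflexivity. Qed.
```

With this fact, the output of CSA for carry-in c coincides with the output of the underlying adder for the same c, which is why its correctness reduces to that of the parameter module.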
Module Type GroupAdder (M: GenAdder) <: GenAdder.
  Parameter part: list nat.
  Fixpoint adder_rec (n lens len: nat): (mbadder lens).
    destruct n.
    + exact (@M.adder lens).
    + specialize adder_rec with (1:=n)
        (lens:= pred (cur_index_abr n len)) (2:=len).
      ….
      exact (cast_comb (combination
        (@M.adder (lens - (cur_index_abr n len)))
        (adder_rec)) (aux _ _ Hc3 Hc2)).
  Defined.
  Definition adder n := adder_rec (sect n) n n.
  Lemma adder_correct: forall n, mbadder_correct (adder n).
  Proof. … Qed.
End GroupAdder.
The formalization and verification of this adder are quite complex because of the problem with dependent types described in [15, 16]; therefore, the unimportant details are omitted. The part in line 2 is a partition of the addends. This partition should be valid, which means that its elements are in strict order and do not exceed the total data-width. Lines 3 to 12 define the adder recursively by combining one adder with another that is itself the combination of the remaining groups of adders, obtained by recursion. combination in line 9 executes the combining operation. cast_comb in line 9 converts an adder of length m into an adder of length n, taking a proof of m=n as an argument. The initial values of this recursive function are specified in line 13. The correctness can be proved by induction on the length of the partition, using the correctness result for combining correct adders.
The parameterized module can be instantiated by any verified adder. If it is instantiated by CSA, it is a verification of many popular high speed adders.
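The correctness result for combining two correct adders can be sketched at the level of denotations. The notation here is assumed for illustration and does not appear in the development: the addends are split as $X = X_{hi} X_{lo}$ and $Y = Y_{hi} Y_{lo}$ with a low part of $k$ bits, $(c', S_{lo})$ is the output of the low adder on carry-in $c$, and $(c'', S_{hi})$ is the output of the high adder on carry-in $c'$.

```latex
\begin{align*}
|[X]| + |[Y]| + |c|
  &= 2^k \bigl(|[X_{hi}]| + |[Y_{hi}]|\bigr)
     + \bigl(|[X_{lo}]| + |[Y_{lo}]| + |c|\bigr) \\
  &= 2^k \bigl(|[X_{hi}]| + |[Y_{hi}]|\bigr) + 2^k |c'| + |[S_{lo}]|
     && \text{(low adder correct)} \\
  &= 2^k \bigl(|[X_{hi}]| + |[Y_{hi}]| + |c'|\bigr) + |[S_{lo}]| \\
  &= 2^k \, |(c'', S_{hi})| + |[S_{lo}]|
     && \text{(high adder correct)} \\
  &= |(c'', S_{hi} S_{lo})| .
\end{align*}
```

Only the two correctness equations of the component adders are used, which is why the combination is indifferent to how each group is implemented.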
3.3.3. Fidelity
There are normally two ways to formalize the addends and sum of an adder in Coq: by a dependently typed vector, as in [6, 8] and this work, or by a nondependent list, as in [12]. Both [6] and [8] explain why dependent types are more appropriate for the verification of adders. Generally speaking, a nondependent list is more appropriate for formalizing a linked list, whose length is obtained by computation, while a dependent vector is more appropriate for formalizing an array, whose length is an inherent property. The functionality of adders is formalized by interactively defined (recursive) functions, which correspond clearly to the gate-level description of circuits.
4. Ling Adder
The Ling Adder (LA) was proposed in [17]. Instead of computing all the carries in advance as CLA does, LA computes all the pseudo carries, whose propagation has fewer fan-ins and fan-outs. With a proper grouping of the input addends, LA needs fewer levels of gates and consequently has better performance.
Similar to the propagated and generated carries, LA has a new complementing signal $k_i$ and a previous-stage propagate $T_i$, which are defined in (4) and (5), respectively:
$$k_i = a_i \wedge b_i, \tag{4}$$
$$T_i = a_i \vee b_i. \tag{5}$$
The pseudo carries are defined recursively. To our knowledge, [17] and other materials about LA define the pseudo carries without considering the case i=0, which this paper handles in (6b).
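The displayed recurrence for the pseudo carries does not survive in this text. Reading it off the Coq definition of H given later in this section, together with the convention that the bit shifted into T is cin, it presumably has the following form. This is a reconstruction under that assumption, not the authors' typeset equation.

```latex
\begin{align}
H_i &= k_i \vee (T_{i-1} \wedge H_{i-1}), & i &> 0, \tag{6a} \\
H_i &= k_i \vee c_{in},                   & i &= 0. \tag{6b}
\end{align}
```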
Without this case, the default values of $H_{-1}$ and $T_{-1}$ are both false, which is equivalent to our definition under the assumption that cin is always false. More intuitively, that algorithm does not consider the carry-in to the least significant bit, which restricts it to special applications such as the addition of two registers. We generalize it to provide the general functionality of an adder. The sum is defined similarly, taking into account the carry-in to the least significant bit:
$$s_i = (H_i \oplus T_i) \vee (k_i \wedge H_{i-1} \wedge T_{i-1}), \quad i > 0, \tag{7a}$$
$$s_i = (H_i \oplus T_i) \vee (k_i \wedge c_{in}), \quad i = 0. \tag{7b}$$
The abstract module of Ling extends the general one by adding signatures of k, T, and H.
Module Type LingAdder <: GenAdder.
  Parameter K: forall n, data n -> data n -> data n.
  Parameter T: forall n, data n -> data n -> data n.
  Parameter H: forall n, data (S n) -> data (S n) -> bit -> data (S n).
  Parameter adder: forall n, mbadder n.
  Parameter adder_correct: forall n, mbadder_correct (@adder n).
End LingAdder.
To compute the ith pseudo carry of H, the ith bit of K and the (i-1)th bit of T are needed. Therefore, the two parameters of H stand for vectors K and a left shift of T. The formalization of H assuming the correctness of the parameters is as follows:
Definition H n (X Y: data (S n)): data (S n).
  induction n as [|n rec].
  + exact ([(X ⊳) ∨ (Y ⊳)]).
  + set (recs := rec (X ⊲) (Y ⊲)).
    exact ((X ⊳) ∨ ((Y ⊳) ∧ (recs ⊳)) ⋈ recs).
Defined.
H is defined recursively. Line 3 deals with the case i=0. Lines 4 and 5 deal with the recursive case. recs is the last n bits of H obtained by recursion, and recs⊳ stands for $H_{n-1}$.
LA is defined according to (7a) and (7b) using the definition of H.
Definition adder n
(X Y: data (S n)) (cin: bit): hyb (S n).
set (KXY:= K X Y).
set (TXY:= T X Y).
set (Tshft:= shiftin cin TXY).
set (Hc:= H KXY (Tshft ⊲)).
set (Hcshft:= shiftin true Hc).
set (sum:= (TXY ⊕ Hc) ∨ (KXY ∧
(Hcshft ⊲) ∧ (Tshft ⊲))).
exact ((TXY ⊳) ∧ (Hc ⊳), sum).
Defined.
Since the ith bit of the sum depends on the (i-1)th bits of H and T, both are shifted, in lines 5 and 7. The reason why cin is shifted into T is explained above; true is shifted into H to ensure $T_{-1} \wedge H_{-1} = c_{in}$, where $H_{-1}$ and $T_{-1}$ are the bits shifted in and $T_{-1} = c_{in}$. The carry-out of LA is (TXY⊳)∧(Hc⊳), which is equivalent to cout as shown in
$$c_i = H_{i-1} \wedge T_{i-1}, \quad i \ge 0. \tag{8}$$
The formalization of (8) is complicated, but the proof is trivial by induction and case analysis. The correctness of LA follows by proving a lemma stating that the outputs of CLA and LA are the same with regard to arbitrary same inputs. This lemma is proved by induction with the result of (8).
Lemma LA_CLA_eq: forall n (X Y: data (S n)) c_in,
  LAdder.adder X Y c_in = CLAdder.adder X Y c_in.
Proof. induction n as [|n rec]. … Qed.
Theorem adder_correct: forall n (X Y: data (S n)) c,
  |[X]| + |[Y]| + |c| = |(LAdder.adder X Y c)|.
Proof. intros; rewrite LA_CLA_eq.
  apply CLAdder.adder_correct. Qed.
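The single-bit facts behind LA_CLA_eq can be checked on plain Booleans. The sketch below is an illustration with assumed names (k, T, H as bit-level functions, t_prev and h_prev for $T_{i-1}$ and $H_{i-1}$); it takes the recurrence $H_i = k_i \vee (T_{i-1} \wedge H_{i-1})$ from the Coq definition of H and assumes the incoming carry is $T_{i-1} \wedge H_{i-1}$, as in (8).

```coq
Require Import Bool.
Open Scope bool_scope.

(* Signals of one bit position; a and b are the addend bits. *)
Definition k (a b : bool) : bool := a && b.   (* equation (4) *)
Definition T (a b : bool) : bool := a || b.   (* equation (5) *)

(* Pseudo carry H_i computed from T_{i-1} and H_{i-1}. *)
Definition H (a b t_prev h_prev : bool) : bool :=
  k a b || (t_prev && h_prev).

(* If c_i = T_{i-1} /\ H_{i-1}, then T_i /\ H_i is the carry
   c_{i+1} of equation (1). *)
Lemma ling_carry_step : forall a b t_prev h_prev : bool,
  T a b && H a b t_prev h_prev
  = (a && b) || (xorb a b && (t_prev && h_prev)).
Proof. intros [] [] [] []; reflexivity. Qed.

(* Equation (7a) gives the sum bit of equation (2). *)
Lemma ling_sum_step : forall a b t_prev h_prev : bool,
  xorb (H a b t_prev h_prev) (T a b) || (k a b && h_prev && t_prev)
  = xorb (xorb a b) (t_prev && h_prev).
Proof. intros [] [] [] []; reflexivity. Qed.
```

The first lemma is the inductive step of (8) for one position, and the second shows that the Ling sum bit agrees with the CLA sum bit once the carries agree.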
Reference [18] proposed an extension of Ling’s adder by the following equations:
$$D_{j:k} = G_{j:k} + P_{j:k} = G_{j:k+1} + P_{j:k}, \tag{9}$$
$$B_{j:k} = g_j + g_{j-1} + \cdots + g_k, \tag{10}$$
$$G_{j:i} = D_{j:k} \wedge (B_{j:k} + G_{k-1:i}), \tag{11}$$
where $G_{i:j}$ and $P_{i:j}$ are the group generated and propagated carries, which are defined later in Section 5. Equation (11) is also proved in this work.
5. Parallel Prefix Adder
CLA improves RCA by computing all the carries in advance as shown in (3). However, computing every carry $c_i$ this way causes large fan-in and fan-out, especially when i is large. Parallel Prefix Adder (PPA) avoids this by divide-and-conquer, which provides an efficient way to compute all the carries in parallel. The basic definitions are as follows:
$$c_{i+1} = G_{i:j} \vee (P_{i:j} \wedge c_j) \quad (j \le i), \tag{12}$$
$$s_i = c_i \oplus P_i, \tag{13}$$
$$P_{i:j} = \begin{cases} P_i & i = j, \\ P_i \wedge P_{i-1:j} & i > j, \end{cases} \tag{14}$$
$$G_{i:j} = \begin{cases} G_i & i = j, \\ G_i \vee (P_i \wedge G_{i-1:j}) & i > j. \end{cases} \tag{15}$$
Because (14) and (15) are so similar, only the formalization of (15) is shown. An auxiliary function, defined recursively on the difference of i and j, has to be introduced in order to define it in Coq.
Definition GpG_rec n (gp gg: data (S n)) (d i: nat): bit.
  revert i; induction d as [|d rec]; intros i.
  + exact (nth (n-i) gg).
  + exact ((nth (n-i) gg) ∨
      ((nth (n-i) gp) ∧ (rec (pred i)))).
Defined.
Definition GpG n (X Y: data (S n)) i j :=
  GpG_rec X Y (i-j) i.
In line 1, the parameters gp and gg stand for the propagated and generated carry vectors. Another parameter, d, is the difference of i and j. The function nth k v returns the kth element of v, counting from the leftmost bit, which has index 0. pred n computes the predecessor of n.
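The recursion on the difference d can be tried out on a simplified model. The sketch below is an assumption-laden illustration: the bit vectors are replaced by functions from bit index to Boolean, so the index arithmetic n-i of the vector version disappears, and group_g is a name introduced here.

```coq
Require Import Bool.
Open Scope bool_scope.

(* group_g g p d i computes G_{i:(i-d)} following (15):
   recursion is on the difference d, and the index moves to pred i. *)
Fixpoint group_g (g p : nat -> bool) (d i : nat) : bool :=
  match d with
  | 0 => g i
  | S d' => g i || (p i && group_g g p d' (pred i))
  end.

(* G_{2:0} unfolds to the nested form given by (15). *)
Example group_g_2_0 : forall g p : nat -> bool,
  group_g g p 2 2 = g 2 || (p 2 && (g 1 || (p 1 && g 0))).
Proof. intros g p. reflexivity. Qed.
```

As in GpG, the group $G_{i:j}$ is obtained by calling the auxiliary function with difference i-j and starting index i.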
To compute all the carries in parallel in advance, the carry $c_{i+1}$ should not depend on any $c_k$ with $i \ge k > 0$, but only on $c_0$, which is the overall carry-in. Therefore, the carries of PPA are computed according to a variation of (12) as follows:
$$c_{i+1} = G_{i:0} \vee (P_{i:0} \wedge c_0), \tag{16}$$
and different PPAs employ different parallel prefix methods to compute the group carries $G_{i:0}$ and $P_{i:0}$, for all $n \ge i \ge 0$, for the sake of high performance. To capture various PPAs in a uniform framework, an abstract module, which abstractly describes this method as groups, is employed as follows:
Module Type GroupCarries.
Parameter groups: forall n, data2
(S n) -> data2 (S n).
Axiom groups_correct: forall
n (X Y: data (S n)),
groups (P X Y, G X Y) =
correct_groups (P X Y, G X Y).
End GroupCarries.
data2 n, in line 3, is the dependent type of a tuple of vectors whose lengths are both n. Therefore, the parameter of groups stands for the vectors of propagated and generated carries, as shown in line 6. groups_correct in line 4 is the assumption that the groups function is correct. The correctness is expressed as an extensional equality between groups and another function that is known to be correct. In line 7, correct_groups is the correct function that computes the group carries according to (14) and (15). Its correctness is established by first computing all the carries, correct_carries, according to this function and then proving that correct_carries are equal to the carries of CLA.
Definition correct_carries (n: nat)
(c_in: bool)
(X Y: data (S n)): hyb (S n).
set (PXY:= P X Y).
set (GXY:= G X Y).
set (bvGp:= correct_groups (PXY, GXY)).
set (all_c:= shift_map c_in bvGp).
exact (all_c ⊳, all_c ⊲).
Defined.
Lemma carries_correct: forall n
(X Y: data (S n)) c_in,
correct_carries c_in X Y =
CLAdder.carries (P X Y) (G X Y) c_in.
shift_map, in line 7, is a composite operation that first applies Equation (16) to all the $G_{i:0}$ and $P_{i:0}$, which are stored in the vectors given by the first and second projections of bvGp, and then shifts in the overall carry-in $c_0$ to obtain all the carries. Since the computation of $G_{i:j}$ depends on the group propagated carries $P_{i:m}$ of its subgroups, the fundamental carry operator “∘” of [19] is used to compute the group propagated and generated carries simultaneously in function correct_groups, and it should be used in all implementations of function groups:
$$(P, G) \circ (P', G') = (P \wedge P', G \vee (P \wedge G')). \tag{17}$$
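Two algebraic properties of the operator in (17) are what make regrouping possible, and both can be checked exhaustively on Boolean pairs. The sketch below uses an assumed name, op, for “∘” on (P, G) pairs; the neutral pair (true, false) corresponds to the values that are shifted in by the Kogge-Stone construction later in this section.

```coq
Require Import Bool.
Open Scope bool_scope.

(* Fundamental carry operator of equation (17) on (P, G) pairs. *)
Definition op (a b : bool * bool) : bool * bool :=
  (fst a && fst b, snd a || (fst a && snd b)).

(* Associativity: groups may be combined in any bracketing. *)
Lemma op_assoc : forall a b c : bool * bool,
  op (op a b) c = op a (op b c).
Proof. intros [[|] [|]] [[|] [|]] [[|] [|]]; reflexivity. Qed.

(* (true, false) is a right neutral element. *)
Lemma op_neutral : forall a : bool * bool, op a (true, false) = a.
Proof. intros [[|] [|]]; reflexivity. Qed.
```

Associativity is the property that lets different prefix networks compute the same group carries with different tree shapes.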
Function correct_groups can be taken as an instance of groups function and is only one particular implementation of groups, which is verified. There are many other implementations of the groups function based on the following lemmas which are proved by induction on the difference between i and m, using (14) and (15):
$$P_{i:j} = P_{i:m} \wedge P_{m-1:j}, \qquad G_{i:j} = G_{i:m} \vee (P_{i:m} \wedge G_{m-1:j}) \qquad (j < m \le i). \tag{18}$$
Equation (18) can be rewritten as a single equation using the ∘ operator. For all $j < m \le i$,
$$(P_{i:j}, G_{i:j}) = (P_{i:m}, G_{i:m}) \circ (P_{m-1:j}, G_{m-1:j}). \tag{19}$$
Equation (19) shows clearly that the carries of any group can be computed from adjacent (or even overlapping) subgroups. A proper division of the bits of the input addends can therefore implement the groups function with high performance. PPA is such a family of adders, differing only in the computation of the groups function; thus, a general PPA can be formalized and parameterized by module GroupCarries.
Module PPAdder (Import M: GroupCarries)
<: GenAdder.
Definition adder n (X Y: data (S n))
(c_in: bit): (hyb (S n)).
set (GT0:= groups ((P X Y), (G X Y))).
set (all_carries:= shift_map c_in (GT01)
(GT02)).
set (sum:= PC ⊕ (all_carries ⊲)).
exact (all_carries ⊳, sum).
Defined.
Theorem adder_correct: forall n
(X Y: data (S n)) c_in,
|[X]|+ |[Y]|+ |c_in|= |(adder X Y c_in)|.
Proof.
intros n X Y c_in; unfold adder.
rewrite CLAdder.adder_correct.
unfold CLAdder.adder.
rewrite <- carries_correct.
unfold correct_carries.
rewrite groups_correct; trivial.
Qed.
End PPAdder.
Line 5 uses the abstract function groups from the parameterized module GroupCarries to compute all the group carries in advance. The shift_map function in line 6 implements the operation in (16). Lines 6 and 8 compute all the carries and the sum.
The correctness of PPA is proved from the assumption groups_correct in the abstract parameterized module GroupCarries. Line 16 rewrites the left-hand side of the equation into the result that CLA computes. Line 18 uses the result that the carries of CLA are identical to the carries computed by (14), (15), and (16). Finally, the assumption groups_correct is used to prove that groups computes the same result as (14) and (15) do.
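The fact that the rippled carries coincide with the carries computed by (14), (15), and (16) can be proved on a simplified model. The following sketch is not the development's proof of carries_correct: vectors are replaced by functions from bit index to Boolean, and the names carry, G0, P0, and prefix_step are assumptions introduced here.

```coq
Require Import Bool.
Open Scope bool_scope.

(* Ripple carries, equation (1): carry i is the carry into bit i. *)
Fixpoint carry (g p : nat -> bool) (c0 : bool) (i : nat) : bool :=
  match i with
  | 0 => c0
  | S j => g j || (p j && carry g p c0 j)
  end.

(* G_{i:0} and P_{i:0}, equations (15) and (14) with j = 0. *)
Fixpoint G0 (g p : nat -> bool) (i : nat) : bool :=
  match i with
  | 0 => g 0
  | S j => g (S j) || (p (S j) && G0 g p j)
  end.

Fixpoint P0 (p : nat -> bool) (i : nat) : bool :=
  match i with
  | 0 => p 0
  | S j => p (S j) && P0 p j
  end.

(* Boolean distributivity needed in the inductive step. *)
Lemma prefix_step : forall gi pi G P c : bool,
  gi || (pi && (G || (P && c))) = gi || (pi && G) || (pi && P && c).
Proof. intros [] [] [] [] []; reflexivity. Qed.

(* Equation (16): every carry depends only on group signals and c0. *)
Lemma carry_prefix : forall g p c0 i,
  carry g p c0 (S i) = G0 g p i || (P0 p i && c0).
Proof.
  intros g p c0 i. induction i as [|i IH].
  - reflexivity.
  - change (carry g p c0 (S (S i)))
      with (g (S i) || (p (S i) && carry g p c0 (S i))).
    rewrite IH.
    change (G0 g p (S i)) with (g (S i) || (p (S i) && G0 g p i)).
    change (P0 p (S i)) with (p (S i) && P0 p i).
    apply prefix_step.
Qed.
```

The induction is on the bit index, which is the usual shape for adder proofs; the Kogge-Stone proof discussed below cannot follow this shape because its recursion is on the stage number.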
The rest of this section shows, by the example of the Kogge-Stone addition algorithm, how this general PPA applies to a specific one. The algorithm is formalized following [20], in which it was proposed. Other implementations of PPA can be formalized and verified similarly.
Module Kogge_Stone_Group_Carry <: GroupCarries.
  Fixpoint KS_PG_rec (n m: nat)
      (bvPG: data2 (S n)): (data2 (S n)) :=
    match m with
    | 0 => bvPG
    | S m' => let recur := (KS_PG_rec m' bvPG)
      in
      data2_op1 recur (shiftin_group
        (power2 m') recur)
    end.
Definition groups n (PG: data2 (S n)):=
KS_PG_rec (S (log2 (S n))) PG.
Theorem groups_correct: forall n
(X Y: data (S n)),
groups (P X Y, G X Y) = correct_groups
(P X Y, G X Y).
End Kogge_Stone_Group_Carry.
Lines 2 to 10 give the main function of the Kogge-Stone implementation of the groups function. m is a simple counter indicating how many stages are needed, which should be the logarithm of the data-width. Initially, the input bvPG stands for the two vectors of propagated and generated carries, that is, $P_i = P_{i:i}$ and $G_i = G_{i:i}$, for all $n \ge i \ge 0$. At any stage m, this function computes the group carries of maximum length $2^m$. An 8-bit Kogge-Stone adder is taken as an example to illustrate this procedure. At stage 2 ($m' = 2$), the group carries of maximum length 4 have been computed according to line 8:
$$\mathit{recur} := ([P_{7:4}, P_{6:3}, P_{5:2}, P_{4:1}, P_{3:0}, P_{2:0}, P_{1:0}, P_0], [G_{7:4}, G_{6:3}, G_{5:2}, G_{4:1}, G_{3:0}, G_{2:0}, G_{1:0}, G_0]). \tag{20}$$
At the next stage ($m = 3$), as in lines 8 and 9, the shiftin_group function first shifts both vectors in the tuple simultaneously by $2^{m'}$ positions, shifting in true and false, respectively. The result is denoted by $\mathit{recur}'$:
$$\mathit{recur}' := ([P_{3:0}, P_{2:0}, P_{1:0}, P_0, \mathrm{true}, \mathrm{true}, \mathrm{true}, \mathrm{true}], [G_{3:0}, G_{2:0}, G_{1:0}, G_0, \mathrm{false}, \mathrm{false}, \mathrm{false}, \mathrm{false}]). \tag{21}$$
Secondly, data2_op1 applies the fundamental operator in (17) to the two operands $\mathit{recur}$ and $\mathit{recur}'$, and the result is
$$([P_{7:0}, P_{6:0}, P_{5:0}, P_{4:0}, P_{3:0}, P_{2:0}, P_{1:0}, P_0], [G_{7:0}, G_{6:0}, G_{5:0}, G_{4:0}, G_{3:0}, G_{2:0}, G_{1:0}, G_0]). \tag{22}$$
In line 11, groups function specifies that the stages needed are log2(n+1), where n is the data-width.
The correctness theorem cannot be proved by induction on the data-width as usual, because the Kogge-Stone implementation of the groups function recurses on the stages, not on the data-width, as shown in the definition of KS_PG_rec.
Noticing that, in the theorem, the result of each side of equation is a tuple of vectors, the equality holds if and only if the corresponding elements are identical pairwise.
Lemma data_eq_nth_eq_data2: forall n
(gx gy: (data2 (S n))),
(forall k, k < S n ->
(nth k (fst gx), nth k (snd gx)) =
(nth k (fst gy), nth k (snd gy))) <->
gx = gy.
Since the result of KS_PG_rec changes with the stage m, the first thing to prove is an invariant in m stating how this function gradually approaches the result of the correct_groups function as the stages increase. Suppose, without loss of generality, that
$$\mathit{correct\_groups}(X, Y) := ([P_{n:0}, \ldots, P_0], [G_{n:0}, \ldots, G_0]), \qquad \mathit{KS\_PG\_rec}(m, Z) := ([P^m_{n:0}, \ldots, P^m_0], [G^m_{n:0}, \ldots, G^m_0]), \tag{23}$$
for all m, $Z = (X, Y)$, and $n > 0$; then, for all $0 \le k \le n$,
$$(P^m_{k:0}, G^m_{k:0}) = \begin{cases} (P_{k:0}, G_{k:0}) & k < 2^m, \\ (P_{k:(k+1-2^m)}, G_{k:(k+1-2^m)}) & k \ge 2^m. \end{cases} \tag{24}$$
With this invariant, the existence of fixed points can be proved next, and the least fixed point should be $\log_2(n+1)$. For all $m \ge \log_2(n+1)$ and $k \le n < 2^m$, $(P^m_{k:0}, G^m_{k:0}) = (P^{m+1}_{k:0}, G^{m+1}_{k:0}) = (P_{k:0}, G_{k:0})$. Function groups, which iterates the KS_PG_rec function for $\log_2(n+1)$ stages, therefore computes the same result as function correct_groups, which is the correctness theorem. The whole proof of this theorem has been carried out in Coq, although it is presented here in an intuitive way for the reader's convenience.
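The stage-wise recursion and its agreement with the reference prefix can be tried on a small executable model. The sketch below is an illustration only: vectors are replaced by functions from bit index to (P, G) pair, the names op, prefix, ks_stage, ks, xs, ys, and pg are assumptions introduced here, and the operands are arbitrary hypothetical 8-bit values. It checks one input, not the general theorem.

```coq
Require Import Bool List.
Import ListNotations.
Open Scope bool_scope.

(* Fundamental carry operator of equation (17). *)
Definition op (a b : bool * bool) : bool * bool :=
  (fst a && fst b, snd a || (fst a && snd b)).

(* Reference prefix (P_{k:0}, G_{k:0}) following (14) and (15). *)
Fixpoint prefix (v : nat -> bool * bool) (k : nat) : bool * bool :=
  match k with
  | 0 => v 0
  | S j => op (v (S j)) (prefix v j)
  end.

(* One Kogge-Stone stage with distance d. Positions below d keep
   their value, which is the effect of combining with the shifted-in
   neutral pair (true, false). *)
Definition ks_stage (d : nat) (v : nat -> bool * bool) (k : nat)
  : bool * bool :=
  if Nat.ltb k d then v k else op (v k) (v (k - d)).

(* m stages, with distances 1, 2, 4, ... *)
Fixpoint ks (m : nat) (v : nat -> bool * bool) : nat -> bool * bool :=
  match m with
  | 0 => v
  | S m' => ks_stage (Nat.pow 2 m') (ks m' v)
  end.

(* Hypothetical 8-bit operands, least significant bit first. *)
Definition xs := [true; false; true; true; false; true; false; true].
Definition ys := [true; true; false; true; true; false; false; true].

(* Per-bit (P_k, G_k) of the operands. *)
Definition pg (k : nat) : bool * bool :=
  (xorb (nth k xs false) (nth k ys false),
   nth k xs false && nth k ys false).

(* After 3 stages, all 8 positions hold the full prefix. *)
Example ks_three_stages :
  map (ks 3 pg) (seq 0 8) = map (prefix pg) (seq 0 8).
Proof. reflexivity. Qed.
```

Replacing 3 by 2 in the example makes positions 4 to 7 hold only groups of length 4, which is the situation described by the second case of invariant (24).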
The Kogge-Stone adder is obtained by combining the general PPA module with this specific Kogge-Stone module for computing all the group carries, which provides not only the computation method but also the correctness proof.
Module Kogge_Stone <: GenAdder :=
  PPAdder Kogge_Stone_Group_Carry.
6. Conclusion and Future Work
In this work, we proposed a holistic methodology to formalize and verify primary adders (RCA, CLA, LA, and PPA) in the Coq theorem prover. They are formalized using dependent types, higher-order recursion, and the module system in order to provide fidelity, scalability, and modularization.
In particular, PPA is a family of adders sharing the same structure, only differing in the methods of parallel prefix computing. We provide a novel way to describe the general PPA and show how to use this general module to verify a specific PPA, exemplified by Kogge-Stone adder.
Other advanced arithmetic designs can be verified by reusing the formalizations and verifications of this work in a compositional way, as we describe by the example of carry select adders.
All the work has been carried out in Coq. The whole development contains around 2,000 lines of Coq scripts. This is only about one third of the size of [12], another work dedicated to verifying addition designs in Coq. This work thus uses fewer lines of script but verifies more addition designs than [12].
This work can be continued in two directions. Advanced arithmetic designs, such as those of IBM POWER6, can be verified incrementally on the basis of these verified adders. Since the constructive formalization corresponds clearly to gate-level descriptions of circuits, HDL code can be generated from the verified designs, which may provide an alternative way of designing correct arithmetic implementations.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work has been supported by the National Science Foundation of China Grant 61272002, the Tsinghua National Laboratory for Information Science and Technology (TNList) Cross-discipline Foundation 2011-9, the Major Research plan of the National Natural Science Foundation of China Grant 91218302, and the National Basic Research Program of China (973 Program) Grant 2010CB328003.
References
[1] Hunt W. A. Jr., Brock B. C. The verification of a bit-slice ALU. In: Leeser M., Brown G. (eds.). Lecture Notes in Computer Science, vol. 408. New York, NY, USA: Springer; 1990. pp. 282-306. doi:10.1007/0-387-97226-9_34.
[2] Borrione D., Pierre L., Salem A. Formal verification of VHDL descriptions in the prevail environment. 1992; 9(2): 42-56. doi:10.1109/54.143145.
[3] Camilleri A., Gordon M., Melham T. Cambridge, UK: Computer Laboratory, University of Cambridge; 1986.
[4] Curzon P. Experiences formally verifying a network component. Proceedings of the 9th Annual Conference on Computer Assurance (COMPASS '94): Safety, Reliability, Fault Tolerance, Concurrency and Real Time, Security; July 1994; Gaithersburg, Md, USA. pp. 183-193. Scopus 2-s2.0-0028576332. doi:10.1109/CMPASS.1994.318453.
[5] Paulin-Mohring C. Circuits as streams in Coq: verification of a sequential multiplier. In: Berardi S., Coppo M. (eds.). Lecture Notes in Computer Science, vol. 1158. Berlin, Germany: Springer; 1996. pp. 216-230. doi:10.1007/3-540-61780-9_72. MR1474541.
[6] Coupet-Grimal S., Jakubiec L. Certifying circuits in type theory. 2004; 16(4): 352-373. Scopus 2-s2.0-10644265227. doi:10.1007/s00165-004-0048-3.
[7] The Coq Development Team. The Coq proof assistant, reference manual, version 8.4. Rocquencourt, France: INRIA; 2012.
[8] Braibant T. Coquet: a coq library for verifying hardware. In: Jouannaud J.-P., Shao Z. (eds.). Lecture Notes in Computer Science, vol. 7086. Berlin, Germany: Springer; 2011. pp. 330-345. doi:10.1007/978-3-642-25379-9_24.
[9] Milne G. J. New York, NY, USA: McGraw-Hill; 1993.
[10] O'Donnell J. T., Rünger G. Functional pearl derivation of a logarithmic time carry lookahead addition circuit. 2004; 14(6): 697-713. Scopus 2-s2.0-9744263271. doi:10.1017/S0956796804005180.
[11] Liu F., Tan Q., Chen G. Formal proof of prefix adders. 2010; 52(1-2): 191-199. doi:10.1016/j.mcm.2010.02.008. MR2645930.
[12] Chen G. Formalization of a parameterized parallel adder within the Coq theorem prover. 2010; 29(1): 149-153. Scopus 2-s2.0-73249114703. doi:10.1109/TCAD.2009.2034346.
[13] Kapur D., Subramaniam M. Mechanical verification of adder circuits using rewrite rule laboratory. 1998; 13(2): 127-158. Scopus 2-s2.0-0032165446. doi:10.1023/A:1008610818519.
[14] Hinze R. An algebra of scans. In: Kozen D. (ed.). Lecture Notes in Computer Science, vol. 3125. Berlin, Germany: Springer; 2004. pp. 186-210. doi:10.1007/978-3-540-27764-4_11. MR2163419.
[15] Barras B., Jouannaud J. P., Strub P. Y., Wang Q. CoQMTU: a higher-order type theory with a predicative hierarchy of universes parametrized by a decidable first-order theory. Proceedings of the 26th Annual IEEE Symposium on Logic in Computer Science (LICS '11); 2011; Ontario, Canada. pp. 143-151. doi:10.1109/LICS.2011.37.
[16] Wang Q., Barras B. Semantics of intensional type theory extended with decidable equational theories. In: Rocca S. R. D. (ed.). Leibniz International Proceedings in Informatics (LIPIcs), vol. 23. Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik; 2013. pp. 653-667.
[17] Ling H. High-speed binary adder. 1981; 25(3): 156-166. doi:10.1147/rd.252.0156.
[18] Jackson R., Talwar S. High speed binary addition. Proceedings of the Conference Record of the 38th Asilomar Conference on Signals, Systems and Computers, vol. 2; November 2004; Asilomar, Calif, USA. pp. 1350-1353. Scopus 2-s2.0-21644432589.
[19] Koren I. Hyderabad, India: Universities Press; 2002.
[20] Kogge P. M., Stone H. S. A parallel algorithm for the efficient solution of a general class of recurrence equations. 1973; 22(8): 786-793. MR0395307. doi:10.1109/TC.1973.5009159.