Self Checking Design Technique for Numerical Computations

The objective of this paper is to develop an efficient method for testing numerical computations, based on algebraic concepts such as the transcendental degree of field extensions. A class of polynomially checkable (PC) functions is introduced, and a new method for error detection and correction is proposed for the computation of functions from this class. This class of functions is shown to be large. The proposed method can also be extended to testing computations of functions which are not polynomially checkable. Preliminary results show the great potential of this approach. In particular, the proposed approach will lead to a substantial reduction in the hardware overhead required for multiple error detection and correction, as compared to the checksum method and other existing techniques.


INTRODUCTION
Computation of numerical functions is a very common and widely used type of computing procedure. Testing of numerical computations is therefore an important problem, which up to now has not been satisfactorily solved. By testing we mean here detection and, if possible, correction of errors resulting from different causes: software faults (bugs in the program), hardware design faults, faults caused by technological imperfections, random failures, etc.
As far as we are aware, there are four different approaches to error detection/correction in numerical computations and computer memories. The first of them treats all the data obtained by computation or stored in the memory as independent. Error control capability can then be achieved only by introducing substantial space and/or time redundancy. This redundancy grows linearly with the size of the data array (for a fixed error density).
Typical methods of this kind are:
1) Replication with voting. This method, though the simplest to implement, requires extremely large redundancy [1].
2) Methods based on error-correcting codes (Hamming codes, the Chinese remainder method, checksum tests) (e.g. [2][3][4]). For the method developed in [2], a typical example requires a storage-space overhead of 40%.
The second approach makes use of the specific properties of the function to be computed. Since this approach exploits the "hidden redundancy" of the values of the function itself, it requires, in principle, much smaller redundancy. The most advanced of the previously known methods of this kind is linear checks for polynomials [5][6][7]. It is based on the properties of linear error-correcting codes. However, this method is limited to polynomial functions only, and is difficult to implement on-line.
F.S. VAINSTEIN
3) Methods based on probabilistic concepts [10][11]. They deal primarily with faults in software systems.
4) Interval arithmetic methods [12][13][14]. They deal primarily with the errors introduced by rounding in computations. This approach usually takes about as much computation as evaluating the function twice, and it requires unacceptably high hardware redundancy (about 200%).
The proposed method, which was first published in [9], is different from those mentioned above. It is based on algebraic concepts such as the transcendental degree of a field extension, and employs the specific structure of the function to be computed. This method has important advantages, especially for the case of random errors: it requires small hardware redundancy (typically less than 5%), provides good fault coverage, and has very good fault location capability. It can be applied to check the computation of a very broad class of functions.
It should be taken into consideration that, in practice, computations are done with a certain level of accuracy. Hence formula (2) should be replaced by the formula
$$|A_j + B_j + C_j| < \delta, \tag{2'}$$
where $\delta$ is a small positive number specified by the precision of the computation.

Hardware Implementation
A general block diagram for the implementation of the proposed technique for our example is shown in Fig. 1 [8].
The numbers $\sqrt{2}$ and $\sqrt{3}$ are algebraically dependent over $\mathbb{Q}$, since $P(\sqrt{2}, \sqrt{3}) = 0$ for $P(T_1, T_2) = T_1^2 + T_2^2 - 5$. The numbers 1, $\pi \in \mathbb{R}$ are algebraically independent over $\mathbb{Q}$.
DEFINITION 2 Let $K \subset L$ be a field extension. The transcendental degree (Tr. deg.) of this field extension is, by definition, the maximum possible number of elements of $L$ that are algebraically independent over $K$.
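As a small numerical illustration of algebraic dependence, the sketch below evaluates a polynomial with rational coefficients at a pair of real numbers and compares the result with zero up to rounding. It assumes the pair is $\sqrt{2}$, $\sqrt{3}$ and the polynomial is $T_1^2 + T_2^2 - 5$; the function name `P` and the tolerance are illustrative choices, and a small residual is only a consistency check, not a proof of dependence.

```python
import math

def P(t1, t2):
    """Polynomial with rational coefficients: T1^2 + T2^2 - 5."""
    return t1 * t1 + t2 * t2 - 5.0

# In exact arithmetic P(sqrt(2), sqrt(3)) = 0; in floating point the
# residual is only close to zero, so it is compared with a tolerance.
residual = P(math.sqrt(2.0), math.sqrt(3.0))
dependent = abs(residual) < 1e-12  # tolerance is an illustrative assumption
```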
If the Tr. deg. of $K \subset L$ is equal to $n$ and $m > n$, then any subset $\{a_1, \ldots, a_m\} \subset L$ is algebraically dependent.
The first class of faults that cannot be detected by using (5') (we can call them software faults) results from the fact that some other PC function $g(x) \neq f(x)$ can have the same checking polynomial. For instance, if $g(x) = f(x + b)$, where $b$ is a constant, then $g(x)$ and $f(x)$ have the same checking polynomial. Preliminary results, however, show that the class of functions having the same checking polynomial is not large, which makes it possible to combat software faults. We are going to address this issue in further publications.
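The effect of two functions sharing a checking polynomial can be seen in a minimal sketch. It assumes, purely for illustration, the function $f(x) = e^x$ with one shift $a_1 = 0.25$ and the linear polynomial $P(T_0, T_1) = T_1 - e^{a_1} T_0$; the names `residual`, `passes`, `g`, `h` and the tolerance are hypothetical and not taken from the paper. The shifted function $g(x) = f(x + b)$ passes the same check, while an unrelated function does not.

```python
import math

A = 0.25  # shift a_1, an illustrative value

def residual(func, x):
    """Evaluate P(T0, T1) = T1 - e^A * T0 at T0 = func(x), T1 = func(x + A)."""
    return func(x + A) - math.exp(A) * func(x)

def passes(func, xs, delta=1e-12):
    """True if the check |P| < delta holds at every sample point."""
    return all(abs(residual(func, x)) < delta for x in xs)

def f(x):
    return math.exp(x)

def g(x):
    return math.exp(x + 0.7)   # g(x) = f(x + b) with b = 0.7

def h(x):
    return math.exp(2.0 * x)   # a different function, for contrast

xs = [-1.0, 0.0, 0.5, 1.0]
same_for_f = passes(f, xs)
same_for_g = passes(g, xs)     # undetectable substitution of g for f
same_for_h = passes(h, xs)     # detected: h violates the check
```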
The second class of faults which cannot be detected by using (5') [...]
2) For every $a \in \mathbb{R}$, $f(x + a) \in \mathbb{R}(x, e^x, \sin x)$. Hence $f_0, \ldots, f_3 \in \mathbb{R}(x, e^x, \sin x)$. But the Tr. deg. of $\mathbb{R} \subset \mathbb{R}(x, \sin x, e^x)$ equals 3, and therefore $f_0, \ldots, f_3$ are algebraically dependent.

Corollary
Let $f$ be the result of applying a finite number of additions, subtractions, multiplications, divisions and raisings to a rational power to the following functions: Const, $e^x$, $\sin(r_i x + b_i)$, $\cos(r_j x + b_j)$, where $r_i$, $r_j$ are rational numbers. Then $f$ is a PC function with $k \le 3$.
The Tr. deg. of the extension $\mathbb{R} \subset \mathbb{R}(\sin x)$ equals 1; therefore $f(x)$ is a PC function with $k = 1$.
Note. Theorem 2 states that the class of PC functions is very large. We have to note, however, that a number of commonly used functions, such as $\ln(x)$, $\sin^{-1}(x)$ and $\cos^{-1}(x)$, are not PC functions.
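For $f(x) = \sin x$ the case $k = 1$ can be made concrete. The sketch below assumes the shift $a_1 = \pi/2$, so that $f(x + a_1) = \cos x$ and $P(T_0, T_1) = T_0^2 + T_1^2 - 1$ vanishes on correct values; this particular polynomial, the tolerance `delta`, and the names `check_sin` and `faulty_sin` are illustrative assumptions rather than the paper's own example.

```python
import math

def check_sin(compute, x, delta=1e-9):
    """Return (value, ok): value = compute(x), ok = outcome of the check
    |P(T0, T1)| < delta with P = T0^2 + T1^2 - 1 and shift pi/2."""
    t0 = compute(x)
    t1 = compute(x + math.pi / 2)
    return t0, abs(t0 * t0 + t1 * t1 - 1.0) < delta

def faulty_sin(x):
    """Hypothetical faulty unit: an error of 2**-10 is added for x > 1."""
    return math.sin(x) + (2.0 ** -10 if x > 1.0 else 0.0)

value, ok = check_sin(math.sin, 0.3)       # fault-free computation
_, ok_faulty = check_sin(faulty_sin, 1.2)  # injected error is flagged
```

The check treats the unit under test as a black box: only its outputs at $x$ and $x + a_1$ are used.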

How to Find a Checking Polynomial
Suppose we are given a function $f: \mathbb{R} \to \mathbb{R}$. How can we find a checking polynomial $P(T_0, \ldots, T_k)$ for it? Indeed, $P(f(x_i), f(x_i + a_1), \ldots, f(x_i + a_k)) = 0$ for every $x_i \in \mathbb{R}$. Hence, for every $x_i \in \mathbb{R}$, this equality can be considered as a linear equation in the unknown coefficients of the polynomial $P(T_0, \ldots, T_k)$.
We can form a sufficient number of linear equations by choosing different $x_i \in \mathbb{R}$. The coefficients of the checking polynomial are then found as the solution of this system.
For b = 32, the portion of undetected faults is less than or equal to 6.6.10-1.
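The procedure of forming linear equations at sample points can be sketched for a small case. The example assumes $f(x) = \sin x$, shifts $a$ and $2a$, and a polynomial of the form $c_0 T_0 + c_1 T_1 + T_2$ (the coefficient of $T_2$ is normalized to 1, which is an assumption of this sketch). Two sample points then give a $2 \times 2$ system, solved here by Cramer's rule; the helper name `fit_linear_check` is hypothetical.

```python
import math

def fit_linear_check(f, a, x1, x2):
    """Solve c0*f(x) + c1*f(x+a) + f(x+2a) = 0 at x = x1, x2 for (c0, c1)."""
    a11, a12, b1 = f(x1), f(x1 + a), -f(x1 + 2 * a)
    a21, a22, b2 = f(x2), f(x2 + a), -f(x2 + 2 * a)
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("singular system: choose other sample points")
    c0 = (b1 * a22 - a12 * b2) / det
    c1 = (a11 * b2 - a21 * b1) / det
    return c0, c1

a = 0.4
c0, c1 = fit_linear_check(math.sin, a, 0.1, 1.3)

# Validate the fitted polynomial at points not used for fitting.
worst = max(abs(c0 * math.sin(x) + c1 * math.sin(x + a) + math.sin(x + 2 * a))
            for x in (-2.0, 0.7, 3.1, 5.5))
```

For this choice the fitted coefficients agree with the identity $\sin x - 2\cos a \, \sin(x + a) + \sin(x + 2a) = 0$. A polynomial of higher degree would need one unknown per monomial and correspondingly more sample points.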

HARDWARE IMPLEMENTATION OF THE CHECKING POLYNOMIAL TECHNIQUE
The theorem is proved.

APPLICATIONS
The proposed method of error detection/correction can be effectively used for on-line computation of numerical functions. It can also be used for off-line acceptance testing of programs in the course of development, or of devices in the course of manufacturing. For example, this approach can be used for testing a ROM containing the value $f(x)$ in the cell whose address is $x$. In the case of testing a RAM, we first have to choose $f(x)$ and then write into every cell the value $f(x)$ corresponding to its address $x$. After this we scan out the memory, verify the checks (according to inequality (5')) and detect or locate errors by analyzing the results of these checks. This approach is applicable to stuck-at faults in ROM and RAM, stuck-at faults at the outputs of an address decoder, bridging faults between output lines of the decoder, and faults that affect the power supply or read/write circuits.
An important feature of the method is that the required hardware and software overhead is limited to that necessary for computing the values of the checking polynomial. This computation can be performed by the same hardware system without any overhead. The method looks especially advantageous when the function has to be computed or stored for a large number of argument values.
In the case of on-line testing, the time redundancy required for error detection is $(100/k) \cdot (t_1/t_2)\,\%$, where $t_1$ and $t_2$ are the time intervals required for computing a single value of the checking polynomial and of the function, respectively, while error location needs $100 \cdot (t_1/t_2)\,\%$ time overhead. Moreover, when an error is located, it can be corrected by computing the root of the polynomial in one variable obtained from the checking polynomial by substituting the computed values for $T_i$ for those values of $i$ where the computation is correct. This technique is especially simple and convenient when the checking polynomial is linear with respect to every variable $T_i$, in particular when the polynomial is of degree 1 (see section 1.2).
The checking polynomial approach is a high-level functional technique which does not depend on the implementation of the program or the device computing the function $f(x)$.
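The memory test described in this section (write $f(x)$ to address $x$, scan, verify the checks, locate and correct) can be sketched as follows. The sketch assumes $f(x) = \sin(xh)$ with step $h$, the linear check $M[x] - 2\cos(h)\,M[x+1] + M[x+2] = 0$, a single faulty cell, and a table of at least five cells; the function names and the tolerance are hypothetical choices made for illustration.

```python
import math

def build_table(n, h):
    """Memory image: cell x holds f(x) = sin(x * h)."""
    return [math.sin(x * h) for x in range(n)]

def failing_checks(mem, h, delta):
    """Indices x whose check over cells x, x+1, x+2 violates |P| < delta."""
    c = 2.0 * math.cos(h)
    return {x for x in range(len(mem) - 2)
            if abs(mem[x] - c * mem[x + 1] + mem[x + 2]) >= delta}

def locate(n, failing):
    """Single-fault location: find the cell j such that the checks
    containing j are exactly the failing ones (None if no such cell)."""
    for j in range(n):
        involved = set(range(max(0, j - 2), min(j, n - 3) + 1))
        if involved == failing:
            return j
    return None

def correct(mem, j, h):
    """Root of the linear check in the unknown cell j, using two
    neighbouring cells that are assumed correct (single-fault assumption)."""
    c = 2.0 * math.cos(h)
    if j <= len(mem) - 3:
        return c * mem[j + 1] - mem[j + 2]
    return c * mem[j - 1] - mem[j - 2]

h, delta = 0.1, 1e-9
mem = build_table(16, h)
reference = list(mem)
mem[7] += 0.5                      # injected fault in cell 7
failing = failing_checks(mem, h, delta)
faulty_cell = locate(len(mem), failing)
mem[faulty_cell] = correct(mem, faulty_cell, h)
```

Because the check is linear in every variable, correction reduces to solving one linear equation, which is the simple case mentioned above.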