Under Quantum Computer Attack: Is Rainbow a Replacement of RSA and Elliptic Curves on Hardware?

Among cryptographic systems, multivariate signatures are among the most popular candidates because they have the potential to resist quantum computer attacks. Rainbow is a multivariate signature scheme that can be viewed as a multilayer unbalanced Oil-Vinegar system. In this paper, we present techniques to implement the Rainbow signature on hardware, meeting the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion based on binary trees. Second, we present an efficient multiplication based on a compact construction in composite fields. Third, we present a parallel solver for systems of linear equations based on Gauss-Jordan elimination. By integrating these major improvements with further minor optimizations, we implement our design in composite fields on standard cell CMOS Application Specific Integrated Circuits (ASICs). The experimental results show that our implementation takes 242 clock cycles, that is, 4.84 μs at a frequency of 50 MHz, to generate a Rainbow signature. Comparison results show that our design is more efficient than the RSA and ECC implementations.


Introduction
The idea of public key cryptography was introduced by Diffie and Hellman. Their method for key exchange came to be known as Diffie-Hellman key exchange [1]. This was the first published practical method for establishing a shared secret key over an authenticated communications channel without using a prior shared secret. Then a public key cryptographic scheme was invented by Rivest et al. [2]. This scheme came to be known as RSA, from their initials. RSA uses exponentiation modulo a product of two very large primes to encrypt and decrypt, supporting both public key encryption and public key digital signatures. The introduction of elliptic curve cryptography by Koblitz [3] and Miller [4] in the mid-1980s yielded new public key algorithms based on the discrete logarithm problem. Elliptic curves provide smaller key sizes and faster operations for approximately equivalent estimated security. Since then, various schemes of encryption and signature generation have been developed in the field of public key cryptography. Efficient implementations of these schemes have played a crucial role in numerous real-world security applications, such as confidentiality, authentication, integrity, and nonrepudiation. Since software implementations, even on multicore processors, often cannot provide the performance level needed, hardware implementations are thus the only option; they appear to be a promising solution to the inherent performance issues of public key cryptographic systems and provide greater resistance to tampering. Among hardware implementations of public key cryptographic systems, RSA and elliptic curve systems are the most widely adopted candidates [5–14]. Their security lies in the difficulty of factoring large integers and of the discrete logarithm problem, respectively. Shor's algorithm [15] solves both the factorization of large integers and the elliptic curve discrete logarithm problem in polynomial time on a quantum computer. Such cryptographic schemes therefore have a potential weakness under quantum computer attacks.
Multivariate cryptography is one of the most popular families of post-quantum cryptography since it has the potential to resist quantum computer attacks [16]. The main strength of multivariate cryptography is that its underlying mathematical problem is to solve a set of Multivariate Quadratic (MQ) polynomial equations in a finite field, which is proven to be an NP-hard problem [17]. During the past thirty years, various multivariate cryptographic schemes have been proposed, such as the Unbalanced Oil-Vinegar Signature (UOV) [18], Rainbow [19, 20], the Tame Transformation Signature (TTS) [21, 22], and others [23–25]. Their implementations have been the subject of much research and continue to be a topic of interest in many areas, for example, efficient multivariate systems on Field Programmable Gate Arrays (FPGAs) [26], small multivariate processors on FPGAs [27], high speed Rainbow on FPGAs [28], and minimized multivariate PKC on Application Specific Integrated Circuits (ASICs) [29].
Among the existing multivariate cryptographic schemes, Rainbow belongs to the Oil-Vinegar family and can be viewed as a multilayer unbalanced Oil-Vinegar system. Unlike RSA and elliptic curves, the security of Rainbow is based on solving a set of MQ polynomial equations, which has the potential to resist quantum computer attacks.
Our Contributions. In this paper, we present techniques to implement the Rainbow signature on hardware, meeting the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion in $\mathrm{GF}((2^4)^2)$ based on binary trees, which is the extension of the work in [30]. Second, we present an efficient multiplication in $\mathrm{GF}((2^4)^2)$ based on a compact construction, which is the extension of the work in [27]. Third, we present a parallel solver for systems of linear equations in $\mathrm{GF}((2^4)^2)$ based on Gauss-Jordan elimination, which is based on the work in [28]. By integrating these major improvements with further minor optimizations, our design is implemented on ASICs and provides significant reductions in time-area product. The comparisons with other public key cryptographic systems show that Rainbow has a good performance on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.
Moreover, our design can be generalized with minor modifications to also support FPGAs. Besides, Rainbow implementations on hardware must be protected against a wide range of attacks, including side channel attacks. A side channel attack is a physical attack, that is, an attack based on information gained from the physical implementation of a cryptographic system rather than on brute force or theoretical weaknesses in the cryptographic algorithm. Therefore, we discuss defending Rainbow against a possible differential power analysis, and we present countermeasures against fault analysis and differential power analysis attacks.
Organization. The rest of this paper is organized as follows: Section 2 introduces Rainbow signature schemes. Section 3 presents building blocks for Rainbow schemes. Section 4 presents efficient implementations of Rainbow on ASICs. Section 5 compares our design with other public key cryptographic systems. Section 6 discusses defending Rainbow against a possible differential power analysis. Section 7 summarizes our design.

Preliminary
Among multivariate signatures, Rainbow belongs to the Oil-Vinegar family and can be viewed as a multilayer unbalanced Oil-Vinegar system. The construction of Rainbow includes an affine transformation $L_1$, a central map transformation $F$, and an affine transformation $L_2$; that is, $\bar{F} = L_1 \circ F \circ L_2$. The hash value of the message of Rainbow is $Y = (y_0, y_1, \ldots, y_{m-1})$ and its size is $m$, where $y_0, y_1, \ldots, y_{m-1}$ are elements in a finite field. We also suppose that the signature is $X = (x_0, x_1, \ldots, x_{n-1})$ and its size is $n$, where $x_0, x_1, \ldots, x_{n-1}$ are elements in a finite field. The private keys of Rainbow are $L_1$, $F$, and $L_2$.
Among the existing Rainbow schemes, Rainbow(17, 13, 13) is commonly believed to provide a security level of $2^{80}$ [20]; it works with 17 first-layer Vinegar variables, 13 first-layer Oil variables, and 13 second-layer Oil variables in $\mathrm{GF}(256)$. This scheme is depicted in Table 1 and is introduced as follows.
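In order to sign a message, we need to solve the equation
$$L_1 \circ F \circ L_2 (X) = Y.$$
To do this, we first compute $\bar{Y} = L_1^{-1}(Y)$, where $L_1^{-1}$ is an affine transformation:
$$\bar{Y} = AY + B,$$
where $A$ is a matrix with the size of 26 × 26 and $B$ is a vector with the size of 26. $A$ and $B$ are parts of the private keys.
Second, we compute $\bar{X} = F^{-1}(\bar{Y})$, where the construction depends on the map $F$. $F$ is a two-layer construction; namely, the polynomials $(f_0, f_1, \ldots, f_{25})$ are divided into two layers:
layer 0: $f_k \mid k = 0, 1, \ldots, 12$;
layer 1: $f_k \mid k = 13, 14, \ldots, 25$.
Similarly, the variables $(\bar{x}_0, \bar{x}_1, \ldots, \bar{x}_{42})$ are divided into Vinegar and Oil variables on each layer (Table 1). The MQ polynomials $f_k$ are defined by
$$f_k = \sum \alpha_{ij} O_i V_j + \sum \beta_{ij} V_i V_j + \sum \gamma_i V_i + \sum \delta_i O_i + \eta,$$
where $O_i$ and $V_i$/$V_j$ are Oil and Vinegar variables on this layer and the coefficients $\alpha_{ij}$, $\beta_{ij}$, $\gamma_i$, $\delta_i$, and $\eta$ are parts of the private keys.
Last, we compute $X = L_2^{-1}(\bar{X})$:
$$X = C\bar{X} + D,$$
where $C$ is a matrix with the size of 43 × 43 and $D$ is a vector with the size of 43. $C$ and $D$ are parts of the private keys.
Then $X$ is the signature of $Y$.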
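As a quick check on the sizes quoted above, the parameters of Rainbow(17, 13, 13) determine the dimensions of both affine transformations. The symbols $v_1$, $o_1$, and $o_2$ are notation assumed here for the 17 first-layer Vinegar variables and the 13 Oil variables of each layer; they do not appear in the text.

```latex
\begin{aligned}
\text{message (hash) size} &= o_1 + o_2 = 13 + 13 = 26,\\
\text{signature size} &= v_1 + o_1 + o_2 = 17 + 13 + 13 = 43,\\
\text{second-layer Vinegar variables} &= v_1 + o_1 = 17 + 13 = 30.
\end{aligned}
```

This is why the first affine transformation works on a vector of size 26 and the last one on a vector of size 43.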

Building Blocks for Rainbow Schemes
Considering Section 2, we see that, in order to generate a Rainbow signature, the following operations are required:
(1) Computing affine transformations, that is, $Y = AX + B$, where $A$ is a matrix and $B$ is a vector.
(2) Computing the central map transformation, that is, evaluating multivariate polynomials and solving systems of linear equations.
Computing these operations requires multiplications, inversions, and solving systems of linear equations in a finite field, which are presented in the following.
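The reason the central map reduces to linear algebra can be made explicit. The sketch below assumes the usual Oil-Vinegar form of a layer polynomial, with the coefficient names $\alpha_{ij}$, $\beta_{ij}$, $\gamma_i$, $\delta_i$, and $\eta$ chosen here for illustration. Once every Vinegar variable $V_j$ on a layer has a fixed value, no product of two Oil variables remains, so each polynomial is linear in the Oil variables $O_i$:

```latex
f_k \;=\; \sum_{i}\Bigl(\sum_{j}\alpha_{ij}V_j + \delta_i\Bigr)\,O_i
\;+\; \Bigl(\sum_{i,j}\beta_{ij}V_iV_j + \sum_{i}\gamma_iV_i + \eta\Bigr).
```

With 13 polynomials and 13 Oil variables per layer in Rainbow(17, 13, 13), evaluating the bracketed terms yields the coefficients of a 13 × 13 linear system, which is then solved for the Oil variables.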

A Fast Inversion Based on Binary Trees
We suppose that $a(x) = a_h x + a_l$ and $b(x) = b_h x + b_l$ are elements in $\mathrm{GF}((2^4)^2)$ and $b(x)$ is the inverse of $a(x)$, where the subfield is $\mathrm{GF}(2^4)$ and $a_h$, $a_l$, $b_h$, and $b_l$ are elements in $\mathrm{GF}(2^4)$. The irreducible polynomial in $\mathrm{GF}((2^4)^2)$ is $q(x) = x^2 + x + 9$. Then the inversion is computed as follows:
$$b_h = a_h\,(9a_h^2 + a_h a_l + a_l^2)^{-1}, \qquad b_l = (a_h + a_l)\,(9a_h^2 + a_h a_l + a_l^2)^{-1}.$$
We adopt a pipelined architecture in $\mathrm{GF}(2^4)$, which is the extension of the work in [30]. We use two binary trees for computing squares and inversions in $\mathrm{GF}(2^4)$, which are illustrated as follows:
(1) Each binary tree has four layers; root nodes are on the third layer.
(2) Each node has at most two child nodes; the left child represents the value zero and the right child represents the value one.
(3) Each child must either be a leaf or be the root of another tree; each node has a father node when it is not a root node.
(4) Each element in a finite field has a unique traversal from root to leaf.
(5) Most leaves are linked to another leaf.
Figure 1 is the architecture based on binary trees for computing squares and inversions in $\mathrm{GF}(2^4)$. We use two architectures in our design, that is, square-trees for squares and inversion-trees for inversions.
Square-trees: we suppose that the traversal from root ($n_0$) to leaf ($n_3$) includes tree nodes $n_0$, $n_1$, $n_2$, and $n_3$, which represents the element $u(x)$ in $\mathrm{GF}(2^4)$. If the traversal from root ($n_4$) to leaf ($n_7$) represents the element $v(x)$ in $\mathrm{GF}(2^4)$, which is the square of $u(x)$, then $n_3$ is linked to $n_7$. When we are required to compute the square of $u(x)$, we find it by traversing the square-trees.
Inversion-trees: we suppose that the traversal from root ($n_0$) to leaf ($n_3$) includes tree nodes $n_0$, $n_1$, $n_2$, and $n_3$, which represents the element $u(x)$ in $\mathrm{GF}(2^4)$. If the traversal from root ($n_4$) to leaf ($n_7$) represents the element $v(x)$ in $\mathrm{GF}(2^4)$, which is the inverse of $u(x)$, then $n_3$ is linked to $n_7$. When we are required to compute the inverse of $u(x)$, we find it by traversing the inversion-trees.
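The linked-leaf lookup described above can be modeled in software. The following is a minimal sketch, not the hardware design: it assumes that GF(2^4) is defined modulo x^4 + x + 1 and that bits are consumed most-significant first, neither of which is stated in the text, and the names `gf16_mul`, `path`, `build_links`, and `lookup` are hypothetical.

```python
def gf16_mul(a, b, poly=0b10011):
    """Multiply in GF(2^4); the modulus x^4 + x + 1 is an assumption."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= poly
    return r


def gf16_inv(e):
    """Brute-force inverse; 0 has no inverse and is mapped to 0 here."""
    return next((c for c in range(1, 16) if gf16_mul(e, c) == 1), 0)


def path(e):
    """Root-to-leaf path of a 4-bit element: 0 = left child, 1 = right child."""
    return tuple((e >> k) & 1 for k in (3, 2, 1, 0))


def build_links(op):
    """Link the leaf reached by each element to the leaf reached by op(element)."""
    return {path(e): path(op(e)) for e in range(16)}


SQUARE_TREE = build_links(lambda e: gf16_mul(e, e))
INVERSION_TREE = build_links(gf16_inv)


def lookup(tree, e):
    """Traverse to the leaf of e, follow its link, and read back the linked element."""
    bits = tree[path(e)]
    return int("".join(map(str, bits)), 2)
```

A call such as `lookup(INVERSION_TREE, e)` replaces arithmetic with a fixed four-step traversal, which is the property that makes the structure easy to pipeline.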
Since square-trees and inversion-trees have four layers, we can use them to compute squares and inversions with pipelining. The computation of $b(x) = a(x)^{-1}$ is presented as follows:
(1) Via using square-trees, we can compute $a_h^2$ and $a_l^2$ with pipelining.
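The whole inversion in GF((2^4)^2) can be sketched with two small tables standing in for the square-trees and inversion-trees. The sketch assumes the subfield modulus x^4 + x + 1 and a byte layout with the x coefficient in the high nibble; both are assumptions, as are the names `SQ16`, `INV16`, `gf256_mul`, and `gf256_inv`. The formula follows from q(x) = x^2 + x + 9: multiplying a(x) by its conjugate a_h x + (a_h + a_l) gives the subfield element 9 a_h^2 + a_h a_l + a_l^2.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (assumed subfield polynomial)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


# Tables standing in for the square-trees and inversion-trees.
SQ16 = [gf16_mul(e, e) for e in range(16)]
INV16 = [0] + [next(c for c in range(1, 16) if gf16_mul(e, c) == 1)
               for e in range(1, 16)]


def gf256_mul(a, b):
    """Schoolbook product in GF((2^4)^2) with x^2 = x + 9, used to check inverses."""
    a_h, a_l, b_h, b_l = a >> 4, a & 0xF, b >> 4, b & 0xF
    hh = gf16_mul(a_h, b_h)
    c_h = hh ^ gf16_mul(a_h, b_l) ^ gf16_mul(a_l, b_h)
    c_l = gf16_mul(9, hh) ^ gf16_mul(a_l, b_l)
    return (c_h << 4) | c_l


def gf256_inv(a):
    """Invert a = a_h*x + a_l; returns 0 for a = 0, which has no inverse."""
    a_h, a_l = a >> 4, a & 0xF
    # Norm of a down to GF(2^4): 9*a_h^2 + a_h*a_l + a_l^2.
    delta = gf16_mul(9, SQ16[a_h]) ^ gf16_mul(a_h, a_l) ^ SQ16[a_l]
    d_inv = INV16[delta]
    b_h = gf16_mul(a_h, d_inv)
    b_l = gf16_mul(a_h ^ a_l, d_inv)
    return (b_h << 4) | b_l
```

Only one subfield inversion is needed per inversion in the composite field; everything else is squaring, multiplication, and addition in GF(2^4).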

An Efficient Multiplication
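Then the multiplication $c(x) = a(x)\,b(x) \bmod q(x)$, with $c(x) = c_h x + c_l$, is computed as follows:
$$c(x) = a_h b_h x^2 + (a_h b_l + a_l b_h)\,x + a_l b_l. \qquad (12)$$
By substituting $q(x) = x^2 + x + 9$ into (12), we have
$$c_h = a_h b_h + a_h b_l + a_l b_h, \qquad c_l = 9\,a_h b_h + a_l b_l.$$
The computations of $c_h$ and $c_l$ use a compact construction, which is the extension of the work in [27].
We adopt four ADDs, three MULs, and one SHIFT, where ADD and MUL compute additions and multiplications in $\mathrm{GF}(2^4)$, respectively.
ADD0 and ADD1 are used to compute $a_h + a_l$ and $b_h + b_l$, respectively. MUL0, MUL1, and MUL2 are used to compute $a_h b_h$, $a_l b_l$, and $(a_h + a_l)(b_h + b_l)$, respectively.
SHIFT is used to compute a right shift and a bit addition: $9\,a_h b_h$. ADD2 and ADD3 are used to compute $c_h = (a_h + a_l)(b_h + b_l) + a_l b_l$ and $c_l = 9\,a_h b_h + a_l b_l$, respectively. The multiplication has been computed.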
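One arrangement consistent with the stated unit counts (four additions, three multiplications, one shift) is a Karatsuba-style product, sketched below. It assumes the subfield modulus x^4 + x + 1, under which the constant 9 = x^3 + 1 equals x^{-1}, so multiplying by 9 is a right shift plus a conditional XOR; this matches the "right shift and a bit addition" in the text but is not confirmed by it. The names `mul9` and `gf256_mul` and the nibble packing are hypothetical.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (assumed subfield polynomial)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def mul9(t):
    """Multiply by 9 = x^-1: shift right, and XOR in 0b1001 if the dropped bit was 1."""
    return (t >> 1) ^ (0b1001 if t & 1 else 0)


def gf256_mul(a, b):
    """Product in GF((2^4)^2) with q(x) = x^2 + x + 9; high nibble is the x coefficient."""
    a_h, a_l = a >> 4, a & 0xF
    b_h, b_l = b >> 4, b & 0xF
    s_a = a_h ^ a_l                # addition 1
    s_b = b_h ^ b_l                # addition 2
    p_hh = gf16_mul(a_h, b_h)      # multiplication 1
    p_ll = gf16_mul(a_l, b_l)      # multiplication 2
    p_ss = gf16_mul(s_a, s_b)      # multiplication 3
    c_h = p_ss ^ p_ll              # addition 3: a_h*b_h + a_h*b_l + a_l*b_h
    c_l = mul9(p_hh) ^ p_ll        # shift, then addition 4: 9*a_h*b_h + a_l*b_l
    return (c_h << 4) | c_l
```

For example, `gf256_mul(0x10, 0x10)` computes x · x = x + 9 and returns `0x19`. The saving over the schoolbook form is one subfield multiplication, at the cost of two extra additions.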

A Parallel Solving System of Linear Equations Based on Gauss-Jordan Elimination
We propose a parallel solver for systems of linear equations based on Gauss-Jordan elimination, which is the extension of the work in [28]. We give a straightforward description of the proposed parallel variant of Gauss-Jordan elimination in Algorithm 1, where $(i)$ stands for an operation performed in the $i$th iteration, and $i = 0, 1, \ldots, 12$. The optimized Gauss-Jordan elimination with 13 iterations consists of pivoting, inversion, normalization, and elimination in each iteration. We enhance the algorithm in four directions. First, multiplication is computed by invoking the efficient multipliers designed in Section 3.2. Second, we adopt the fast inverter described in Section 3.1. Third, inversion, normalization, and elimination are designed to be performed simultaneously. Fourth, during the elimination in the $i$th iteration, we simultaneously choose the right pivot for the next iteration; namely, if the pivot element $a_{i+1,i+1}$ of the next iteration is zero, we swap the $(i+1)$th row with another $j$th row with the nonzero element $a_{j,i+1}$, where $i, j = 0, 1, \ldots, 12$. The difference from the usual Gauss-Jordan elimination is that the usual Gauss-Jordan elimination chooses the pivot after the elimination, while we perform the pivoting during the elimination. In other words, at the end of each iteration, by judging the computational results in this iteration, we can decide the right pivoting for the next iteration. By integrating these optimizations, it takes only one clock cycle to perform one iteration. The architecture for solving systems of linear equations in $\mathrm{GF}((2^4)^2)$ is depicted in Figure 2 with matrix size 13 × 13. There exist three kinds of cells in the architecture, namely, $I$, $N_j$, and $E_{ij}$, where $i = 1, 2, \ldots, 12$ and $j = 1, 2, \ldots, 13$. The $I$ cell is for fast inversion. As described in Section 3.1, two binary trees are included in the $I$ cell for computing inversions. The $N_j$ cells are for normalization, and the $E_{ij}$ cells are for elimination. The architecture consists of one $I$ cell, 13 $N_j$ cells, and 156 $E_{ij}$ cells.
Algorithm 1: Solving a system of linear equations $Ax = b$ with 13 iterations, where $A$ is a 13 × 13 matrix.
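The iteration structure described above (inversion, normalization, elimination, then choosing the pivot for the next iteration) can be sketched in software. This is a sequential model only: the hardware performs these operations in parallel within one clock cycle, which Python cannot show. The field arithmetic assumes the subfield modulus x^4 + x + 1 and a high-nibble packing of the x coefficient, and the names `mul`, `inv`, `select_pivot`, and `gauss_jordan` are hypothetical.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (assumed subfield polynomial)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def mul(a, b):
    """Multiply in GF((2^4)^2) with q(x) = x^2 + x + 9."""
    ah, al, bh, bl = a >> 4, a & 15, b >> 4, b & 15
    hh, ll = gf16_mul(ah, bh), gf16_mul(al, bl)
    ch = gf16_mul(ah ^ al, bh ^ bl) ^ ll
    cl = gf16_mul(9, hh) ^ ll
    return (ch << 4) | cl


def inv(a):
    """Inverse as a^254, since every nonzero element satisfies a^255 = 1."""
    r = 1
    for _ in range(254):
        r = mul(r, a)
    return r


def select_pivot(M, i):
    """Swap a row with a nonzero entry in column i into position i, if one exists."""
    for j in range(i, len(M)):
        if M[j][i]:
            M[i], M[j] = M[j], M[i]
            return True
    return False


def gauss_jordan(A, b):
    """Solve A x = b over GF((2^4)^2); returns x, or None if A is singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    if not select_pivot(M, 0):
        return None
    for i in range(n):
        p_inv = inv(M[i][i])                        # inversion
        M[i] = [mul(p_inv, v) for v in M[i]]        # normalization
        for r in range(n):                          # elimination (addition is XOR)
            if r != i and M[r][i]:
                f = M[r][i]
                M[r] = [v ^ mul(f, w) for v, w in zip(M[r], M[i])]
        # Pivot for the next iteration is chosen at the end of this one.
        if i + 1 < n and not select_pivot(M, i + 1):
            return None
    return [row[n] for row in M]
```

For instance, `gauss_jordan([[0, 1], [1, 0]], [5, 7])` needs a row swap before the first iteration and returns `[7, 5]`.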
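
Efficient Implementations of Rainbow on ASICs
A Rainbow signature is generated in the following steps:
(1) Compute the first affine transformation $L_1$ via invoking matrix-vector multiplication and vector addition.
(2) Evaluate the first round of 13 multivariate polynomials of the central map transformation $F$.
(3) Solve the first system of linear equations with matrix size 13 × 13 of the central map transformation $F$.
(4) Evaluate the second round of 13 multivariate polynomials of the central map transformation $F$.
(5) Solve the second system of linear equations with matrix size 13 × 13 of the central map transformation $F$.
(6) Compute the second affine transformation $L_2$ via invoking matrix-vector multiplication and vector addition.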
To show that the design of Rainbow(17, 13, 13) is efficient on hardware, we model it in the Verilog Hardware Description Language (HDL) and implement it on ASICs. We implement our design in $\mathrm{GF}((2^4)^2)$ on TSMC 0.18 μm standard cell CMOS ASICs. We use Synopsys Design Vision, which is a GUI for the Synopsys Design Compiler tools. The map effort is set to medium. We present the experimental results in Tables 2 and 3, which are extracted after place and route.
Tables 2 and 3 show that the Rainbow implementation includes two affine transformations with matrix sizes 26 × 26 and 43 × 43, respectively, 26 MQ polynomial evaluations, and two systems of linear equations with matrix size 13 × 13. Table 3 summarizes the performance of our implementation of the Rainbow signature measured in clock cycles: the first round of 13 polynomial evaluations takes 65 clock cycles, the first round of solving a system of linear equations takes 13 clock cycles, and the second round of 13 polynomial evaluations takes 78 clock cycles. In total, our design takes only 242 clock cycles to generate a Rainbow signature. In other words, our implementation takes 4840 ns to generate a Rainbow signature at a frequency of 50 MHz. Among all of the operations, MQ polynomial evaluation occupies most of the executing time.
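The conversion from clock cycles to time, and the share taken by polynomial evaluation, follow directly from the figures quoted above. This is a worked check, taking the 65 and 78 cycles reported for the two rounds of polynomial evaluation as given:

```latex
\begin{aligned}
t_{\text{sign}} &= \frac{242\ \text{cycles}}{50 \times 10^{6}\ \text{cycles/s}}
                = 4.84 \times 10^{-6}\ \text{s} = 4840\ \text{ns},\\[4pt]
\frac{65 + 78}{242} &= \frac{143}{242} \approx 0.59 .
\end{aligned}
```

So roughly 59% of the signing time is spent on MQ polynomial evaluation.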

Comparisons with Other Implementations
The works in [5, 6, 26–29] are believed to be the latest RSA, ECC, and multivariate public key cryptographic systems on hardware, respectively. We compare our design with these systems in Table 4. Comparison results show that our design is more efficient than the related implementations.
Besides, the Rainbow implementation in [28] is believed to be the fastest multivariate implementation, and the Rainbow implementation in [27] is believed to be the smallest multivariate implementation. Thus, the implementations in [28], in [27], and in this work show that Rainbow has a good performance on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.

Side Channel Attack Considerations
Cryptographic systems must be protected against a wide range of attacks, including side channel attacks. A side channel attack is a physical attack, that is, an attack based on information gained from the physical implementation of a cryptographic system rather than on brute force or theoretical weaknesses in the cryptographic algorithm. The underlying principle of a side channel attack is that side channel information such as power consumption, electromagnetic leaks, timing information, or even sound can provide extra sources of information about secrets in cryptographic systems, for example, cryptographic keys, partial state information, and full or partial plaintexts, which can be exploited to break the cryptographic systems. General classes of side channel attacks include timing analysis [31], power analysis [32], electromagnetic analysis [33], fault analysis [34], acoustic cryptanalysis [35], data remanence analysis [36], and row hammer analysis attacks [37].
Fault analysis attacks manipulate the environmental conditions of cryptographic systems, such as voltage, clock, temperature, radiation, light, and eddy current, to generate faults during secret-related computations, for example, multiplications and inversions in a finite field, and observe the resulting behavior, which may help a cryptanalyst break the cryptographic systems. Fault analysis attacks can be engineered by simply illuminating a transistor with a laser beam, which causes some bits to assume wrong values. The use of a fault induced during a secret-related computation to guess the secret key has been practically observed in implementations of RSA that use the Chinese remainder theorem [38, 39]. A general fault analysis attack on multivariate public key cryptography (MPKC) schemes is proposed in [40]. The work in [40] recovers partial secret keys from the affine transformations of multivariate public key cryptographic schemes.
Power analysis attacks obtain detailed information by observing the power consumption of cryptographic systems and are roughly categorized into Simple Power Analysis (SPA) [41] and Differential Power Analysis (DPA) [32]. In the family of power analysis attacks, DPA is of particular interest; it is a statistical test that examines a large number of power consumption signals to retrieve secret keys. A differential power analysis attack on SFLASH is proposed in [42]. The work in [42] recovers secret keys from the SHA-1 module of the SFLASH scheme. A side channel attack on enTTS has been proposed in [43], which uses differential power analysis and fault analysis to attack the two affine transformations and the central map transformation. The method in [43] shows that it can obtain all secret keys of enTTS.
Since the construction of Rainbow includes two affine transformations and a central map transformation, the methods in [40, 42, 43] have the potential to obtain its secret keys. Thus, we discuss defending Rainbow against a possible side channel attack, and the countermeasure is described in the following:
(1) We suppose that $Y = (y_0, y_1, \ldots, y_{25})$ is the message and each element of $Y$ is in $\mathrm{GF}((2^4)^2)$.
(4) We compute   =   +  and   =   , where  is a 26 × 26 matrix and  is a vector with size 26.
(6) The first affine transformation $L_1$ has been computed; then we take random bytes for the Vinegar variables.
(7) We double check the random bytes to protect against fault analysis attacks.
(8) We compute the multivariate polynomial evaluations and solve the systems of linear equations until the central map transformation is completed.
(10) We compute   =   and   =   + , where  is a 43 × 43 matrix and  is a vector with size 43.
The work in [40] uses fault analysis to attack the random bytes in the central map transformation; thus we double check the random bytes to protect against fault analysis attacks. The work in [42] uses differential power analysis to attack the SHA-1 module; thus we adopt a method to protect the affine transformations. However, the countermeasure described above is theoretical; it remains to be implemented and verified on hardware.

Conclusions
In this paper, we present techniques to implement the Rainbow signature cryptographic system on hardware, meeting the requirements of efficient high-performance applications. We propose a general architecture for efficient hardware implementations of Rainbow and enhance our design in three directions. First, we present a fast inversion in $\mathrm{GF}((2^4)^2)$ based on binary trees. Second, we present an efficient multiplication in $\mathrm{GF}((2^4)^2)$ based on a compact construction. Third, we present a parallel solver for systems of linear equations in $\mathrm{GF}((2^4)^2)$ based on Gauss-Jordan elimination. By integrating these major improvements with further minor optimizations, we implement our design in $\mathrm{GF}((2^4)^2)$ on TSMC 0.18 μm standard cell CMOS ASICs. We use Synopsys Design Vision and the map effort is set to medium. Our design can be generalized with minor modifications to also support FPGAs.
The experimental results show that Rainbow implementation includes two affine transformations with matrix sizes 26 × 26 and 43 × 43, respectively, and 26 MQ polynomial evaluations and solving two systems of linear equations with matrix size 13 × 13.Our implementation takes 4840 ns and 242 clock cycles to generate a Rainbow signature with the frequency of 50 MHz.Among all of the operations, MQ polynomial evaluation occupies most of the executing time.Comparison results show that our design is more efficient than the related implementations.
Moreover, the implementations of a fast Rainbow, a small Rainbow, and this work show that Rainbow has a good performance on hardware and is a better candidate than RSA and elliptic curves under quantum computer attacks.
Besides, Rainbow implementations must be protected against a wide range of attacks, including side channel attacks.We discuss defending against a possible side channel attack for Rainbow and we present countermeasures against fault analysis and differential power analysis attack.

Figure 1: A pipelined architecture based on binary trees for computing squares and inversions in $\mathrm{GF}(2^4)$.

Figure 2: The proposed architecture for parallel solving system of linear equations with matrix size 13 × 13.

Figure 3: The flowchart of implementations of Rainbow scheme.

Table 1: Parameters of Rainbow signature schemes.

Table 2: Implementation results of the Rainbow scheme.

Table 3: Executing time of the implementation in clock cycles.

Table 4: Comparison of public key cryptographic systems.
*The time-area (clock cycle-gate equivalent) product of our implementations is normalized to 1.