A Novel Elliptic Curve Scalar Multiplication Algorithm against Power Analysis

Power analysis attacks are becoming more and more sophisticated. Through power analysis, an attacker can obtain sensitive data stored in smart cards or other embedded devices more efficiently than with any other kind of physical attack. Among power analysis attacks, simple power analysis (SPA) is probably the most effective against elliptic curve cryptosystems, because an attacker can easily distinguish between point addition and point doubling in a single execution of scalar multiplication. To make elliptic curve scalar multiplication secure against SPA attacks, many methods have been proposed using special point representations. In this paper, a simple but efficient SPA-resistant multi-scalar multiplication is proposed. The method first converts the scalar into nonadjacent form (NAF) and then recodes it into a new signed-digit representation. The new representation comes at a small precomputation cost, and the resulting scalar multiplication needs just one doubling and 1/2 addition per bit. In addition, when combined with randomization techniques, the proposed method can also guard against differential power analysis (DPA) attacks.


Introduction
Since being proposed independently by Koblitz [1] and Miller [2] in the mid-1980s, the elliptic curve cryptosystem (ECC) has been widely applied in public key cryptography, especially in pairing cryptosystems [3,4]. This is because ECC uses a much shorter key than traditional public key cryptosystems such as RSA to provide a comparable level of security. For instance, 160-bit ECC provides about the same level of security as 1024-bit RSA [5]. Owing to its shorter key length, higher speed, and lower power consumption, ECC is attractive for wireless and smart card applications, which have limited bandwidth and storage resources. The security of ECC rests on the hardness of the discrete logarithm problem (DLP) on an elliptic curve, called the elliptic curve discrete logarithm problem (ECDLP) [6]. Given a scalar multiplication $Q = kP$, where $k$ is an integer and $P$, $Q$ are points on an elliptic curve, the ECDLP states that if $k$ is large enough, it is computationally infeasible to recover $k$ from $P$ and $Q$. When ECC is implemented in wireless or smart card devices, however, the implementation is very vulnerable to power analysis attacks [7,8].
In power analysis attacks, which were proposed by Kocher et al. [8], an attacker obtains the secret key stored inside a device by monitoring the device's power consumption. Simple power analysis (SPA) [7] and differential power analysis (DPA) [8] are the two main types of power analysis attacks. SPA extracts secret information from a single execution of a cryptographic operation, while DPA may need many executions, which are then analyzed statistically. Because different operations executed by the device consume different amounts of time and power, these differences reveal which operations were performed and in what order. In the case of scalar multiplication, an attacker may be able to distinguish which parts of the computation are point doublings and which are point additions, even without any knowledge of the private key. From this operation sequence, the attacker can recover the secret key, which is the scalar used in the elliptic curve scalar multiplication.
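To make the leak concrete, the following minimal Python sketch models an unprotected left-to-right double-and-add loop purely as a sequence of operations. The function names `double_and_add_trace` and `recover_scalar` are illustrative and do not come from the paper, and the sketch assumes an idealized power trace in which doublings (D) and additions (A) are perfectly distinguishable.

```python
def double_and_add_trace(k: int) -> str:
    """Operation sequence of an unprotected left-to-right double-and-add.

    'D' marks a point doubling and 'A' a point addition. No curve arithmetic
    is performed; only the key-dependent control flow is modeled.
    """
    bits = bin(k)[2:]
    trace = []
    for bit in bits[1:]:      # the leading 1 only loads P into the accumulator
        trace.append("D")     # every bit costs one doubling
        if bit == "1":
            trace.append("A")  # an addition happens only for 1 bits
    return "".join(trace)


def recover_scalar(trace: str) -> int:
    """Rebuild k from an idealized D/A sequence, as an SPA attacker would."""
    k = 1
    i = 0
    while i < len(trace):
        k <<= 1               # each iteration starts with a doubling
        i += 1
        if i < len(trace) and trace[i] == "A":
            k |= 1            # a following addition reveals a 1 bit
            i += 1
    return k
```

Under these assumptions, a single trace determines the scalar completely, which is why the countermeasures discussed next aim at a key-independent operation sequence.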
In the past decade, various scalar multiplication algorithms that resist SPA or DPA have been proposed, for example, [9-17]. In [10], Coron first generalized the DPA attack to elliptic curve cryptosystems and introduced the double-and-add-always algorithm to resist SPA and randomization methods to resist DPA. Since then, many countermeasures based on randomization and scalar recoding have been proposed. Reference [11] proposed the randomized addition-subtraction chains method, and [17] introduced the randomized window method. Lee first proposed an SPA-resistant countermeasure based on multi-scalar multiplication [13]. Subsequently, [14-16] not only used multi-scalar multiplication to resist SPA but also used randomization to resist DPA. However, those methods provide security at the cost of efficiency. In this paper, we propose a novel, efficient SPA-resistant multi-scalar multiplication method and combine it with a randomization method to resist DPA.
The rest of the paper is organized as follows. Section 2 gives a brief introduction to ECC, scalar multiplication, and previous SPA-resistant algorithms. Section 3 describes the proposed scalar multiplication algorithm. Section 4 compares the performance of our strategies with the previous countermeasures. Finally, Section 5 concludes this paper.

Preliminaries
2.1. Elliptic Curve Arithmetic. This subsection presents a brief introduction to ECC. For further details, the reader can refer to [6,18]. Let $GF(p)$ be a finite prime field. An elliptic curve $E$ over $GF(p)$ can be defined by the Weierstrass equation

$$E: y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6, \tag{1}$$

where $a_1, a_2, a_3, a_4, a_6 \in GF(p)$. The set of points on an elliptic curve $E$, together with the point at infinity (denoted by $O$), forms an Abelian group under the point addition operation. Point arithmetic consists of two basic operations: elliptic curve addition (ECADD), which computes $P + Q$ by the group addition rule for two given points $P$ and $Q$ on the curve with $P \neq Q$, and elliptic curve doubling (ECDBL), which computes $2P$ for a given point $P$. When the points on $E$ are represented as $(x, y)$, known as affine coordinates, these operations require expensive field inversions. Therefore, the most efficient implementations represent points in the form $(X : Y : Z)$, known as projective coordinates, including standard projective coordinates, Jacobian projective coordinates, Chudnovsky Jacobian coordinates, and López-Dahab projective coordinates [6].
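As an illustration of ECADD and ECDBL in affine coordinates, the following Python sketch implements both operations for the simplified short Weierstrass form $y^2 = x^3 + ax + b$ (that is, $a_1 = a_2 = a_3 = 0$, which assumes $p > 3$). The toy curve $y^2 = x^3 + 2x + 2$ over $GF(17)$ and all identifiers are assumptions chosen for readability, not parameters from the paper. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.

```python
from typing import Optional, Tuple

# Toy parameters (assumed for illustration only): y^2 = x^3 + 2x + 2 over GF(17).
P_MOD, A, B = 17, 2, 2

Point = Optional[Tuple[int, int]]  # None represents the point at infinity O


def on_curve(pt: Point) -> bool:
    """Check that pt satisfies the curve equation (O is always on the curve)."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def ecdbl(pt: Point) -> Point:
    """ECDBL: compute 2P in affine coordinates (one field inversion)."""
    if pt is None:
        return None
    x, y = pt
    if y == 0:
        return None  # vertical tangent, so 2P = O
    lam = (3 * x * x + A) * pow(2 * y % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - 2 * x) % P_MOD
    y3 = (lam * (x - x3) - y) % P_MOD
    return (x3, y3)


def ecadd(p: Point, q: Point) -> Point:
    """ECADD: compute P + Q in affine coordinates (one field inversion)."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2:
        # Same x-coordinate: either P == Q (use doubling) or P == -Q (result O).
        return ecdbl(p) if y1 == y2 else None
    lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)
```

Each call performs one modular inversion, which is the cost that projective coordinate systems are designed to avoid.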

Scalar Multiplication.
Scalar multiplication is the basic operation in the ECDSA signature [19] and ECDH key agreement [20] protocols. The operation computes a multiple of a point, $kP = P + P + \cdots + P$ ($k$ times), where $P$ is a point on the curve $E$ and $k$ is an integer scalar. Because it is the most time-consuming operation in these protocols, many algorithms have been proposed during the past decade to improve its efficiency. Among them, the nonadjacent form (NAF) [21] is the standard one. An NAF of a positive integer $k$ is an expression

$$k = \sum_{i=0}^{l-1} k_i 2^i, \tag{2}$$

where $k_i \in \{0, \pm 1\}$, $k_{l-1} \neq 0$, and no two consecutive digits $k_i$ are nonzero. The computation of the NAF of a positive integer $k$ is described in Algorithm 1. The scalar multiplication can then be computed with the NAF method following Algorithm 2.
Algorithm 1: NAF of a positive integer $k$.
Algorithm 2: Binary NAF method for scalar multiplication.
Each positive integer $k$ has a unique NAF, denoted NAF($k$). Among all signed binary representations of $k$, NAF($k$) has the fewest nonzero digits. The average density of nonzero digits in an NAF of length $l$ is known to be approximately 1/3, so scalar multiplication using the NAF needs about $l$ ECDBL + $(l/3)$ ECADD.
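The NAF recoding and the binary NAF method can be sketched in Python as follows. This follows the standard textbook formulation and is not a transcription of the paper's algorithm listings. To stay self-contained, the sketch uses ordinary integers as a stand-in for the curve group (the "point" $P$ is an integer, doubling is multiplication by 2, and the point at infinity is 0), which is an assumption made only so that the result and the operation counts can be checked easily.

```python
from typing import List, Tuple


def naf(k: int) -> List[int]:
    """Digits of NAF(k) for k >= 1, least significant digit first."""
    digits = []
    while k >= 1:
        if k % 2 == 1:
            d = 2 - (k % 4)   # d = 1 if k = 1 (mod 4), d = -1 if k = 3 (mod 4)
            k -= d            # now k is divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits


def naf_scalar_mult(k: int, p: int) -> Tuple[int, int, int]:
    """Left-to-right binary NAF method over the integer stand-in group.

    Returns (k * p, number of doublings, number of additions/subtractions).
    """
    q = 0                     # stand-in for the point at infinity
    doublings = additions = 0
    for d in reversed(naf(k)):
        q = 2 * q             # ECDBL
        doublings += 1
        if d == 1:
            q = q + p         # ECADD
            additions += 1
        elif d == -1:
            q = q - p         # ECADD with the negated point
            additions += 1
    return q, doublings, additions
```

For example, `naf(7)` yields the digits of $8 - 1$, so `naf_scalar_mult(7, p)` uses two additions instead of the three required by the plain binary expansion 111. Note that the operation sequence still depends on the digits, so the NAF method alone does not resist SPA.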

Ciet and Joye's Algorithm.
This algorithm [14] uses a variant of Shamir's double ladder to compute the multi-scalar multiplication $k_1 P + k_2 Q$. The main difference is that a dummy operation is inserted into the computation, so each loop iteration includes exactly one doubling and one addition, and the operation order in Algorithm 3 is DADADADADA. Hence, one point doubling and one point addition are needed per bit.
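The regular DADA pattern can be illustrated with the following Python sketch. It is not a reproduction of Algorithm 3 from [14]; it assumes a plain Shamir's trick over the binary expansions of the two scalars, with a dummy addition whenever both bits are zero. As a further assumption, integers stand in for curve points, and the name `shamir_with_dummy` is hypothetical.

```python
from typing import Tuple


def shamir_with_dummy(k1: int, k2: int, p: int, q: int) -> Tuple[int, str]:
    """Compute k1*p + k2*q with one doubling and one addition per bit.

    Integers stand in for curve points. Returns the result and the
    operation trace, which is "DA" repeated once per scalar bit.
    """
    table = {(1, 0): p, (0, 1): q, (1, 1): p + q}  # precomputed P, Q, P + Q
    n = max(k1.bit_length(), k2.bit_length())
    acc = 0                                        # stand-in for O
    trace = []
    for i in range(n - 1, -1, -1):
        acc = 2 * acc                              # doubling, always performed
        trace.append("D")
        column = ((k1 >> i) & 1, (k2 >> i) & 1)
        if column == (0, 0):
            _dummy = acc + p                       # dummy addition, result discarded
        else:
            acc = acc + table[column]              # real addition
        trace.append("A")
    return acc, "".join(trace)
```

The trace no longer depends on the scalar bits, but the dummy additions are wasted work, which motivates the dummy-free recodings discussed in the rest of this section.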

Lee's Algorithm.
To resist SPA, Lee [13] improved the simultaneous scalar multiplication method [21]. Whenever $(k_i, l_i) = (0, 0)$, he changes the values of $(k_i, l_i)$ to construct another valid digit pair with at least one nonzero digit. The adjacent pair $(k_{i+1}, l_{i+1})$ must be modified accordingly. The transformation rules are given in Table 1.
After the transformation, no digit pair $(k_i, l_i)$ is all zero. On this basis, Lee proposed the modified simultaneous scalar multiplication of Algorithm 4 to resist SPA. The cost of Lee's algorithm is also one point doubling and one point addition per bit.

2.3.3. Zhang, Chen, and Xiao's Algorithm.
Zhang et al. [16] propose four scalar multiplication algorithms against power analysis. These algorithms are all based on the highest-weight binary form (HBF) of the scalars and on randomization to resist power analysis. Although the four countermeasures use no dummy operations, their efficiency is similar to that of Ciet and Joye's algorithm: they also need nearly one point doubling and one point addition per bit. One of these algorithms is shown in Algorithm 5.

Liu, Tan, and Dai's Algorithm.
Liu et al. [15] also propose a multi-scalar multiplication to resist SPA. The difference is that they use the joint sparse form (JSF) to represent a pair of integers and process two or three JSF columns at a time. Although the number of processed columns may differ, the algorithm always performs four point doublings and two point additions in each loop. As a result, an attacker cannot obtain useful information about the private key through SPA.
As a simple example, take $k = 213$ and $l = 408$. The algorithm processes the JSF columns of (213, 408) in four iterations, so sixteen point doublings and eight point additions are required in total. The theoretical analysis and simulation results show that this algorithm needs 1.384 point doublings and 0.692 point additions per bit.

The New Algorithm
Based on the algorithms mentioned earlier, a simple but efficient scalar multiplication algorithm that resists known power analysis attacks is proposed in this section. The method first takes the NAF of the multi-scalar $(k, l)$, then combines it with the window method and modifies every digit pair whose digits are all zero, which yields the new algorithm.

Scalar Representation.
In a scalar multiplication, each bit requires at least one point doubling, while the number of point additions varies between algorithms. Therefore, the best way to improve the efficiency of scalar multiplication is to reduce the number of point additions. In this approach, a window method with the NAF form is used for that purpose.
First, the window method with the NAF form for a single scalar $k$ is described. Let $w$ be the window size, SHW($w$) the maximum Hamming weight, and SPN($w$) the number of points in the precomputation table for each window. The NAF of a scalar $k$ is denoted by $k_{NAF}$, and in general it can be represented as in (2), where $l$ is the bit length of the NAF of $k$. When the window method is used, $k_{NAF}$ is split into $l/w$ windows of $w$ digits each, and every window takes its value from the set of all possible $w$-digit parts of NAF integers. Table 2 lists the values of SHW($w$) and SPN($w$) only for $w$ from 2 to 6, but from (6) it can be seen that SPN($w$) grows multiplicatively as $w$ increases.
Next, the window method with the NAF form for the multi-scalar $(k, l)$ is introduced, where $w$ is the window size, MHW($w$) is the maximum Hamming weight, and MPN($w$) is the number of points in the precomputation table for each window (Table 3). Each window now contains two scalars, so MHW($w$) is double SHW($w$), but MPN($w$) is much larger than SPN($w$). MPN($w$) is determined by the combination of the two numbers (SPN$_k$($w$), SPN$_l$($w$)). Consider

$$\mathrm{MPN}(w) = \mathrm{SPN}_k(w) + \mathrm{SPN}_l(w)$$

Table 4: Transformation rules of new algorithm.
After the transformation, two more cases are added to the possible digit pairs of a window, but the precomputation table gains only one more point, because the two cases differ only in sign.

Proposed New Multiscalar Multiplication Algorithm.
Now, the new algorithm to calculate $[k]P + [l]Q$ based on the new representation mentioned earlier can be described. The algorithm has a uniform sequence of doubling and addition operations, with no dummy operations.
From Algorithm 6, it can be seen that two doublings and one addition are performed in each window. It is assumed that a point subtraction consumes the same power as a point addition. There are fifteen points in the precomputation table.

Proposed DPA-Resistant Scalar Multiplication Algorithm.
Among the various DPA countermeasures [22], random key splitting is the most common method to resist DPA. The scalar $k$ can be split in at least two different ways: one is $k = (k - r) + r$, and the other is $k = \lfloor k/r \rfloor r + (k \bmod r)$, where $r$ is random and has the same length as $k$. In this paper, the first way, $k = k_1 + k_2$ with $k_1 = k - r$ and $k_2 = r$, is chosen. This splitting has the same form as the multi-scalar approach, so a method similar to Algorithm 6 can be used to compute the scalar multiplication. The difference is that the point $Q$ is equal to $P$. Of course, the transformation rule and the precomputation table are also different. To resist SPA, $k_1$ and $k_2$ should be converted whenever $(k_{1,i+1}, k_{1,i}) + (k_{2,i+1}, k_{2,i}) = (0, 0)$, so that a real point addition always takes place. Algorithm 7 describes this in detail.
It can be seen that there are only four points in the precomputation table for Algorithm 7, and the operation sequence is again DDA. Because the operation sequence is uniform, the algorithm resists SPA, and the random $r$ is inserted to ensure that there is no correlation between any two executions. Consequently, the attacker cannot obtain useful information through DPA.
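The two splitting strategies mentioned in this subsection can be sketched in Python as follows. The function names are hypothetical, the scalar is assumed to be a positive integer, and the random value $r$ is drawn with the standard `secrets` module and forced to have the same bit length as $k$. The sketch covers only the splitting step, not Algorithm 7 itself.

```python
import secrets
from typing import Tuple


def random_like(k: int) -> int:
    """Random r with exactly the same bit length as k (k >= 1 assumed)."""
    bits = k.bit_length()
    return secrets.randbits(bits) | (1 << (bits - 1))  # force the top bit


def split_additive(k: int) -> Tuple[int, int]:
    """First splitting: k = k1 + k2 with k1 = k - r and k2 = r.

    Note (assumption of this sketch): k1 can be zero or negative when
    r >= k; a real implementation could reduce k1 modulo the order of P.
    """
    r = random_like(k)
    return k - r, r


def split_euclidean(k: int) -> Tuple[int, int, int]:
    """Second splitting: k = floor(k / r) * r + (k mod r)."""
    r = random_like(k)
    return k // r, k % r, r
```

Because a fresh $r$ is drawn on every call, two executions with the same $k$ process different pairs $(k_1, k_2)$, which is what removes the correlation between traces.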

Performance Comparison
In this section, the performance of Algorithm 7 is analyzed and compared with previous algorithms. In Algorithm 7, each loop iteration processes two bits and performs two point doublings and one point addition, so each bit needs one point doubling and 1/2 point addition. To show the performance of Algorithm 7, a comparison with previous methods is listed in Table 6.
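The per-bit figures quoted in the text can be turned into total operation counts with a few lines of Python. The dictionary below only restates the costs given above (the entry for Zhang, Chen, and Xiao's algorithm is approximate, since the text says it needs nearly one doubling and one addition per bit); precomputation costs are not included, and the 160-bit scalar length is an assumed example taken from the key size mentioned in the introduction.

```python
from typing import Dict, Tuple

# (doublings per bit, additions per bit), as stated in the text.
PER_BIT_COST: Dict[str, Tuple[float, float]] = {
    "Ciet and Joye [14]": (1.0, 1.0),
    "Lee [13]": (1.0, 1.0),
    "Zhang, Chen, and Xiao [16]": (1.0, 1.0),  # approximate ("nearly" 1D + 1A)
    "Liu, Tan, and Dai [15]": (1.384, 0.692),
    "Algorithm 7": (1.0, 0.5),
}


def total_cost(name: str, bits: int) -> Tuple[float, float]:
    """Total (doublings, additions) for a scalar of the given bit length."""
    dbl_per_bit, add_per_bit = PER_BIT_COST[name]
    return dbl_per_bit * bits, add_per_bit * bits


if __name__ == "__main__":
    for name in PER_BIT_COST:
        doublings, additions = total_cost(name, 160)
        print(f"{name}: {doublings:.2f} D + {additions:.2f} A")
```

For a 160-bit scalar, Algorithm 7 needs 160 doublings and 80 additions under these figures, that is, half the additions of the methods that cost one doubling and one addition per bit.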

Table 1: Transformation rules of Lee's algorithm.

Table 2: Values of SHW($w$) and SPN($w$) for different window sizes.

Table 6: Performance comparison of algorithms.