Two Improved Multiple-Differential Collision Attacks



Introduction
In practice, cryptographic algorithms are widely implemented on microprocessors, FPGAs, and ASICs [1]. Traditional cryptanalysis [2] analyzes plaintexts and ciphertexts and recovers the secret key by mathematical methods. At CRYPTO 1999, Kocher et al. proposed the power analysis attack [3], which recovers the secret key by analyzing the instantaneous power consumption of a running chip. In 2003, Schramm et al. introduced the collision attack [4], in which the equality of two intermediate values can be detected. Its primary step, collision detection, is usually achieved by applying the least squares method or least absolute deviation [5] to two power traces. In 2007, Bogdanov presented a linear collision attack on AES [6]. In 2010, Moradi et al. gave a practical linear collision attack named the correlation-enhanced collision attack [7]. At CHES 2012, Gérard and Standaert discussed efficient postprocessing of collisions among the 16 S-boxes based on LDPC codes [8].
At CHES 2008, Bogdanov presented practical collision detection methods named multiple-differential collision attacks (MDCA) [9], whose voting-test idea is of considerable practical value. They comprise two methods, the binary voting test and the ternary voting test. However, the following problems arise in practice and may cause attack experiments to fail.
(i) The variance of power traces with Gaussian noise is not constant. In particular, some countermeasures introduce high-intensity noise at some local sampling points [10,11].
(ii) The number of key measurement points is insufficient because of the low sampling rate of oscilloscopes.
(iii) In some protected devices, the number of times the same data can be encrypted is limited. In other words, for a fixed plaintext, the number of power traces that can be acquired is limited.
So, an efficient collision detection algorithm is required.
Our Contributions. In this paper, we try to overcome the problems above, improve the existing collision detection algorithms, and discuss countermeasures against them.
(i) We propose the keypoints voting test, which divides the keypoints into groups of uniform weight that each cast one vote. This addresses all the problems above. We then build an experimental environment in which we verify that the new method increases the success ratio from 35% to 95%.
(ii) We improve the ternary voting test of Bogdanov by establishing standard templates during preprocessing, which reduces the complexity of collision detection and increases the success ratio markedly. Our experimental investigation shows that the number of power traces required by our attack is only 1/4 of that required by the traditional attack.
Organization. This paper is organized as follows. In Section 2, we review the traditional collision attacks and their collision detection methods: binary comparison, the binary voting test, and the ternary voting test. In Section 3, the keypoints voting test is proposed, and the corresponding experimental results are shown. In Section 4, we improve Bogdanov's ternary voting test and show its theoretical and practical superiority. Subsequently, we discuss alternative countermeasures against our attack and show our experimental results in Section 5. Finally, we conclude this paper in Section 6.

Collision Attack and Countermeasures.
A cryptographic device usually includes at least one cryptographic chip, a microprocessor or a digital logic circuit, on which one or more cryptographic algorithms run. Attackers are interested in the secret keys stored in the chip [1]. In a power analysis attack, an oscilloscope is used to acquire the instantaneous power consumption of the chip, because different operations or operands consume different amounts of power in practice. Therefore, power analysis attacks such as the collision attack [4] and correlation power analysis [12] can be mounted effectively. Take the collision attack on AES [13] as an example. The first round of AES includes S-boxes, ShiftRows, MixColumns, and AddRoundKey, as described in Figure 1. First, the attacker chooses two 128-bit plaintexts $P$ and $P'$, encrypts each of them $N$ times, acquires $2N$ power traces, and averages the traces of each plaintext. Collision detection is the most important step of the attack. In order to decide whether two intermediate bytes are equal, $x_0 = x'_0$ (see Figure 1), the attacker considers the similarity between the two averaged traces corresponding to $x_0$ and $x'_0$. This step needs a collision detection algorithm, which we describe in Sections 2.2 and 2.3.
Usually, the plaintext $P$ is fixed. A collision must eventually happen because the plaintext $P'$ can be changed arbitrarily and the encryption can be repeated over and over again. Once a collision is detected, an equation can be built that narrows down the key, since one key byte can be expressed in terms of another [4,6].
In the past few years, several countermeasures have been designed against these attacks. They can be classified into reducing the signal-to-noise ratio (SNR) [11], timing disarrangement [14], masking [15], and hiding [16]. The generation of Gaussian noise in particular has been widely studied, with techniques such as shift register lookup tables, RAM write collisions, and short circuits in switch boxes [11]. Furthermore, dummy rounds/S-boxes [17,18] can also reduce the SNR markedly.
Among the countermeasures above, amplifying local noise usually introduces errors into traditional collision detection, so that some collisions are misjudged as noncollisions. We discuss a solution to this problem in Section 3.
Binary comparison [4] averages the traces to reduce the noise. Then the "distance" between the two averaged traces is computed by the least squares method. By comparing the distance with a predetermined threshold, collision or noncollision is decided. Figure 2 shows this process.
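The binary comparison just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [4]: the function names are hypothetical, traces are plain lists of keypoint values, the "distance" is taken to be the sum of squared differences, and the threshold is supplied by the caller.

```python
def average_trace(traces):
    """Average a list of equally long traces point by point."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]


def squared_distance(trace_a, trace_b):
    """Least-squares 'distance': sum of squared pointwise differences."""
    return sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))


def binary_comparison(traces_1, traces_2, threshold):
    """Return 1 (collision) or 0 (noncollision) for two sets of traces.

    The threshold is an assumption of this sketch; in practice it has to
    be calibrated to the noise level of the measurement setup.
    """
    distance = squared_distance(average_trace(traces_1), average_trace(traces_2))
    return 1 if distance < threshold else 0
```

For example, `binary_comparison([[1.0, 2.0], [3.0, 4.0]], [[2.0, 3.0]], 0.5)` compares the averaged trace `[2.0, 3.0]` with itself and reports a collision.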

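Binary Voting Test of Multiple-Differential Collision Attacks.
The binary voting test proposed by Bogdanov [9] constructs $N$ pairs from the $2N$ traces corresponding to operations 1 and 2. Instead of being averaged, the two traces of each pair are compared, and the result is regarded as a vote (zero or one standing for noncollision or collision, resp.). Finally, collision or noncollision of the two operations is decided from the sum of the votes and a predetermined vote threshold. Figure 3 describes this process, which shows the idea of "multiple-differential." Let $\tau^1 = \{\tau^1_1, \ldots, \tau^1_N\}$ and $\tau^2 = \{\tau^2_1, \ldots, \tau^2_N\}$ denote the traces of operations 1 and 2; the total vote is summed using the binary comparison function $\Psi_{\mathrm{BC}}$:
$$\Theta_{\mathrm{BV}}(\tau^1, \tau^2) = \sum_{i=1}^{N} \Psi_{\mathrm{BC}}(\tau^1_i, \tau^2_i).$$
Then the vote is compared with a predetermined threshold $Y_{\mathrm{BV}}$ for the collision decision:
$$\Psi_{\mathrm{BV}}(\tau^1, \tau^2) = \begin{cases} 0 \ (\text{noncollision}) & \text{if } \Theta_{\mathrm{BV}}(\tau^1, \tau^2) < Y_{\mathrm{BV}}, \\ 1 \ (\text{collision}) & \text{if } \Theta_{\mathrm{BV}}(\tau^1, \tau^2) \ge Y_{\mathrm{BV}}. \end{cases}$$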
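As a hedged illustration of the binary voting test, the following Python sketch lets every pair of unaveraged traces cast one vote. The function names and both thresholds are assumptions made for this example, and the per-pair comparison is taken to be a squared-distance test.

```python
def pair_vote(trace_1, trace_2, pair_threshold):
    """One vote: 1 if the two single traces look equal, 0 otherwise."""
    distance = sum((a - b) ** 2 for a, b in zip(trace_1, trace_2))
    return 1 if distance < pair_threshold else 0


def binary_voting_test(traces_1, traces_2, pair_threshold, vote_threshold):
    """Return (decision, total_votes) for N trace pairs.

    decision is 1 (collision) when the number of positive votes reaches
    vote_threshold, and 0 (noncollision) otherwise.
    """
    total_votes = sum(
        pair_vote(t1, t2, pair_threshold) for t1, t2 in zip(traces_1, traces_2)
    )
    decision = 1 if total_votes >= vote_threshold else 0
    return decision, total_votes
```

Because each pair contributes at most one vote, a single badly disturbed pair cannot dominate the decision, which is the point of the multiple-differential idea.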

Ternary Voting Test of Multiple-Differential Collision Attacks.
During the preprocessing of the ternary voting test with profiling [9], a set of reference traces is built first. Then each of the two traces to be tested is compared with every reference trace (the binary comparison algorithm can be used for this comparison). So, every reference trace yields two results whose value may be (0, 0), (0, 1), (1, 0), or (1, 1), where 0 and 1 denote noncollision and collision, respectively. Finally, collision or noncollision of the two operations is decided from the number of (1, 1) results and a predetermined threshold. Figure 4 describes this process.
In the reference traces generation stage, $N_{\mathrm{TV}}$ plaintexts are chosen and each is encrypted once. So $N_{\mathrm{TV}}$ traces, denoted $T_1, T_2, \ldots, T_{N_{\mathrm{TV}}}$, are acquired and taken as reference traces. $l$ keypoints are selected from each trace.
Let $\tau_1$ and $\tau_2$ denote the averages of the $N$ traces obtained by executing operations 1 and 2 $N$ times, respectively. In the collision detection stage, for every reference trace $T_i$, two binary comparisons are executed, and the two results are multiplied together to give one vote:
$$\Psi_{\mathrm{TV}}(\tau_1, \tau_2, T_i) = \Psi_{\mathrm{BC}}(\tau_1, T_i) \cdot \Psi_{\mathrm{BC}}(\tau_2, T_i).$$
When $i$ traverses from 1 to $N_{\mathrm{TV}}$, the total vote is summed:
$$\Theta_{\mathrm{TV}}(\tau_1, \tau_2) = \sum_{i=1}^{N_{\mathrm{TV}}} \Psi_{\mathrm{TV}}(\tau_1, \tau_2, T_i).$$
Finally, the vote is compared with a threshold $Y_{\mathrm{TV}}$ to decide the collision:
$$\Psi(\tau_1, \tau_2) = \begin{cases} 0 \ (\text{noncollision}) & \text{if } \Theta_{\mathrm{TV}}(\tau_1, \tau_2) < Y_{\mathrm{TV}}, \\ 1 \ (\text{collision}) & \text{if } \Theta_{\mathrm{TV}}(\tau_1, \tau_2) \ge Y_{\mathrm{TV}}. \end{cases}$$
The ternary voting test can also be executed without profiling. In other words, each trace of an AES encryption acquired in the online stage can be divided into 160 reference traces corresponding to the 16 S-boxes in 10 rounds. So, the reference traces generation stage can be omitted.
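The voting step of the ternary voting test with profiling can be sketched as follows. This is only an illustration: the names are hypothetical, the binary comparison is assumed to be a squared-distance test, and the two thresholds are free parameters of the sketch.

```python
def binary_comparison(trace_a, trace_b, threshold):
    """1 if the squared distance is below the threshold, else 0."""
    distance = sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))
    return 1 if distance < threshold else 0


def ternary_voting_test(avg_1, avg_2, references, cmp_threshold, vote_threshold):
    """Return (decision, total_votes).

    A reference trace votes 1 only when BOTH averaged traces are judged
    equal to it, i.e. when the pair of comparison results is (1, 1).
    """
    total_votes = sum(
        binary_comparison(avg_1, ref, cmp_threshold)
        * binary_comparison(avg_2, ref, cmp_threshold)
        for ref in references
    )
    decision = 1 if total_votes >= vote_threshold else 0
    return decision, total_votes
```

Multiplying the two comparison results makes the results (0, 0), (0, 1), and (1, 0) all count as zero votes, so only references that agree with both traces support a collision.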

Keypoints Voting Test
The countermeasure of amplifying local noise in Figure 1 introduces errors into traditional collision detection. Borrowing the multiple-differential idea, the keypoints voting test proposed in this section solves this problem well, because local noise can influence only a small number of votes, however strong it is.

Basic Idea.
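After averaging to reduce the noise, the $l$ keypoint pairs from the two traces vote on the collision, as described in Figure 5. Let $\tau_1 = (\tau_{1,1}, \tau_{1,2}, \ldots, \tau_{1,l}) \in \mathbb{R}^l$ and $\tau_2 = (\tau_{2,1}, \tau_{2,2}, \ldots, \tau_{2,l}) \in \mathbb{R}^l$ denote the averaged traces consisting of $l$ keypoints, respectively. For a keypoint pair $(\tau_{1,j}, \tau_{2,j})$ ($j = 1, 2, \ldots, l$), the vote $\Psi_{\mathrm{PV}}(\tau_{1,j}, \tau_{2,j}) \in \{0, 1\}$ is 1 when the two values are judged equal and 0 otherwise. Subsequently, the total votes are summed:
$$\Theta_{\mathrm{PV}}(\tau_1, \tau_2) = \sum_{j=1}^{l} \Psi_{\mathrm{PV}}(\tau_{1,j}, \tau_{2,j}).$$
Finally, a threshold $Y_{\mathrm{PV}}$ is adopted for the collision decision:
$$\Psi(\tau_1, \tau_2) = \begin{cases} 0 \ (\text{noncollision}) & \text{if } \Theta_{\mathrm{PV}}(\tau_1, \tau_2) < Y_{\mathrm{PV}}, \\ 1 \ (\text{collision}) & \text{if } \Theta_{\mathrm{PV}}(\tau_1, \tau_2) \ge Y_{\mathrm{PV}}. \end{cases}$$
Remark. There is a compromise between the keypoints voting test and binary comparison. Assuming that $l$ is divisible by $m$, the $l$ pairs from the two traces are divided into $m$ groups, which correspond to $m$ votes. In each group, the $l/m$ pairs are input into the binary comparison algorithm, which outputs one vote. If the total number of votes reaches the threshold, a collision is decided.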
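The grouped variant from the Remark can be sketched in Python as below. This is a minimal sketch under assumptions: the function name is hypothetical, each group vote is a squared-distance comparison against a caller-supplied threshold, and setting the number of groups equal to the number of keypoints gives the plain keypoints voting test.

```python
def keypoints_voting_test(avg_1, avg_2, groups, group_threshold, vote_threshold):
    """Return (decision, total_votes) for two averaged traces.

    The keypoints are split into `groups` consecutive groups of equal
    size; each group casts one vote via a squared-distance comparison.
    """
    n_points = len(avg_1)
    if n_points != len(avg_2) or n_points % groups != 0:
        raise ValueError("traces must have equal length divisible by groups")
    size = n_points // groups
    total_votes = 0
    for g in range(groups):
        start, stop = g * size, (g + 1) * size
        distance = sum(
            (a - b) ** 2 for a, b in zip(avg_1[start:stop], avg_2[start:stop])
        )
        if distance < group_threshold:
            total_votes += 1
    decision = 1 if total_votes >= vote_threshold else 0
    return decision, total_votes
```

In the toy call `keypoints_voting_test([0.0] * 6, [0.1, 0.0, 0.1, 0.0, 9.0, 0.0], 3, 0.5, 2)`, the single disturbed keypoint spoils only one of the three votes, so a collision is still reported, whereas a squared distance over all six points would be dominated by that one point.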

Experiment and Efficiency.
We use an Altera EP3C25Q240C6 FPGA [19] to build the experimental environment, which is described in Figure 6. A 1-ohm resistor is connected in series between the power supply and the FPGA, so a differential probe connected to an oscilloscope can acquire the voltage across the resistor, which is related to the power consumption of the FPGA. We implemented AES in Verilog HDL on the FPGA. The power consumption trace of the 10-round encryption is shown in Figure 7. In the digital logic circuit of AES, we designed a countermeasure following the idea of the Gaussian noise generator [11]. Random dummy S-boxes join the computation of the round function, which locally amplifies the noise of the power consumed by the S-boxes. Figure 8, which zooms in on the first round in Figure 7, shows the local noise. The variance of the amplified noise is five times greater than that of the noise from the unprotected implementation.
For the same operation and operands, we acquired two averaged traces per experiment, and 3000 keypoints were selected from each trace. We employed binary comparison and the keypoints voting test (every 300 points were regarded as one vote, giving 10 votes in all) for collision detection. To decide which algorithm was better, we compared their success ratios, where success means that the detection result is a collision, in agreement with the facts. After repeating the experiment many times, the success ratios of binary comparison and the keypoints voting test were about 35% and 95%, respectively. Figure 9 shows the relation between the number of experiments and the success ratio. Evidently, the keypoints voting test overcomes high-intensity noise at local sampling points better than binary comparison.
We ran ten keypoints voting tests to determine a suitable number of ballots, treating the 3000 points as 1 vote, 2 votes, ..., and 10 votes, respectively, to show the influence of this number on the success ratio. With 75% of the total votes chosen as the threshold, the relation between the number of ballots and the success ratio is the red line of Figure 10. The blue line is the success ratio of binary comparison, which does not depend on the number of ballots. In this environment, dividing the 3000 points into more than six votes is a reasonable choice.

Theoretical Analysis.
Under local noise, the keypoints voting test is more efficient than binary comparison because assigning one vote to each keypoint effectively limits its influence on the collision distinguisher. Intuitively, let $X_1, X_2, \ldots, X_{10}$ denote the information (with noise) of ten keypoints. Assume that $X_i$ follows the normal distribution $N(\mu, \sigma^2)$ for $i = 1, 2, \ldots, 9$ and $X_{10} \sim N(\mu, \sigma'^2)$, where $\sigma' \gg \sigma$. In binary comparison, the information is accumulated into a collision/noncollision distinguisher that follows the normal distribution with mean $10\mu$ and variance $9\sigma^2 + \sigma'^2$. When $\sigma' \gg \sigma$, the distinguisher may show large errors. In the keypoints voting test, however, no matter how large $\sigma'$ is, the noisy keypoint can cast only one vote. Therefore, the error of the distinguisher is decreased significantly.
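The argument above can be made concrete with a small calculation. The sketch below is only an illustration with hypothetical function names and example values: it compares the share of the accumulated distinguisher's variance that comes from the single noisy keypoint with the share of the votes that this keypoint controls.

```python
def variance_share_of_noisy_point(n_points, sigma, sigma_local):
    """Fraction of the summed distinguisher's variance due to one noisy keypoint.

    n_points - 1 keypoints have standard deviation sigma, and one keypoint
    has the (much larger) standard deviation sigma_local.
    """
    total_variance = (n_points - 1) * sigma ** 2 + sigma_local ** 2
    return sigma_local ** 2 / total_variance


def vote_share_of_noisy_point(n_points):
    """Fraction of the votes controlled by one keypoint (one point, one vote)."""
    return 1 / n_points
```

With ten keypoints, `sigma = 1.0`, and `sigma_local = 10.0` (values chosen only for illustration), the noisy keypoint accounts for 100/109 of the summed variance, or more than 90%, but for only one tenth of the votes.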

Combined with Other Methods.
As shown in the previous section, the keypoints voting test is more efficient than binary comparison. In fact, however, both methods need more traces than the binary voting test because their averaging process wastes too much information. Fortunately, our keypoints voting test is multivariate, differential, and based on chosen plaintexts, so it can improve other collision attacks by being combined with them.

Improved Binary Voting Test.
Keypoints voting combines naturally with the binary voting test [9] because the former regards each point as a vote, whereas the latter only considers each pair of traces. So, the combined test may be called a two-dimensional voting test. Figure 11 describes the flow chart of the combined scheme. Intuitively, keypoints voting simply replaces the binary comparison $\Psi_{\mathrm{BC}}$ inside the binary voting test.
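One way to read the combined scheme is sketched below: inside each trace pair the keypoints vote, and the outcome of that inner vote becomes the pair's single vote in the outer binary voting test. This is an interpretation for illustration only; the function names, the per-point squared-difference test, and the three thresholds are assumptions of the sketch.

```python
def keypoint_votes(trace_1, trace_2, point_threshold):
    """Number of keypoints at which the two traces are judged equal."""
    return sum(
        1 for a, b in zip(trace_1, trace_2) if (a - b) ** 2 < point_threshold
    )


def two_dimensional_voting_test(
    traces_1, traces_2, point_threshold, keypoint_vote_threshold, pair_vote_threshold
):
    """Return (decision, pair_votes).

    Inner dimension: keypoints of one trace pair vote on that pair.
    Outer dimension: the N pairs vote on collision or noncollision.
    """
    pair_votes = 0
    for t1, t2 in zip(traces_1, traces_2):
        if keypoint_votes(t1, t2, point_threshold) >= keypoint_vote_threshold:
            pair_votes += 1
    decision = 1 if pair_votes >= pair_vote_threshold else 0
    return decision, pair_votes
```

The inner vote limits the damage of locally amplified noise within a trace, while the outer vote limits the damage of a single bad trace pair.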

Improved Correlation-Enhanced Collision Attack.
The correlation-enhanced collision attack [7] compares the similarity between two sets of traces corresponding to two operations. The most similar case yields the maximal correlation coefficient, from which the most likely key guess is obtained. With keypoints voting, each vote uses its own correlation coefficient, so multiple votes provide multiple references, as described in Figure 12. In contrast, the original correlation-enhanced attack only chooses the key corresponding to the maximal correlation coefficient over all the keypoints.
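The following Python sketch shows one possible reading of this combination, and is not taken from [7] or from Figure 12. It assumes that the traces of each operation have already been averaged per input value, that every keypoint computes its own Pearson correlation for each candidate key difference and votes for its best candidate, and that the number of input values is a power of two so that XOR stays in range. All names are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def vote_key_difference(means_1, means_2):
    """Each keypoint votes for the key difference with maximal correlation.

    means_k[a][j] is the averaged value of keypoint j of operation k for
    input value a. Returns (winning_difference, ballots).
    """
    n_values = len(means_1)
    n_points = len(means_1[0])
    ballots = Counter()
    for j in range(n_points):
        column_1 = [means_1[a][j] for a in range(n_values)]
        best = max(
            range(n_values),
            key=lambda d: pearson(
                column_1, [means_2[a ^ d][j] for a in range(n_values)]
            ),
        )
        ballots[best] += 1
    return ballots.most_common(1)[0][0], ballots
```

In this reading, a keypoint corrupted by local noise can mislead only its own ballot, instead of distorting one correlation coefficient computed over all keypoints.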

Efficiency Comparisons.
To compare the methods further, we ran simulations in MATLAB for the binary voting test and the correlation-enhanced collision attack, each with and without the keypoints voting test. First, we generated 50000 traces for each of two intermediate values $x_1$ and $x_2$. Each trace consisted of 30 keypoints, which followed a normal distribution whose mean is determined by the intermediate value $x_i$ and whose noise level is set by the parameter sigma. After repeating the attacks dozens of times, we obtained their success rates. We show the relation between the number of traces and the success rate for the binary voting test with/without the keypoints voting test in Figure 13 and for the correlation-enhanced collision attack with/without the keypoints voting test in Figure 14.

Improved Ternary Voting Test
In Bogdanov's ternary voting test, each reference trace acts as a judge who executes a decision algorithm by its own standard. However, this standard is itself noisy and therefore unreliable. Moreover, there are so many judges that the algorithm is inefficient. In this section, we address this problem.

Basic Idea.
Our improved attack first reduces all the reference traces to a small number of "standard" ones with very low noise. These are then used to estimate the collision of two traces. Take the collision between two S-boxes of AES as an example. Because of the 8-bit input, the number of reference traces should be set to 256. In the reference traces generation stage, 256 different plaintext bytes corresponding to the same S-box are input into the device. Each plaintext byte is encrypted $N'$ times, and the $N'$ traces are averaged. So, 256 reference traces are acquired, which are denoted by $T_i$, $i = 0, 1, \ldots, 255$. If $N'$ is big enough, the noise will be negligible.
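In the online stage, $N$ traces are acquired for each of operations 1 and 2. Let $\tau_1$ and $\tau_2$ denote the two averaged traces.
In the voting stage, for each reference trace $T_i$, binary comparison is carried out first:
$$\Psi_{\mathrm{ITV}}(\tau_1, \tau_2, T_i) = \Psi_{\mathrm{BC}}(\tau_1, T_i) \cdot \Psi_{\mathrm{BC}}(\tau_2, T_i).$$
When $i$ traverses from 0 to 255, the total vote is summed:
$$\Theta_{\mathrm{ITV}}(\tau_1, \tau_2) = \sum_{i=0}^{255} \Psi_{\mathrm{ITV}}(\tau_1, \tau_2, T_i).$$
Then, the collision is decided according to the following threshold. The whole process is described in Figure 15:
$$\Psi(\tau_1, \tau_2) = \begin{cases} 0 \ (\text{noncollision}) & \text{if } \Theta_{\mathrm{ITV}}(\tau_1, \tau_2) = 0, \\ 1 \ (\text{collision}) & \text{if } \Theta_{\mathrm{ITV}}(\tau_1, \tau_2) \ge 1. \end{cases}$$
Remark. $\Theta_{\mathrm{ITV}}(\tau_1, \tau_2) > 1$ may mean that the threshold of the least squares method is too loose. Sometimes, the noise of the reference traces may cause this problem. In that case, more reasonable parameters should be chosen.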
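The improved ternary voting test can be sketched as follows. This is a minimal illustration under assumptions: function names are hypothetical, templates are built by plain averaging of the profiling traces of each input value, the binary comparison is a squared-distance test with a caller-supplied threshold, and a collision is reported as soon as one template matches both averaged traces.

```python
def average_trace(traces):
    """Average a list of equally long traces point by point."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]


def build_templates(profiling_traces):
    """profiling_traces[v] holds the traces recorded for input value v."""
    return [average_trace(traces) for traces in profiling_traces]


def binary_comparison(trace_a, trace_b, threshold):
    """1 if the squared distance is below the threshold, else 0."""
    distance = sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))
    return 1 if distance < threshold else 0


def improved_ternary_voting_test(avg_1, avg_2, templates, threshold):
    """Return (decision, total_votes) using low-noise templates as judges."""
    total_votes = sum(
        binary_comparison(avg_1, t, threshold)
        * binary_comparison(avg_2, t, threshold)
        for t in templates
    )
    decision = 1 if total_votes >= 1 else 0
    return decision, total_votes
```

For an AES S-box, `profiling_traces` would hold 256 lists of traces; a total vote above 1 suggests that the comparison threshold is too loose, as noted in the Remark.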

Efficiency Comparison.
We discuss two efficiency comparisons for evaluating our new attacks in this section.

Comparing Improved Ternary Voting Test with Ternary Voting Test.
In the reference traces generation stage, both the ternary voting test with profiling and our improved test acquire $N_{\mathrm{TV}}$ traces. However, the old method does not average them, while the improved one averages every $N'$ traces (let $N_{\mathrm{TV}} = 256N'$). For the ternary voting test without profiling, the $160N$ reference traces come from the $N$ traces of the online stage (one complete AES trace includes 16 × 10 sections corresponding to the 16 S-boxes in 10 rounds). So, in its first stage, no reference traces are acquired.
In the online stage, all three methods perform the same operations. In the voting stage, the ternary voting test without/with profiling and our improved scheme carry out $160N$, $N_{\mathrm{TV}}$, and 256 judgments, respectively, one per referee.
Assume that the complexities of acquiring a trace, averaging $N$ traces, and a judgment are $C_{\mathrm{acquire}}$, $C_{\mathrm{average}}$, and $C_{\mathrm{vote}}$, respectively. Table 1 shows the complexity comparison of the three methods. The complexity of the old method without/with profiling is $(160N - 256)C_{\mathrm{vote}} - 256N'(C_{\mathrm{acquire}} + C_{\mathrm{average}})$ and $256(N' - 1)C_{\mathrm{vote}} - 256N'C_{\mathrm{average}}$ greater than that of the new one, respectively. In a high-performance oscilloscope, averaging is usually executed by hardware, so its complexity is negligible. Even if averaging is executed by a computer, $C_{\mathrm{vote}} \gg C_{\mathrm{average}}$ still holds. Moreover, usually $N \gg N'$. Therefore, our method is more efficient than the old ones.

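Comparing Improved Ternary Voting Test with Binary Comparison.
Let $\tau_1 = (\tau_{1,1}, \ldots, \tau_{1,l}) \in \mathbb{R}^l$ and $\tau_2 = (\tau_{2,1}, \ldots, \tau_{2,l}) \in \mathbb{R}^l$ denote the two averaged traces to be decided, in which every point $\tau_{i,j}$ can be expressed as $\tau_{i,j} = s_{i,j} + r_{i,j}$. Here $s_{i,j}$ is the ordinate value without noise, and $r_{i,j}$ is Gaussian noise with expectation 0 and variance $\sigma_N^2$. Furthermore, we assume that $s_{i,j}$ forms an arithmetic progression with common difference $\Delta$ when the input of the S-box traverses from 0 to 255. When a collision takes place, the squared Euclidean distance from binary comparison follows a noncentral chi-squared distribution [20]. If enough keypoints are chosen, it approximately follows a normal distribution (written with mean and variance):
$$D(\tau_1, \tau_2) = \sum_{j=1}^{l} (\tau_{1,j} - \tau_{2,j})^2 \sim N\left(2l\sigma_N^2,\ 8l\sigma_N^4\right).$$
According to the three-sigma rule [5], this distance lies within the range $\left(2l\sigma_N^2 - 6\sqrt{2l}\,\sigma_N^2,\ 2l\sigma_N^2 + 6\sqrt{2l}\,\sigma_N^2\right)$. In the improved ternary voting test, the distance between $\tau_1$ and the reference trace $T_1$ which is nearest to $\tau_1$ (and whose noise is negligible) follows the normal distribution
$$D(\tau_1, T_1) \sim N\left(l\sigma_N^2,\ 2l\sigma_N^4\right),$$
so it lies within the range $\left(l\sigma_N^2 - 3\sqrt{2l}\,\sigma_N^2,\ l\sigma_N^2 + 3\sqrt{2l}\,\sigma_N^2\right)$. Both methods employ the least squares method and their noise follows the same distribution, so the same threshold $Y$ should be chosen for collision detection. If we chose $2l\sigma_N^2 + 6\sqrt{2l}\,\sigma_N^2$ as the threshold, the collision detection criterion of binary comparison would be too loose, which is undesirable. According to the three-sigma rule, we suggest $Y = l\sigma_N^2 + 3\sqrt{2l}\,\sigma_N^2$ as the threshold, which decides the collision more accurately. Therefore, our new method is more efficient than binary comparison.
Furthermore, we should discuss false positives caused by a large threshold when no collision happens. Assuming two reference traces $T_1$ and $T_2$ are adjacent, the range of the distance between $\tau_1$ and $T_2$ is
$$\left(l(\Delta^2 + \sigma_N^2) - 3\sigma_N\sqrt{2l(\sigma_N^2 + 2\Delta^2)},\ l(\Delta^2 + \sigma_N^2) + 3\sigma_N\sqrt{2l(\sigma_N^2 + 2\Delta^2)}\right).$$
In order to avoid false positives, we must have
$$l(\Delta^2 + \sigma_N^2) - 3\sigma_N\sqrt{2l(\sigma_N^2 + 2\Delta^2)} > l\sigma_N^2 + 3\sqrt{2l}\,\sigma_N^2.$$
In our practical experiments, we chose $l = 18$. So it can be simplified further to
$$\Delta^2 > \sigma_N\sqrt{\sigma_N^2 + 2\Delta^2} + \sigma_N^2.$$
Assuming $k = \sigma_N/\Delta$, we have $0 < k < 0.5$. Therefore, when $N$ averages are executed such that the noise is reduced to $\sigma_N < \Delta/2$, a collision can be decided correctly with high probability.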
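The false-positive condition can be checked numerically. The sketch below rests on assumptions that should be read as such: the distance is the squared Euclidean distance over $l$ keypoints, the reference traces are noise-free, adjacent references differ by $\Delta$ at every keypoint, the normal approximation with the three-sigma rule is used on both sides, and the function names are hypothetical.

```python
from math import sqrt


def collision_threshold(sigma, n_points):
    """Three-sigma upper bound of the distance to the matching template."""
    return n_points * sigma ** 2 + 3 * sqrt(2 * n_points) * sigma ** 2


def adjacent_lower_bound(sigma, delta, n_points):
    """Three-sigma lower bound of the distance to the adjacent template."""
    mean = n_points * (delta ** 2 + sigma ** 2)
    std = sigma * sqrt(2 * n_points * (sigma ** 2 + 2 * delta ** 2))
    return mean - 3 * std


def avoids_false_positives(sigma, delta, n_points=18):
    """True if the adjacent template stays above the collision threshold."""
    return adjacent_lower_bound(sigma, delta, n_points) > collision_threshold(
        sigma, n_points
    )
```

With 18 keypoints, `avoids_false_positives(0.49, 1.0)` holds while `avoids_false_positives(0.51, 1.0)` does not, which matches a bound of one half on the ratio of noise standard deviation to $\Delta$. Under the usual assumption that averaging $N$ traces divides the noise standard deviation by $\sqrt{N}$, a single-trace standard deviation of about $5\Delta$ would reach $\Delta/2$ at roughly 100 traces.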
In our experiment, the standard deviation of the original traces is $\sigma_0 \approx 5\Delta$. For two inputs of the S-box and two groups of traces (each group included 200 traces), we executed the improved ternary voting test and binary comparison, respectively. Figure 16 shows the relation between the success ratio of collision detection and the number of averaged traces. In our attack, only 100 traces are enough to ensure that errors occur with negligible probability, which is about 1/4 of the requirement of the traditional method.
Remark. The improved ternary voting test can be combined with the keypoints voting test. Specifically, every trace in the ternary voting test can be divided into $m$ votes. Then a decision from a reference trace is replaced by $m$ votes, and the vote threshold is set correspondingly. The combined method is better suited to real environments and can overcome more problems, such as local noise and inefficiency.

Discussions of Countermeasures
The attacks presented in this paper defeat the countermeasure of generating Gaussian noise.However, we think there are some countermeasures against our attacks.
(i) Random delays: the traditional countermeasure of random delays tries to complicate data alignment.

Figure 1: Round function of AES algorithm and collision attack on it.

Figure 2: Flow chart of binary comparison proposed by Bogdanov.

Figure 3: Flow chart of binary voting test proposed by Bogdanov.

Figure 9: The relation between number of experiments and success ratio.

Figure 11: Flow chart of combination between keypoints voting test and binary voting test.

Figure 12: Flow chart of combination of keypoints voting test and correlation-enhanced collision attack.

Figure 14: The relation between number of traces and success rate for correlation-enhanced collision attack with/without keypoints voting test.

Figure 15: Flow chart of improved ternary voting test including only 256 votes.

Figure 16: The relation between the success ratio of collision detection and number of averaged traces.

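Table 1: Comparison between three ternary voting tests.

| Method | Reference traces generation | Online stage | Voting stage |
|---|---|---|---|
| Ternary voting test with profiling | $256N'C_{\mathrm{acquire}}$ | $N(C_{\mathrm{acquire}} + C_{\mathrm{average}})$ | $256N'C_{\mathrm{vote}}$ |
| This paper | $256N'C_{\mathrm{acquire}} + 256N'C_{\mathrm{average}}$ | $N(C_{\mathrm{acquire}} + C_{\mathrm{average}})$ | $256C_{\mathrm{vote}}$ |

Figure 13: The relation between number of traces and success rate for binary voting test with/without keypoints voting test.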