An Approximate Fast Privacy-Preserving Equality Test Protocol for Authentication in Internet of Things

1College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China 2Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin 541004, China 3Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China 4School of Computer Science, Guangzhou University, Guangzhou 510006, China


Introduction
In recent years, with growing privacy concerns, privacy-preserving computation [1][2][3][4] has received increasing attention, since privacy-preserving computation schemes support computation on private data while keeping the involved data private. Sensitive data collection and analysis over encrypted data have become the current trend [5][6][7][8][9][10][11][12]. In this context, the Privacy-preserving Equality Test (PET) aims at securely comparing two binary strings that are privately held by two parties. That is, with a PET scheme, two participants can securely determine whether their binary strings are exactly equal, while neither participant obtains any useful information about the other participant's private binary string, even if the two strings are the same. PET is an important building block of many privacy-preserving schemes, such as privacy-preserving authentication [13][14][15], secure comparison of biological characteristics [16][17][18], privacy-preserving machine learning [19][20][21], secure cost comparison in wireless networks [22,23], privacy-preserving threshold schemes in recommendation systems [3], attribute comparison in attribute-based encryption [24][25][26], and secure query in the cloud [27,28]. For example, Internet-of-Things (IoT) applications may authenticate users in a privacy-preserving manner. To complete the authentication, a user submits his/her authentication credential to the IoT system, and the system decides whether the user is legitimate by comparing the user's authentication credential with the authentication information stored in the system database. Because of privacy concerns, the user cannot reveal his/her authentication credential to the system, which can access it only in encrypted form. Meanwhile, to protect the privacy of the IoT system, no user can learn useful information about the database stored in the IoT system. This dilemma can be solved by employing a PET protocol.
Because of its wide applications, several works have been devoted to PET recently. Nateghizad et al.'s scheme [29], denoted as NEL16, is the state-of-the-art approach to PET and the most efficient PET method to date. NEL16 can be viewed as an improvement of Lipmaa and Toft's PET scheme [30], denoted as LT13. In LT13 [30], Lipmaa and Toft compute the Hamming distance of two private binary strings in encrypted form. Then, they generate a Lagrange interpolating polynomial that outputs 0 if the input equals 0 and outputs 1 otherwise. Finally, the comparison result is obtained in encrypted form by securely evaluating the Lagrange interpolating polynomial with the encrypted Hamming distance as input. Compared to LT13, NEL16 further computes, in encrypted form, the number of "1" bits in the binary representation of the Hamming distance and uses this number, instead of the Hamming distance, to evaluate the Lagrange interpolating polynomial. Suppose the binary representation of the Hamming distance has $k$ bits. The number of "1" bits is at most $k$, so it can be represented using just $\lceil \log_2 k \rceil$ bits. For $k \geq 2$, we always have $k > \lceil \log_2 k \rceil$. Thus, NEL16 requires a lower-degree Lagrange polynomial and can reduce the running time. However, NEL16 still cannot achieve practical running efficiency, since computing the number of "1" bits in encrypted form is also time-consuming. As shown in [29], when implemented on a Linux machine with a 64-bit microprocessor and 8 GB of RAM to compare two 256-bit binary strings, LT13 and NEL16 both take tens of seconds. Therefore, existing PET schemes still suffer from low efficiency.
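The polynomial idea described above can be sketched in plaintext as follows. This is only an illustration of the zero test and of why a popcount step lowers the degree; it involves no encryption, and the prime modulus, the function names, and the sample strings are assumptions made for the example rather than details taken from LT13 or NEL16.

```python
# Plaintext illustration only: real PET protocols evaluate this polynomial on ciphertexts.
P = 2**61 - 1  # a prime modulus chosen for illustration (assumption)

def lagrange_zero_test(max_input, z, p=P):
    """Evaluate at z the unique polynomial of degree <= max_input with
    f(0) = 0 and f(j) = 1 for j = 1..max_input, working modulo p."""
    total = 0
    for j in range(1, max_input + 1):        # only nodes whose value is 1 contribute
        num, den = 1, 1
        for m in range(0, max_input + 1):
            if m != j:
                num = num * (z - m) % p
                den = den * (j - m) % p
        total = (total + num * pow(den, -1, p)) % p  # modular inverse (Python 3.8+)
    return total

def hamming_distance(x_bits, y_bits):
    return sum(a != b for a, b in zip(x_bits, y_bits))

def popcount(v):
    return bin(v).count("1")

x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 0, 1, 1, 0]
d = hamming_distance(x, y)

# LT13-style: feed the Hamming distance itself (degree up to len(x)).
lt13_style = lagrange_zero_test(len(x), d)

# NEL16-style: feed the number of 1 bits of d (degree up to the bit length of d's range).
k = len(x).bit_length()
nel16_style = lagrange_zero_test(k, popcount(d))
```

In this sketch both variants return 0 exactly when the strings are equal, but the second one interpolates over only `k + 1` nodes instead of `len(x) + 1`, which is the degree reduction the paragraph attributes to NEL16.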
In this paper, we propose a new PET scheme, named Fast privacy-preserving equality Test Protocol (FTP), which achieves high efficiency at the cost of a small error rate. In FTP, we randomly convert the original binary strings into shorter ones, and the shorter binary strings are then securely compared to decide whether the original ones are the same. In this way, we can dramatically reduce both the computation cost and the communication overhead. Although FTP compares only the shorter strings, the comparison result is exactly correct if the original binary strings are the same or differ in an odd number of bits, and it has a low false-positive rate if they differ in an even number of bits. For data privacy, FTP achieves provable security, and no private information is disclosed throughout the protocol. In general, our main contributions in this paper can be summarized as follows:
(i) We propose a Fast privacy-preserving equality Test Protocol, named FTP, which achieves much higher running efficiency than the state-of-the-art PET schemes. FTP guarantees an exactly correct comparison result when the involved binary strings are the same or differ in an odd number of bits, and it has a low false-positive rate if the compared strings differ in an even number of bits.
(ii) We formally prove the security of FTP and can guarantee no privacy is disclosed throughout the proposed protocol.
(iii) We strictly analyze the accuracy loss of FTP and leverage extensive experiments to evaluate the running cost.The results indicate that FTP is highly accurate and can dramatically reduce running cost.
The rest of this paper is organized as follows. In Section 2, we describe the preliminaries and the system model. In Section 3, we present our approximate fast privacy-preserving equality test in detail and theoretically analyze its accuracy loss. In Section 4, we formally prove the security of our scheme, evaluate its running efficiency, and compare our scheme with previous ones. In Section 5, we briefly review the related work. Finally, we conclude this paper in Section 6.

System Model and Preliminaries
2.1. Paillier Encryption System. In [31], Paillier proposes a probabilistic public key encryption scheme with semantic security (Indistinguishability under Chosen-Plaintext Attack, IND-CPA). Its steps are concisely described as follows.
Encryption. Let $m_0$ be a number in the plaintext space $\mathbb{Z}_N$, where $N$ is the large integer in the public key $pk = (N, g)$. Select a random $r \in \mathbb{Z}_N^*$ as the secret parameter; then the ciphertext of $m_0$ is $c_0 = g^{m_0} r^{N} \bmod N^2$.
Decryption. Let $c_0 \in \mathbb{Z}_{N^2}$ be a ciphertext. The plaintext hidden in $c_0$ is
$$m_0 = \frac{L(c_0^{\lambda} \bmod N^2)}{L(g^{\lambda} \bmod N^2)} \bmod N,$$
where $L(u) = (u - 1)/N$ and $\lambda$ is the secret key. In the Paillier encryption system, we have
$$E_{pk}(m_1, r_1) \cdot E_{pk}(m_2, r_2) = E_{pk}(m_1 + m_2, r_1 r_2) \bmod N^2,$$
where $E_{pk}(m, r)$ denotes the encrypted result of $m$ using public key $pk$ and random secret parameter $r$. That is, the product of ciphertexts of $m_1$ and $m_2$ is a ciphertext of $m_1 + m_2$. Thus, the Paillier encryption scheme is additively homomorphic. Further, for any $a \in \mathbb{Z}_N$,
$$E_{pk}(m, r)^{a} = E_{pk}(a \cdot m, r^{a}) \bmod N^2,$$
i.e., the $a$-th power of $E_{pk}(m, r)$ is a ciphertext of $a \cdot m$.
The Paillier encryption system is an important secure building block of our scheme, which is used to encrypt private data and support the necessary computation. For simplicity, we use $\llbracket m \rrbracket$ to denote the ciphertext of $m$ encrypted by the Paillier cryptosystem when the random parameter $r$ does not need to be specified.
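The two homomorphic properties above can be checked with a toy implementation. The following is a minimal sketch under stated assumptions: it uses tiny hard-coded primes and the common choice $g = N + 1$, neither of which comes from this paper, and it is not secure or suitable for real use. The function names are hypothetical.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (illustration only, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1                                            # common choice of g (assumption)
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)       # inverse of L(g^lam mod n^2)
    return (n, g), (lam, mu)

def encrypt(pk, m, r=None):
    n, g = pk
    if r is None:
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:                       # r must lie in Z_n^*
            r = random.randrange(1, n)
    return pow(g, m, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n        # L(c^lam mod n^2) * mu mod n

pk, sk = keygen()
n2 = pk[0] ** 2
c1, c2 = encrypt(pk, 123), encrypt(pk, 456)

sum_plain = decrypt(pk, sk, c1 * c2 % n2)       # product of ciphertexts -> sum of plaintexts
scaled_plain = decrypt(pk, sk, pow(c1, 5, n2))  # ciphertext to the 5th power -> 5 * plaintext
```

Here `sum_plain` recovers 123 + 456 and `scaled_plain` recovers 5 · 123, matching the additive and scalar-multiplication properties stated above.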

System Model.
In this paper, we consider privacy-preserving user authentication in IoT. A user (named Bob) submits an $\ell$-bit authentication credential to the system (named Alice), and the system decides whether the user is legitimate by comparing Bob's authentication credential with the authentication information stored in the system database.
Because of privacy concerns, Bob cannot reveal the authentication credential or the authentication result to Alice, and Alice obtains them only in encrypted form. Meanwhile, to protect the privacy of Alice, Bob cannot learn any information about Alice's database. This dilemma can be seen as a privacy-preserving equality test (PET) problem as follows.
Privacy-Preserving Equality Test (PET) Problem. PET involves two parties: Alice and Bob. Alice privately holds an $\ell$-bit binary string $x = (x_1, x_2, \cdots, x_\ell)$ and Bob holds $y = (y_1, y_2, \cdots, y_\ell)$. Here, $x$ and $y$ can also be considered as two integers that belong to $[0, 2^{\ell} - 1]$. Besides, Bob has a key pair $(pk, sk)$ of the Paillier encryption system, where $pk$ is the public key and $sk$ is the secret key. They want to securely compare $x$ and $y$ such that only Alice obtains the comparison result in encrypted form; i.e., Alice obtains $\llbracket t \rrbracket$, in which the bit $t$ indicates whether $x = y$. Additionally, $x$ should be kept private to Alice throughout the protocol, and Bob's private string $y$ cannot be disclosed to Alice or anybody else. Neither Alice nor Bob can learn the real value of $t$.

Security Model.
In this paper, we assume that the participants Alice and Bob are semihonest. This means that each participant follows the protocol correctly but records all the information received during the protocol to infer as much as possible about the private data of the other participant. In [32], Goldreich gives a formal definition of security against semihonest adversaries, which we adopt in this paper.

Design Goal.
For the PET problem described in Section 2.2, we aim to propose a new solution that achieves the following security and performance goals.
(i) High Accuracy. The protocol should arrive at a correct output with high probability when both participants exactly follow the protocol steps. That is, the solution should output a correct comparison result with high accuracy.
(ii) Input Privacy. Throughout the protocol, each bit of the private inputs $x$ and $y$ should be known to its owner only. That is, no useful information about $x$ can be disclosed to Bob, and $y$ cannot be revealed to Alice.
(iii) Result Privacy. Neither user can get the value of the result $t$ in plaintext, and only Alice can obtain the encrypted output $\llbracket t \rrbracket$, which is encrypted under Bob's public key.
(iv) Efficiency. The protocol should employ a sublinear number of public key encryptions and decryptions such that it achieves high running efficiency even when $x$ and $y$ are hundreds of bits long.

Review of LT13 Scheme.
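In the following, we briefly introduce the previous PET scheme LT13 [30]. Generally, LT13 consists of two stages.
(1) Computing the encrypted Hamming distance $\llbracket d \rrbracket$ between $x$ and $y$ such that only Alice learns $\llbracket d \rrbracket$. During the first stage, Bob uses the public key $pk$ to encrypt his private bit $y_i$ for $i = 1$ to $\ell$ and sends each $\llbracket y_i \rrbracket$ to Alice. Then, based on (7) and the additively homomorphic property of the Paillier encryption scheme, Alice can obtain the encrypted Hamming distance
$$\llbracket d \rrbracket = \prod_{i=1}^{\ell} \llbracket x_i \oplus y_i \rrbracket = \prod_{i=1}^{\ell} \llbracket x_i \rrbracket \cdot \llbracket y_i \rrbracket^{1 - 2x_i} \bmod N^2 .$$
(2) Computing the final result $\llbracket t \rrbracket$, which is also known to Alice only. To this end, they first select a public degree-$\ell$ Lagrange interpolation polynomial $P(z) = \sum_{j=0}^{\ell} a_j z^j$ that maps $z = 1$ (i.e., $d = 0$) to one output value and every $z \in \{2, \ldots, \ell + 1\}$ to the other. Namely, the output can be obtained correctly by setting $t = P(d + 1)$, since $0 \leq d \leq \ell$. Second, Alice sets $\llbracket z \rrbracket = \llbracket d \rrbracket \cdot \llbracket 1 \rrbracket$, i.e., $z = d + 1$, and $\llbracket w \rrbracket = \llbracket z \rrbracket^{r'}$, where $r' = r^{-1} \bmod N$, $r$ is randomly selected from $\mathbb{Z}_N^*$, and $N$ is the large integer in the public key. After that, $\llbracket w \rrbracket$ is sent to Bob, who decrypts $w$, encrypts $w^j$, and returns the ciphertext $\llbracket w^j \rrbracket$ to Alice for $j = 2, 3, \cdots, \ell$. Finally, Alice removes the blinding and homomorphically combines the returned ciphertexts with the coefficients of the polynomial to obtain $\llbracket t \rrbracket$.
As can be seen, for a larger $\ell$, LT13 needs more computation and communication. When $\ell = 256$, LT13 takes tens of seconds [29], which is far from practical. In this paper, we introduce a new PET scheme that reduces the number of invocations of the Paillier encryption system and thus dramatically lessens the running cost at the expense of a small accuracy loss.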
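The first stage of LT13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a toy Paillier instance with tiny primes and $g = N + 1$, relies on the bit identity $x_i \oplus y_i = x_i + y_i - 2x_i y_i$, and uses hypothetical function names. Decryption at the end is done only to check the result; in the protocol Alice never decrypts.

```python
import math
import random

# Toy Paillier parameters (tiny primes, illustration only, NOT secure).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                               # valid because g = N + 1 (Python 3.8+)

def enc(m):
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def bob_encrypt_bits(y_bits):
    """Bob encrypts each of his bits and sends the ciphertexts to Alice."""
    return [enc(b) for b in y_bits]

def alice_encrypted_hamming(x_bits, enc_y):
    """Alice combines her plaintext bits with Bob's encrypted bits."""
    acc = enc(0)
    for x_i, c_i in zip(x_bits, enc_y):
        # x_i XOR y_i = x_i + y_i - 2*x_i*y_i  ->  [[x_i]] * [[y_i]]^(1 - 2*x_i)
        term = enc(x_i) * pow(c_i, 1 - 2 * x_i, N2) % N2
        acc = acc * term % N2
    return acc

x_bits = [1, 0, 1, 1, 0, 0, 1, 0]
y_bits = [1, 1, 0, 1, 0, 1, 1, 0]
enc_d = alice_encrypted_hamming(x_bits, bob_encrypt_bits(y_bits))
d = dec(enc_d)  # only for checking the sketch
```

Note that the loop performs one encryption and one ciphertext exponentiation per input bit, which is the linear cost that motivates the shorter comparison used by FTP.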
This completes the proof.
Observations 1, 2, and 3 show we can approximatively determine  =  by comparing d and 0. Besides, we have since  2  =   and  2  =   .Then, we can get an approximative scheme for securely comparing  and  with high efficiency as follows.
That is, the basic scheme essentially decides whether $x = y$ by checking whether d = 0. We will analyze the accuracy of the basic approach in Theorem 2.
Since the absolute values of the two shorter values are at most $\ell$, each of them can be represented using $\lceil \log_2 \ell \rceil + 1$ bits, of which $\lceil \log_2 \ell \rceil$ bits represent the absolute value and one bit denotes the sign. For $\ell \geq 4$, we always have $\lceil \log_2 \ell \rceil + 1 < \ell$. For example, when $\ell = 256$, we have $\lceil \log_2 \ell \rceil + 1 = 9$. Therefore, our basic scheme can dramatically reduce the running cost.

Analysis and Evaluation
4.1. Security. We prove the security of our proposed scheme FTP through the following Theorem 4.

Theorem 4. Our proposed scheme FTP discloses nothing useful about the privacy of input values and the final result.
Proof. We discuss the views of Alice and Bob, respectively.
To sum up, the privacy of Alice and Bob both can be preserved in our scheme FTP, which completes the proof.

Computation and Communication Cost.
In this section, we analyze the computation complexity and communication overhead of our proposed FTP in detail.
Computation Complexity. Since simple addition and multiplication are much cheaper than the encryption, decryption, and ciphertext multiplication of the Paillier cryptosystem, we ignore simple addition and multiplication in the protocol. Throughout FTP, Bob performs encryptions for $i = 1$ to $2(\lceil \log_2 \ell \rceil + 1)$ and decrypts once. Alice uses the received ciphertexts to compute her encrypted intermediate values, which requires $2(\lceil \log_2 \ell \rceil + 1)$ ciphertext multiplications. In total, both Bob and Alice invoke the Paillier encryption system only $O(\log \ell)$ times.
Communication Overheads. In our scheme FTP, Alice and Bob need to transmit two $\ell$-bit vectors and the ciphertexts for $i = 1$ to $2(\lceil \log_2 \ell \rceil + 1)$. If each ciphertext is $\kappa$ bits long, then the total communication overhead is $2\ell + 2(\lceil \log_2 \ell \rceil + 1)\kappa$ bits. When $\ell = 256$ and the public key of the Paillier encryption system is set to 2048 bits, the communication overhead is 37376 bits.
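The stated figure can be reproduced with a few lines of arithmetic. The sketch below assumes the overhead has the form $2\ell + 2(\lceil \log_2 \ell \rceil + 1)\kappa$, with $\ell$ the string length and $\kappa$ the per-ciphertext size counted as 2048 bits; this reading is an assumption that happens to match the 37376-bit total reported above. The function name is hypothetical.

```python
def ftp_costs(length_bits, ciphertext_bits):
    """Return (bits per shorter value, Paillier-operation count, total bits sent)
    under the assumed cost formula 2*l + 2*(ceil(log2 l) + 1)*kappa."""
    ceil_log2 = (length_bits - 1).bit_length()   # exact ceil(log2(l)) for l >= 1
    short_bits = ceil_log2 + 1                   # magnitude bits plus one sign bit
    paillier_ops = 2 * short_bits                # 2 * (ceil(log2 l) + 1)
    total_bits = 2 * length_bits + paillier_ops * ciphertext_bits
    return short_bits, paillier_ops, total_bits

short_bits, paillier_ops, total_bits = ftp_costs(256, 2048)
```

For 256-bit strings this gives 9 bits per shorter value and 18 ciphertexts, compared with the 256 ciphertexts that a bitwise comparison of the original strings would need.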

Experiment Results.
We implement our scheme and two existing efficient algorithms, LT13 and NEL16, in C. In our implementation, we use the GMP library [33] and the Paillier library [34] with a key size of 2048 bits. All experiments are performed on an Apple computer with macOS Sierra 10.12.6, an Intel Core i5 1.6 GHz CPU, and 4 GB of memory. Alice and Bob communicate through a socket, where the ping time is about 0.81 seconds.
Figure 2 shows the runtime of LT13, NEL16, and our scheme FTP when the compared strings are 16 to 256 bits long. As can be seen, FTP dramatically reduces the running time compared to LT13 and NEL16. When the length $\ell$ is 256, LT13 costs about 25 seconds, NEL16 takes around 6 seconds, and FTP needs just 0.6 seconds. As the length grows, the advantage of FTP becomes more salient. The main reason is that we transform the original $x$ and $y$ into values that are much shorter than the original ones. More importantly, our transformation involves only simple addition and multiplication and can be completed rapidly. In FTP, the Paillier encryption system is employed only to securely compare the shorter values. Therefore, FTP can reduce the running cost, especially when $\ell$ is large. If the bit length is smaller than 16, FTP has no significant advantage in running time, and LT13 or NEL16 is suitable for the short-string equality comparison scenario.
4.4. Improvement. Although our scheme FTP reduces the cost, it still takes $\ell$ bits to transmit each of the vectors $u$ and $v$. We can further improve the scheme to avoid transmitting $u$ or $v$. Let $H(\cdot) : \{0, 1\}^* \to \{0, 1\}^{\ell}$ be a pseudorandom function. Alice and Bob select a constant $c$ in advance. When they decide to compare their private binary vectors, they separately generate the random binary string $H(c \,\|\, \tau)$, where $\tau$ denotes the time at which they decide to run the protocol and $\|$ denotes concatenation. Then, they set $u_i = h_i$, in which $h_i$ denotes the $i$-th bit of $H(c \,\|\, \tau)$. Since $u_i = (s_i(2x_i - 1) + 1)/2$ and $(2x_i - 1)^2 = 1$, Alice can locally obtain $s_i = (2u_i - 1)(2x_i - 1)$. Thus, Alice and Bob can compute their shorter values, respectively. In this way, Alice does not need to send the vector $u$. For $v$, they can preestablish another constant $c'$ and use it to avoid transmitting $v$ in a similar way.
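The improvement can be sketched as follows: both parties expand a pre-shared constant and a timestamp into the same bit string, and Alice then derives her per-position values locally instead of sending a vector. This is a minimal illustration under stated assumptions. SHAKE-256 is used here only as a stand-in for the pseudorandom function; the symbol names (`u_bits` for the derived bits, `signs` for Alice's local values) and the sample constant and timestamp are hypothetical and not taken from the paper.

```python
import hashlib

def shared_bits(constant: bytes, timestamp: bytes, length: int):
    """Expand constant || timestamp into `length` bits (SHAKE-256 as a stand-in for H)."""
    digest = hashlib.shake_256(constant + timestamp).digest((length + 7) // 8)
    bits = []
    for byte in digest:
        for k in range(8):
            bits.append((byte >> (7 - k)) & 1)   # most significant bit first
    return bits[:length]

def local_signs(u_bits, x_bits):
    """Alice's local values s_i = (2*u_i - 1) * (2*x_i - 1), each in {-1, +1}."""
    return [(2 * u - 1) * (2 * x - 1) for u, x in zip(u_bits, x_bits)]

constant = b"pre-shared-constant"      # hypothetical value agreed in advance
timestamp = b"2018-01-01T00:00:00Z"    # hypothetical protocol start time

alice_u = shared_bits(constant, timestamp, 256)
bob_u = shared_bits(constant, timestamp, 256)   # Bob derives the same bits independently

x_bits = [i % 2 for i in range(256)]            # sample private input for Alice
signs = local_signs(alice_u, x_bits)

# Consistency with u_i = (s_i * (2*x_i - 1) + 1) / 2:
recovered_u = [(s * (2 * x - 1) + 1) // 2 for s, x in zip(signs, x_bits)]
```

Because both sides compute the expansion from values they already share, no $\ell$-bit vector has to cross the network; using a fresh timestamp per run keeps the derived bits from repeating across executions.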

Related Work
Privacy-preserving string equality test is one of the secure multiparty computation (SMC) problems, and it has wide applications in various privacy-preserving scenarios [35][36][37][38]. To date, a large number of works can be used to achieve privacy-preserving string equality test. We briefly discuss the previous schemes as follows.
In 1982, Yao [39] proposed the first SMC problem, the Millionaires' problem, and gave a secure solution. After that, the garbled circuits method [32,40] was put forward to securely evaluate a general function. Nevertheless, the general approach is too expensive and can solve the problem only in theory. The scalar product protocol (also known as the dot product protocol) focuses on computing the scalar product of two private vectors in a privacy-preserving manner. Privacy-preserving string equality test can be achieved by invoking a scalar product protocol. We thus review the main solutions for the scalar product protocol. In [41], Vaidya et al. proposed a scalar product protocol based on algebraic transformation. By using homomorphic encryption, two solutions for securely computing the dot product of private vectors are given in [42] and [43], respectively. A polynomial secret sharing-based scalar product protocol is presented by Shaneck and Kim [44]. Nevertheless, these schemes either are not provably secure or have heavy computation and communication overheads. Recently, Zhu et al. proposed two efficient solutions for the secure scalar product protocol [45,46], which can be used to securely compute the Hamming distance of two private strings but cannot support the distance comparison. Cheng et al. [47] review the approaches to securing the Internet of Things in a quantum world. In [48], Li et al. leverage Paillier encryption to achieve a secure comparison protocol, based on which they also propose a secure SVM classification scheme. Nevertheless, the comparison scheme in [48] focuses on securely determining the larger of two private integers and cannot directly support the equality comparison problem investigated in this paper.
In [30], Lipmaa and Toft propose a secure string equality test scheme based on the Paillier encryption scheme [31]. When comparing $\ell$-bit strings, Lipmaa and Toft's scheme requires $O(\ell)$ Paillier encryptions and thus is time-consuming. Nateghizad et al. [29] improve Lipmaa and Toft's scheme by reducing the degree of the Lagrange interpolation polynomial. However, the number of Paillier encryption invocations in Nateghizad et al.'s solution is still linear in $\ell$, which is not suitable for a large $\ell$ either. In general, the existing privacy-preserving string equality test schemes are still far from practical.

Conclusions
In this paper, we considered efficient and privacy-preserving authentication in IoT applications.To this end, we proposed a new privacy-preserving equality test protocol, which can securely complete string equality test and achieve high running efficiency at the cost of little accuracy loss.We strictly analyzed the accuracy of our proposed scheme and formally proved our security.Additionally, we leveraged extensive simulation experiments to evaluate the running cost, which confirms our high efficiency.

Figure 1: The error probability of FTP with doubled vectors when $x$ and $y$ are unequal.