Bootstrapping of FHE over the Integers with Large Message Space

For the decryption of fully homomorphic encryption (FHE) over the integers with message space Z_Q, Nuida and Kurosawa proposed at Eurocrypt 2015 a circuit of multiplicative degree Q^4 λ, where λ is the security parameter and the message size Q is a constant. Since the degree of this decryption circuit is polynomial in Q, the range of the message size Q is limited. In this work, we solve this open problem as long as Q is large enough (larger than λ). We represent the decryption circuit as an arithmetic polynomial of multiplicative degree 108 ⋅ λ log^3 λ, which is independent of the message size Q except for the constraint Q > λ. Moreover, the bootstrapping process requires only O(λ ⋅ log λ) multiplications to implement the decryption circuit, which is significantly lower than the O(λ^4) of Nuida and Kurosawa's work. We also compare the efficiency of the FHE scheme with message space Z_Q with that of the FHE scheme with binary message space and find that the former is preferable.


Introduction
In 1978, Rivest, Adleman, and Dertouzos introduced the notion of fully homomorphic encryption (FHE), which can evaluate any circuit on encrypted data without decryption [1]. It addresses computation on ciphertext data and the privacy protection of cloud users in cloud computing environments. It was not until 2009 that Gentry proposed the first fully homomorphic encryption scheme, based on ideal lattices [2].
Gentry's Blueprint. First, Gentry constructed a somewhat homomorphic encryption (SHE) scheme, whose ciphertexts contain some noise for the security of the scheme. The noise, however, also limits the number of homomorphic operations, e.g., ciphertext multiplications. The second step is squashing the decryption circuit associated with an arbitrary ciphertext to obtain a polynomial of low enough degree in the ciphertext bits and the secret key bits, which can be homomorphically evaluated by the SHE scheme (such a scheme is called bootstrappable). The last step is Gentry's breakthrough, called bootstrapping, which refreshes a ciphertext by homomorphically evaluating this low-multiplicative-degree decryption circuit on encryptions of those bits, resulting in a new encryption of the same plaintext but with possibly reduced noise. The refreshed ciphertext can then support subsequent homomorphic operations. By repeatedly refreshing ciphertexts, the number of permissible homomorphic operations becomes unlimited. In this way, the bootstrappable SHE scheme is transformed into a pure FHE scheme.
Message Space. Practically, computation over bitwise encryptions is not efficient. It is important to construct FHE over larger integers for secure integer arithmetic (see [6,13]). Fortunately, it is quite straightforward to extend the message space of the SHE scheme from Z_2 to Z_Q [6,14]. However, this extended SHE scheme could not be converted to an FHE scheme via the bootstrapping procedure: because Q-ary addition seems to need more complex carry computations than binary addition, it seemed technically difficult to obtain a mod-Q arithmetic circuit that performs the decryption m ← (c mod p) mod Q.
At Eurocrypt 2015, Nuida and Kurosawa [8] proposed a Q-ary half adder, which yields the carry of the addition x + y for any x, y ∈ Z_Q. They determined a carry function built from the binomial coefficients (x choose i)_Q = x(x − 1)...(x − i + 1)/(i(i − 1)...1) mod Q. It has multiplicative degree Q. In the squashed decryption of [8], Q is a constant prime, the secret key s = (s_1, ..., s_Θ) ∈ {0, 1}^Θ is a vector of length Θ with Hamming weight θ = λ, and each z_i = (z_{i,0}.z_{i,−1}, ..., z_{i,−L})_Q < Q is a real number with L = ⌈log_Q θ⌉ + 2 digits of precision after the Q-ary point, satisfying ∑_{i=1}^Θ s_i z_i ≈ c/p. The decryption circuit is computed by a mod-Q arithmetic circuit of multiplicative degree Q^4 λ = O(λ), where Q is a constant.
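To make the degree-Q carry computation concrete, the following is a minimal sketch and not the construction of [8] itself. It assumes the carry function has the Vandermonde-style form ∑_{i=1}^{Q−1} (x choose i)_Q ⋅ (y choose Q−i)_Q mod Q, which is our assumption since the display is not reproduced in this text; the function names `binom_mod` and `carry` are hypothetical. Each summand has degree i + (Q − i) = Q in x and y, which matches the stated multiplicative degree.

```python
def binom_mod(x, i, q):
    """(x choose i) mod q for a prime q and 0 <= i < q, using modular inverses."""
    num, den = 1, 1
    for t in range(i):
        num = num * (x - t) % q      # x (x-1) ... (x-i+1)
        den = den * (t + 1) % q      # i (i-1) ... 1, nonzero mod q since i < q
    return num * pow(den, -1, q) % q  # modular inverse needs Python 3.8+


def carry(x, y, q):
    """Assumed carry polynomial: 1 if x + y >= q, else 0, for x, y in [0, q)."""
    return sum(binom_mod(x, i, q) * binom_mod(y, q - i, q)
               for i in range(1, q)) % q
```

For a small prime such as Q = 5, `carry(3, 4, 5)` evaluates the polynomial on two digits whose sum exceeds Q. The cost of this approach grows with Q, which is the limitation discussed in the text.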
In 2017, Cheon et al. [15] presented a faster bootstrapping of FHE over the integers than the previous work in [8]. The degree of the decryption is O(λ^{1+ε}), and the number of homomorphic multiplications is O(log^2 λ), where ε is some small constant (affected by the modulus Q).
However, the modulus Q still needs to be a constant.
For Q > 8Θ^2, Cheon and Kim [16] expressed the decryption circuit as an L-restricted depth-3 (∑∏∑) circuit by the technique in [17]. The L-degree is at most 8Θ^2 and the number of product gates is at most 8Θ^2 + Θ + 1. As we know, Θ is on the order of λ^6 log λ in [3] and is reduced to Õ(λ^3) in [4]. This decryption is too complex to bootstrap. So, in the FHE scheme, a ciphertext associated with a large prime message space needs a low-degree decryption circuit.
Efficiency. To homomorphically evaluate a mod-Q arithmetic circuit, one can use the FHE scheme with message space Z_Q directly, or one can first convert the arithmetic circuit to a Boolean one and carry out all the computation using an FHE scheme with binary message space. At ACNS 2016, Kim and Tibouchi [18] compared the two approaches for the Nuida-Kurosawa scheme, denoted by NK, and showed that the scheme NK with nonbinary message space is less efficient than its variant with binary message space. Fortunately, the bootstrapping method proposed by Cheon et al. [15] is worthwhile for Q of constant size, as shown by comparing both approaches for the CLT scheme. However, the modulus Q still needs to be a constant.
Therefore, it remains open, for large values of Q, to express the decryption circuit of FHE schemes of the form (5) as a low-degree polynomial.

Contributions.
In this paper, we solve this open problem as long as Q is large enough (larger than λ).
The usual technique for squashing the decryption circuit amounts to homomorphically evaluating a large sum of the form ∑_{i=1}^Θ s_i z_i, where the s_i are secret bits and the z_i are public constants computed from the original ciphertexts and public parameters. In [8], Nuida and Kurosawa represented the z_i's by their Q-ary expansion and applied a mod-Q circuit for iterated addition. They also proved that the degree Q of the polynomial computing the carry of the Q-ary half adder is the lowest possible. In order to obtain a decryption circuit of low enough degree (independent of Q), we cannot deal with the Q-ary carry any more. Instead, in this paper we use the binary representation z_i = (z_{i,0}.z_{i,−1}, ..., z_{i,−L})_2 of the real number z_i. This means that we have to use mod-Q arithmetic gates to emulate bit operations. Specifically, for bits a and b, the XOR operation is computed by a ⊕ b = a + b − 2ab mod Q, and the AND operation is computed by a ⊙ b = ab mod Q. So we can use a mod-Q arithmetic circuit to implement the decryption circuit. Emulating binary operations in this way is usually not efficient, since emulating binary addition needs a multiplication. The challenge is how to compute the decryption efficiently.
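The bit emulation just described can be checked directly. The sketch below is only an illustration on plaintext bits (in the scheme the same arithmetic would be applied to ciphertexts); the function names `xor_mod` and `and_mod` are ours.

```python
def xor_mod(a, b, q):
    """XOR of two bits a, b emulated over Z_q: a + b - 2ab mod q."""
    return (a + b - 2 * a * b) % q


def and_mod(a, b, q):
    """AND of two bits a, b emulated over Z_q: ab mod q."""
    return (a * b) % q
```

Note that the XOR costs one multiplication over Z_Q, whereas it is free in the binary setting; this is the source of the extra multiplicative degree discussed in the text.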
Note that using only the three-for-two trick, as mentioned in Section 2.2, the decryption can be implemented by a mod-Q arithmetic circuit of multiplicative degree O(λ^3), which is better than the result of [16]. Our main contribution is reducing the multiplicative degree to O(λ log^3 λ) for any large prime Q with the constraint Q > λ. Now let us recall the circuit computing ⌊∑_{i=1}^Θ s_i z_i⌉ mod 2 in the DGHV scheme [3].
(3) The third circuit computes ⌊a + b⌉ mod 2 by a polynomial of degree 4.
In this work, we use a mod-Q arithmetic circuit to simulate the bit operations in the above binary circuit. It is easy to simulate the second circuit by applying the three-for-two trick over Z_Q. This costs some additional multiplicative degree, since we need an arithmetic polynomial of degree 2 to compute the XOR operation. The third circuit is also easy to simulate with a polynomial of degree 4.
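As an illustration of the three-for-two trick over Z_Q, here is a minimal sketch on plaintext bits. The sum bit a ⊕ b ⊕ c and the carry bit (the majority of a, b, c) are written as integer polynomials; the particular polynomials and all function names are our choice, not taken from Section 2.2. They use the four products ab, bc, ca, and abc, consistent with the count of four multiplication gates given later in the text.

```python
def three_for_two_bits(a, b, c, q):
    """Sum and carry of three bits over Z_q, using four multiplications."""
    ab, bc, ca = a * b % q, b * c % q, c * a % q
    abc = ab * c % q
    s = (a + b + c - 2 * (ab + bc + ca) + 4 * abc) % q   # a xor b xor c
    carry = (ab + bc + ca - 2 * abc) % q                  # majority(a, b, c)
    return s, carry


def to_bits(n, width):
    """Little-endian bit list of n."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


def three_for_two(x, y, z, q):
    """Replace three bit vectors by two with the same sum (no carry propagation)."""
    n = max(len(x), len(y), len(z))
    x, y, z = (v + [0] * (n - len(v)) for v in (x, y, z))
    pairs = [three_for_two_bits(a, b, c, q) for a, b, c in zip(x, y, z)]
    u = [s for s, _ in pairs] + [0]          # sum bits stay in place
    v = [0] + [cy for _, cy in pairs]        # carry bits move one position up
    return u, v
```

Applying `three_for_two` repeatedly reduces a list of k numbers to about 2k/3 numbers per round, until only two remain.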
However, emulating the elementary symmetric polynomials in step (1) would take a polynomial of high degree (greater than Θ, where Θ is on the order of λ^6 log λ in [3] and is reduced to Õ(λ^3) in [4]). This cost is unacceptable. So we need a new arithmetic function to compute w_{−j,k}, the bits in the binary representation of W_{−j}.
Our main idea is as follows.
If we know the value of an integer, it is easy to obtain each bit of its binary representation; but if we only know a bound on its value, getting each bit can be a little tricky. We observe that this can be overcome by applying a Lagrange interpolating polynomial, as shown in Section 2.3. Since the Hamming weight of the secret key vector s = (s_1, ..., s_Θ) is θ, the Hamming weight W_{−j} of the vector (s_1 z_{1,−j}, ..., s_Θ z_{Θ,−j}) is not bigger than θ, namely, W_{−j} ≤ θ. So we can get W_{−j} just by adding up its entries with mod-Q addition gates, at the cost of the additional condition Q > θ. Then, for 0 ≤ k ≤ ⌈log θ⌉ − 1, we can obtain all bits w_{−j,k} by applying the Lagrange interpolating polynomial to W_{−j}.
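The idea can be sketched as follows, under the assumption that Q is a prime larger than θ so that the interpolation points 0, 1, ..., θ are distinct modulo Q. For each bit position k we interpolate the polynomial of degree at most θ over Z_Q that maps every w in {0, ..., θ} to the k-th bit of w. The helper names (`lagrange_coeffs`, `bit_polynomial`, `eval_poly`) are hypothetical, and the construction in Section 2.3 may differ in its details.

```python
def lagrange_coeffs(points, values, q):
    """Coefficients (low to high degree) of the interpolating polynomial over Z_q."""
    n = len(points)
    coeffs = [0] * n
    for j in range(n):
        basis, denom = [1], 1
        for m in range(n):
            if m == j:
                continue
            # Multiply the basis polynomial by (x - points[m]).
            new = [0] * (len(basis) + 1)
            for d, cf in enumerate(basis):
                new[d] = (new[d] - cf * points[m]) % q
                new[d + 1] = (new[d + 1] + cf) % q
            basis = new
            denom = denom * (points[j] - points[m]) % q
        scale = values[j] * pow(denom, -1, q) % q   # needs Python 3.8+
        for d, cf in enumerate(basis):
            coeffs[d] = (coeffs[d] + cf * scale) % q
    return coeffs


def bit_polynomial(k, theta, q):
    """Polynomial f_k over Z_q with f_k(w) = k-th bit of w for 0 <= w <= theta."""
    pts = list(range(theta + 1))
    return lagrange_coeffs(pts, [(w >> k) & 1 for w in pts], q)


def eval_poly(coeffs, x, q):
    """Horner evaluation modulo q."""
    acc = 0
    for cf in reversed(coeffs):
        acc = (acc * x + cf) % q
    return acc
```

The coefficients depend only on θ, k, and Q, so they can be precomputed; only the evaluation at the (encrypted) value W_{−j} has to be carried out homomorphically.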
Conclusion: we can now express the decryption as a mod-Q arithmetic polynomial with the constraint Q > λ. The simulation circuit computing step (1) has degree θ, the degree of the Lagrange interpolating polynomial. The simulation circuit computing step (2) has multiplicative degree at most 3^{⌈log_{3/2} L⌉+2}. Hence, the multiplicative degree of our decryption circuit is at most θ ⋅ 3^{⌈log_{3/2} L⌉+2} ⋅ 4 < 108 ⋅ λ log^3 λ, where we set L = ⌈log θ⌉ + 3 and θ = λ. Moreover, the number of multiplications required in our decryption is only O(λ log λ), compared with O(λ^4) in [8].
Efficiency. The arithmetic decryption circuit in the NK scheme is not competitive, as pointed out in [18], because the squashed decryption circuit for NK_Q has a depth polynomial in Q. Fortunately, the degree 108 ⋅ λ log^3 λ of our squashed decryption circuit is independent of Q, subject to the constraint Q > λ.
We use the leveled FHE scheme over the integers proposed by Coron, Lepoint, and Tibouchi, denoted by CLT_2, and extend its message space to Z_Q, denoted by CLT_Q. To assess the efficiency of CLT_Q with our bootstrapping procedure, we compare it with the scheme Conkert-CLT_2, which converts the mod-Q arithmetic circuit to a binary one and evaluates all operations using the scheme CLT_2 with binary message space. We compare in terms of the ciphertext size and the time complexity of the basic operations carried out during homomorphic evaluation.
The ciphertext size γ_Q of CLT_Q is a little shorter than that of Conkert-CLT_2; specifically, for Q ∈ [256, 2^81], we have 1 < log Q ⋅ γ_2/γ_Q < 1.4 when λ = 64. The ciphertexts for CLT_Q and Conkert-CLT_2 are thus of almost the same size. Moreover, we denote by T_Q the time complexity of a single ciphertext refresh operation in CLT_Q and by T_2 the time complexity of carrying out a multiplication mod Q in Conkert-CLT_2 (by homomorphically evaluating the Boolean circuit for modular multiplication, with a refresh operation after each AND gate). For instance, the former is faster than the latter by a factor of more than 930 when Q ≥ λ.
Then, we say that a pure FHE scheme with large message space with our bootstrapping procedure is preferable.

The Organization.
We summarize some notation and tricks in Section 2. In Section 3, we express the decryption circuit as a mod-Q arithmetic circuit of low enough multiplicative degree. In Section 4, we present an FHE scheme over the integers with bootstrapping for a large prime message space and show its efficiency compared to the FHE scheme with binary message space. Finally, the conclusion is given in Section 5.

Bootstrapping the Decryption
This section deals mainly with how to implement the decryption m ← (c mod p) mod Q with a mod-Q arithmetic circuit of low degree.
We set the bit length of ciphertext  is  <  − 4; thus, | ⋅ | < 1/16.And we observe that | ∑ Θ =1   Δ  | ≤ /16.Since  is a valid ciphertext, satisfying that the value of / is within 1/8 of an integer as the definition in [3]; thus, ∑ Θ =1     is within 1/4 of an integer.Therefore, we have For  ∈ [ where 2 − ∑ Θ =1   z   is within 1/4 of an integer.(Note that most of the context above in this subsection has been described by van Dijk et al. in [3], which is the procedure of squashing the decryption circuit for the case of  = 2.) 3.2.Bootstrapping.For the integer part, we need to compute   z   mod .We can firstly reduce z   with the modulo  and sum up for all , namely, It only takes Θ multiplication-by-constant gates and Θ mod- addition gates.
For the fractional part, in order to compute the rounded sum, we first construct a mod-Q circuit that outputs each bit in the binary representation of the sum.
(1) Since the Hamming weight of the vector sk is θ, each column sum W_{−j} = ∑_{i=1}^Θ s_i z_{i,−j} is not bigger than θ, i.e., W_{−j} ≤ θ. First, compute the sums W_{−j} directly with mod-Q addition gates; this works since Q > θ. Let W_{−j} = (w_{−j,n−1}, ..., w_{−j,0})_2. Then convert each small integer W_{−j} into its bit representation by applying the Lagrange interpolating polynomial introduced in Section 2.3; namely, for 0 ≤ k ≤ n − 1 and 1 ≤ j ≤ L, we have w_{−j,k} = f_k(W_{−j}), where the multiplicative degree is θ.
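The column sums of step (1) can be illustrated on plaintext values. The sketch below assumes each z_i is given by its L fractional bits only, stored as a list whose entry j − 1 holds the bit of weight 2^{−j}; this layout and the names `column_sums` and `recombine` are our assumptions. It shows that the sums W_{−j} computed modulo Q carry the same information as ∑ s_i z_i whenever Q exceeds the Hamming weight of s.

```python
from fractions import Fraction


def column_sums(s, z_bits, q):
    """W_{-j} = sum_i s_i * z_{i,-j} mod q, for j = 1..L (index j-1 in the result)."""
    L = len(z_bits[0])
    return [sum(si * row[j] for si, row in zip(s, z_bits)) % q for j in range(L)]


def recombine(W):
    """sum_j 2^{-j} * W_{-j} as an exact fraction."""
    return sum(Fraction(w, 2 ** (j + 1)) for j, w in enumerate(W))
```

For example, with s = [1, 0, 1, 1] (Hamming weight 3) and Q = 5, every column sum is at most 3 and therefore is not reduced by the modulus.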
(2) Now ∑_{i=1}^Θ s_i z_i = ∑_{j=1}^L 2^{−j} W_{−j}, which is a sum of L numbers of bit length 2L. We can compute it by repeatedly applying the three-for-two trick over Z_Q mentioned in Section 2.2, resulting in two numbers t_1 and t_2 satisfying t_1 + t_2 = ∑_{j=1}^L 2^{−j} W_{−j}. Since we need to apply this trick ⌈log_{3/2} L⌉ + 2 times, the bit length of t_1 and t_2 becomes 2L + ⌈log_{3/2} L⌉ + 2.
(3) To evaluate the rounding of t_1 + t_2, let {c_0, c_{−1}, ..., c_{−L+1}, c_{−L} = 0} be all the carry bits generated in the addition procedure. Using mod-Q gates to compute those bit operations gives a polynomial of degree 4. For the integer part, to add the integer parts of t_1 and t_2, we can compute them modulo Q with the stored numbers 2^i mod Q for i = 1, 2, ..., L + ⌈log_{3/2} L⌉ + 2, since an integer x = (x_n, ..., x_1, x_0)_2 satisfies x mod Q = ∑_{i=0}^n x_i ⋅ (2^i mod Q) mod Q.
We conclude that the degree of the polynomial in the first step is θ, the degree of the polynomial in the second step is at most 3^{⌈log_{3/2} L⌉+2} < 27 ⋅ log^{2.71} θ, and the degree of the polynomial in the third step is 4. Therefore, the total degree of the decryption circuit over Z_Q is bounded by θ ⋅ (27 ⋅ log^{2.71} θ) ⋅ 4 < 108 ⋅ θ log^3 θ. Since we set θ = λ for security, the degree is at most 108 ⋅ λ log^3 λ. So the multiplicative degree of the decryption circuit is O(λ log^3 λ) for any prime Q with the constraint Q > λ.
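The reduction of an integer given by its bits to its residue modulo Q, using stored powers of two, can be sketched as follows. This is an illustration on plaintext bits; the names `powers_of_two_mod` and `bits_to_residue` are hypothetical, and the table is indexed from 0 here for convenience.

```python
def powers_of_two_mod(n, q):
    """Stored constants 2^i mod q for i = 0..n-1."""
    return [pow(2, i, q) for i in range(n)]


def bits_to_residue(bits, table, q):
    """x mod q from the little-endian bits of x, using only
    multiplications by the stored constants and additions mod q."""
    return sum(b * t for b, t in zip(bits, table)) % q
```

Since the table entries are public constants, evaluating `bits_to_residue` on encrypted bits would need only multiplication-by-constant gates and mod-Q addition gates, so it adds nothing to the multiplicative degree.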
Remark 1. In [4], the authors set θ = λ/log λ (θ = 15 when λ = 72). It means that we can express the decryption circuit of the FHE scheme over the integers as a low-degree polynomial over Z_Q for any Q ≥ 16. The multiplicative degree of the decryption circuit in [8] is O(λ) for the case that Q is a constant prime, and 8Θ^2 in [16] for the case Q > 8Θ^2. If Q is bigger than 15, the degree of our decryption circuit is smaller than that of [8]. See Table 1.
Moreover, we reduce the number of multiplications in the decryption circuit, which is better than most previous works, as shown in Table 2. Here we emphasize that we do not count multiplication-by-constant gates in the decryption circuit toward the number of multiplications.
Proof. For the integer part, we use Θ multiplication-by-constant gates and Θ mod-Q addition gates.
For the fractional part, in step (1), we apply the Lagrange interpolating polynomial in one variable x, which is a polynomial of degree θ. For the variable x, we first compute x_1 = x, x_2 = x^2, ..., x_θ = x^θ, which requires θ − 1 multiplications. So a Lagrange interpolating polynomial consists of θ − 1 multiplications, θ − 1 multiplication-by-constant gates, and θ addition gates. These multiplications are needed for each of the L sums W_{−j} in step (1).
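The multiplication count for evaluating one degree-θ polynomial can be checked with a small sketch. It evaluates a polynomial over Z_Q by first computing the powers of x and then taking a linear combination with the constant coefficients, counting only the multiplications between variable quantities. The function name `eval_counting_mults` is hypothetical.

```python
def eval_counting_mults(coeffs, x, q):
    """Evaluate sum_d coeffs[d] * x^d mod q and count variable multiplications.

    Multiplications by the constant coefficients are not counted, following
    the convention stated in the text.
    """
    theta = len(coeffs) - 1
    powers = [1, x % q]
    mults = 0
    for _ in range(2, theta + 1):
        powers.append(powers[-1] * x % q)   # x^d = x^(d-1) * x
        mults += 1
    acc = coeffs[0] % q
    for d in range(1, theta + 1):
        acc = (acc + coeffs[d] * powers[d]) % q   # multiplication by a constant
    return acc, mults
```

Because the powers of x are shared, all bit polynomials applied to the same sum W_{−j} can reuse them, so the θ − 1 multiplications are paid once per sum rather than once per extracted bit.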
The 3-for-2 trick over Z_Q for the bits a, b, c needs 4 multiplication gates. Step (2) needs to sum L numbers of bit length 2L. The first application of the trick takes 4 ⋅ 2L ⋅ (L/3) multiplication gates, the second application sums about 2L/3 numbers of bit length 2L + 1 and takes 4 ⋅ (2L + 1) ⋅ ((2L/3)/3) multiplication gates, and so on. Summing over all rounds, step (2) takes O(L^2) multiplication gates.

Removing the Constraint 𝑄 > 𝜆.
The constraint Q > λ is required because we compute the bits of the Hamming weight by directly summing up, without regard to the carry bits generated by the addition. We observe that the optimization of the binary decryption circuit proposed in [4] does not count the Hamming weight, so we can remove the constraint. With the three-for-two trick over Z_Q, we can transform this binary decryption circuit into a mod-Q arithmetic circuit, at the price of a more complex decryption circuit.
More precisely, we can divide the secret key sk into θ boxes B_k of B = Θ/θ bits each, such that each box contains a single 1-bit. The k-th box sum ∑_{i ∈ B_k} s_i z_i is then obtained by adding B numbers, only one of which is nonzero. So each bit of a box sum is the sum of the corresponding bits of the s_i z_i, which requires only mod-Q addition gates. Then, applying the three-for-two trick over Z_Q to add up the θ box sums and using the rounding computation of step (3) in the last subsection, the decryption circuit is implemented by a polynomial of multiplicative degree 3^{⌈log_{3/2} θ⌉+2} ⋅ 4 < 108 θ^3, i.e., O(λ^3).
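The box structure can be illustrated on plaintext bits. The sketch below assumes the secret key is laid out as θ consecutive boxes of B = Θ/θ bits, each with exactly one 1-bit, and that each z_i is given as a list of bits; the layout and the name `box_sums` are our assumptions. Because only one summand per box is nonzero, adding the bits position by position never produces a carry, so no constraint on Q is needed.

```python
def box_sums(s, z_bits, theta, q):
    """Bitwise sum of s_i * z_i inside each of the theta boxes, modulo q."""
    B = len(s) // theta
    width = len(z_bits[0])
    sums = []
    for k in range(theta):
        box = range(k * B, (k + 1) * B)
        sums.append([sum(s[i] * z_bits[i][j] for i in box) % q
                     for j in range(width)])
    return sums
```

Each entry of the result is again a bit vector, so the θ box sums can be fed directly into the three-for-two trick.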
The purpose of this subsection is to emphasize that our work in the last subsection reduces the multiplicative degree of the decryption circuit from O(λ^3) to O(λ log^3 λ).

FHE Scheme CLT_Q with Our Bootstrapping Procedure
To show the usefulness of our squashed decryption circuit, we present a variant of the FHE scheme over the integers with our bootstrapping procedure and then compare it with the original scheme in the binary setting. By Gentry's bootstrapping theory, we can obtain a "pure" FHE scheme from either a somewhat homomorphic scheme or a leveled FHE scheme.
Here we only describe the latter, since in the former situation the FHE scheme in the mod-Q setting is not preferable to the binary setting.
We describe an FHE scheme over the integers with bootstrapping for a large prime message space, just as Cheon, Han, and Kim did in [15]; it is a variant of the FHE scheme presented by Coron, Lepoint, and Tibouchi in [7].
Let ρ be a bound on the bit length of the noise, η the bit length of the original secret key, and γ the bit length of the ciphertext. The parameter τ refers to the number of encryptions of zero contained in the public key for encryption, Θ the size of the secret vector, θ the Hamming weight of the vector, and κ the bit length of the rational numbers in the public key.
These parameters must satisfy the following constraints.
(i) ρ ≥ λ, to protect against brute-force attacks on the noise.
(ii)  ≥  + ((log  + log )), where  is the depth of multiplication of the circuits to be evaluated.
(iv) τ = γ + 2λ + 2, in order to use the leftover hash lemma in the security proof.

FHE Scheme with Message Space Z_Q
In this subsection, we describe an FHE scheme over the integers with message space Z_Q for a prime modulus Q, where Q can be any prime bigger than the security parameter λ. We denote this scheme by CLT_Q.
(2) Generate the public key for multiplication.Let k  be a vector of Θ numbers with  bit of precision following the binary point, denoted by with || < 2 − .Then, define where the components of q are randomly chosen from [0,  0 ) ∩ Z and those of r from (−2  , 2  ) ∩ Z.

Correctness.
In this subsection, we prove the correctness of the homomorphic procedure.
Proof.The decryption circuit   has been described in Section 3. We get the refreshed ciphertext   = [⌊∑ Θ =1   z  ⌉]  0 , where the noise of encryption of the secret key bit   is (, 0).We describe the noise increasing with homomorphic evaluations in the bootstrapping procedure.Here we only consider the evaluations for the fraction part in bootstrapping to approximately compute the noise.
In step (1) of Section 3.2, we get the encryption of the Hamming weight of the vector  (−) with noise ( + log Θ, 0).The degree of the Lagrange interpolating polynomial is , which can be implemented by a circuit of depth log .So we obtain the ciphertexts encrypted the bit of the Hamming weight  −, with the noise ( + log Θ + log (log Θ + log  + 8), log Θ).
Note that Θ is chosen so that the sparse subset sum problem is hard.We consider that Θ is unaffected by change of .We set  = , then Θ = ( 2 ) for any  satisfying all the above conditions.
The following theorem holds by the bootstrapping theorem proposed by Gentry in [2].
Theorem 5. Our scheme CLT_Q with the above parameter setting is a pure fully homomorphic encryption scheme.
4.3. Security. The FHE scheme over the integers is IND-CPA secure under the AGCD assumption. Our scheme just extends its message space from Z_2 to Z_Q and combines a squashing procedure before the bootstrapping. Thus, it is easy to see that the following theorem holds.
Theorem 6. Under the assumption that both AGCD and SSSP are hard, our scheme CLT_Q is IND-CPA secure.
4.4. Efficiency. As mentioned in [18], to evaluate a mod-Q arithmetic circuit with an FHE scheme over the integers, one could either use the FHE scheme with large message space directly or first convert the arithmetic circuit to a Boolean one and then evaluate the converted circuit using an FHE scheme with binary message space.
We denote by BAdd_Q and BMult_Q the Boolean circuits that perform addition and multiplication of two n-bit integers modulo Q as in [18], and we have the following numbers of AND gates for BAdd_Q and BMult_Q.
Proposition 7 (see [18]). For an n-bit prime Q, BAdd_Q uses 9n AND gates, and BMult_Q uses 17n^2 AND gates.
We denote Conkert-CLT 2 the FHE scheme obtained from CLT 2 with binary message space using the converting circuit with the BAdd  and BMult  .Note that, for the scheme CLT 2 , the decryption is implemented by a circuit of degree of 2 = 2 as presented in [4], in which the number of multiplication equals  2 =  2 .Just like [18], we compare Conkert-CLT 2 and CLT  in terms of the size of the ciphertexts and the time complexity of basic operations carried out during homomorphic evaluation.(40)

Comparing the
In Figure 1 As far as we know, the best time complexity for -bit multiplication is  log 2 (log * ) , where log *  represents the iterated logarithm [20].In our case,  is   and log   Then for  = poly(), even for  = exp(), the effect of log  is dominated by that of , so we can estimate  2 /  by  2 /  , since we can ignore effect of  in the part log   2 (log *   ) .
By Proposition 2, we have that the number of multiplications of the decryption for CLT  scheme is about  log , while for CLT 2 , it is about  2 .On the other hand, we need In Figure 2, we show the value of   2 /  as a function of log  for the case  = 64, which tells us that homomorphic multiplication for CLT  increase performance as  grows.
Remark 10.For  = 64, the value of   2 /  climbs up and then declines as log  grows as shown in Figure 1.If  = 64,  = 256, it is 0.8548 and 1.0007, respectively.The value of   2 /  becomes much large as log  grows and has an upper bound close to 25447 for  = 64 as shown in Figure 2. Table 3 shows the efficiency of CLT 2 measured against CLT  for some primes .

Conclusion
We propose an FHE scheme over the integers with message space Z_Q for any prime Q > λ. If we set θ = λ, the decryption circuit of this scheme is expressed as a polynomial of multiplicative degree 108 ⋅ λ log^3 λ = O(λ log^3 λ), which is independent of the modulus Q except for the constraint Q > λ. We also reduce the number of multiplications in the decryption circuit, which is better than most previous work.
To show that our squashed decryption circuit is worthwhile for large values of Q, we present a variant of the leveled FHE scheme CLT that supports arbitrary homomorphic operations in the message space Z_Q for Q > λ. By comparing the two schemes CLT_Q and Conkert-CLT_2, we have seen that they have almost the same ciphertext size, but the former is significantly preferable.

Figure 2: The rate of the time complexity for λ = 64.

Table 1: Multiplicative degree of the decryption circuit.

Table 2: The number of multiplications in the decryption circuit.
Proposition 2. The number of multiplications in our squashed decryption circuit is at most O(λ ⋅ log λ).
, we show that the value of   2 /  as a function of log  for the case  = 64.It tells us that the ciphertext size of CLT  is a little shorter than Conkert-CLT 2 .When 8 ≤ log  ≤ 81, namely, 256 ≤  < 2 81 , we have 1 <   2 /  < 1.4.Roughly speaking, we say that the ciphertexts for the scheme CLT  and Conkert-CLT 2 have almost the same size.4.4.2.Comparing the Speed of Homomorphic Operations.Now, we would like to compare the speed of homomorphic operations in Conkert-CLT 2 and CLT  .That speed is essentially determined by the cost of homomorphic multiplication modulo .For a given security parameter , and any prime , let   2 be the time complexity of carrying out a multiplication mod  in Conkert-CLT 2 , and   be the time complexity of a single ciphertext refresh operation in CLT  .Proof.For any prime , let   be the time complexity of ciphertexts multiplication.Then   is   multiplied by the number of multiplications in decryption circuit, and   2 is  2 multiplied by the number of AND gates in BMult  circuit.
17 log 2  AND gates for the Boolean circuit BMult  as in Proposition 7. Then we have