Multiparty Homomorphic Machine Learning with Data Security and Model Preservation

With the widespread application of machine learning (ML), data security has become a serious issue. To eliminate the conflict between data privacy and computability, homomorphism is extensively researched owing to its capacity for performing operations over ciphertexts. Considering that the data provided by a single party are not always adequate to derive a competent model via machine learning, we propose a privacy-preserving training method for neural networks over multiple data providers. Moreover, taking the trainer's intellectual property into account, our scheme also achieves the goal of model parameter protection. Thanks to the hardness of the conjugate search problem (CSP) and the discrete logarithm problem (DLP), the confidentiality of the training data and the system model can be reduced to well-studied security assumptions. In terms of efficiency, since all messages are coded as low-dimensional matrices, the expansion rates of storage and computation overheads are linear compared with a plaintext implementation, without accuracy loss. In practice, our method can be transplanted to any machine learning system involving multiple parties owing to its capacity for fully homomorphic computation.


Introduction
With the continuous development of artificial intelligence, data have become a precious resource due to their value for mining. Nevertheless, a great deal of private information is embedded in data, which may be abused to violate personal privacy, business secrets, or even state secrets. For example, once a patient's medical record is exposed to insurance companies, they may refuse to sell him certain kinds of medical insurance [1]. Similarly, many other machine learning applications have also seen privacy infringements, such as financial analysis, product customization, and public opinion surveillance [2][3][4]. On the other hand, any data-driven mechanism relies heavily on the quantity and quality of information, which brings about the conflict between data usability and data confidentiality. Fortunately, secure multiparty computation (SMC) [5][6][7] and homomorphic encryption (HE) [8,9] provide us with powerful tools to process data in a concealed manner. Therefore, the remaining problem is how to devise a cryptosystem that is applicable to machine learning in consideration of storage and computation overheads.
As a cryptographic technology oriented at decentralized systems, secure multiparty computation aims at data confidentiality for distributed participants. Despite the privacy concern, the involved parties can still jointly compute a public output as they wish. Based on such a cryptosystem, F. Ö. Çatak et al. [10] proposed a privacy-preserving learning protocol for classification over vertically partitioned data from multiple parties. Since the data are only partially shared without concealment, semantic security is unachievable, just as with plain data. The first provably secure ML protocol of this kind was presented by R. Devin et al. [11] for text classification. However, their research only focused on the privacy of data classification and left the learning process unaddressed.
Oriented at centralized systems, homomorphic encryption is another path towards secure machine learning, being capable of performing specific operations over ciphertexts. Research on applying HE to data privacy during machine learning has developed rapidly since a significant innovation [12] appeared in 2016. Y. Aono et al. [13] combined additive homomorphism with deep learning to narrow the gap between system functionality and data security, by applying HE technology to the asynchronous stochastic gradient descent algorithm. F. Bourse et al. [14] improved the FHE structure of Chillotti et al. [15] and proposed a homomorphic neural network evaluation framework, namely, FHE-DiNN. Its complexity is strictly linear in network depth, but the model parameters must be proactively predefined. Based on a multikey variant of two HE schemes [16,17] with packed ciphertexts, H. Chen et al. [18] provided a suite of interfaces for secure machine learning which also exploited bootstrapping for arbitrary circuit evaluation. As a matter of fact, almost all existing FHE-based machine learning algorithms rest on the algebraic structure of lattices, such as BGV [19][20][21], CKKS [22][23][24], and NTRU [25][26][27]. These methods suffer from a common defect: decryption may fail due to noise growth. Though bootstrapping can be deemed an effective tool for noise control, its extra computational burden is hardly acceptable. Surprisingly, J. Li et al. [28] discovered an alternative tool, namely the conjugate search problem, to actualize full homomorphism without noise interference. They also applied this cryptosystem to privacy-preserving data training, achieving the same accuracy as learning over plaintexts. Though more comprehensible and efficient than lattice-based secure machine learning, Li's scheme can only be applied to the scenario of a single data provider. Ordinarily, one party can provide only a small quantity of data, which may incur an overfitted model.
To ensure the generalization of machine learning, data from diverse sources should be gathered for a specific learning task. In the circumstances of multiparty secure machine learning, each data provider may conceal their information with an independent key. Therefore, a training framework that operates over heterogeneous (i.e., encrypted by different keys) ciphertexts is desired. Meanwhile, the parameters of the system model should be taken as assets held by the trainer, as in general business operation. Thus, we should also make sure that the model is concealed, even when not thoroughly trained.
To preserve the privacy of all participants, this paper presents a complete machine learning mechanism in virtue of CSP and DLP hardness. Our contributions are summarized as follows.

Contributions
(1) We coded float-type data as low-dimensional upper triangular matrices that are homomorphic under the operations of addition, subtraction, multiplication, division, and comparison. With the help of CSP, the plain matrices can also be projected to semantically secure ciphertexts that remain homomorphic under the same operations. That is to say, our basic cryptosystem is fully homomorphic, since addition and multiplication are simultaneously supported. Therefore, we can realize secure training and classification/regression once private data are provided under the same key.
(2) We constructed a cyclic group by lifting the plain matrices to a Galois field. Thereafter, key switching (converting a ciphertext encrypted under one key into one decryptable under another) is made possible via DLP for the purpose of cooperative training.
(3) We combined the two aforementioned technologies and devised a secure machine learning protocol under the semi-honest model, which preserves the privacy of multiple data providers as well as that of the trainer.

System Model
Neural network (NN) is employed as the engineering background and verification model in this paper due to its extensive application. Nevertheless, it is worth mentioning that our scheme can be applied to most machine learning algorithms if privacy is significant to multiple participants.

Neural Network Model.
A typical neural network contains three or more layers and turns into a deep learning model if the hidden layers are multiple [29]. The core principle of an NN lies in the fact that numerous neurons can automatically extract features of the inputs layer by layer.
Besides the topology of an NN, the most important factors that define it are the weights and biases designated to each link and neuron. As for learning, the essence is how to adjust these parameters using training data via iterative forward-/back-propagation. Therefore, to securely implement a neural network model, we should homomorphically evaluate the following functions.
Forward calculation (e.g., sigmoid):

o_i = sigmoid(w_i · a_i + b_i), with sigmoid(x) = 1/(1 + e^(−x)),

where a_i and w_i are the input and weight vectors corresponding to the incoming links of neuron i, while b_i represents its bias. Backward calculation: propagate the output error of each layer back through the transposed weight matrices. Loss function (e.g., quadratic loss function):

E = (1/2) Σ_n (t_n − o_n)^2,

where t_n is the target value and o_n is the actual value. Parameter adjusting (e.g., gradient descent for a sigmoid layer):

Δw_jk = η · E_k ∘ O_k(1 − O_k) · O_j^T,

Mathematical Problems in Engineering
where η is the learning rate, E_k is the error vector between the target value and the actual value, O_j^T is the transpose of the output of the current layer, and O_k is the output of the next layer.
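The forward pass and quadratic loss above can be sketched in a few lines of Python (the helper names are ours, purely illustrative):

```python
import math

def sigmoid(x):
    # Logistic activation: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def neuron_forward(a, w, b):
    # o_i = sigmoid(w_i . a_i + b_i)
    return sigmoid(sum(wi * ai for wi, ai in zip(w, a)) + b)

def quadratic_loss(targets, outputs):
    # E = 1/2 * sum_n (t_n - o_n)^2
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))

# A single neuron with two inputs
o = neuron_forward(a=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
loss = quadratic_loss([1.0], [o])
```

In the secure setting, every arithmetic operation in these functions must be replaced by its homomorphic counterpart, which is the subject of the next sections.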

System Model and Security Goal
In our system, a powerful trainer expects to acquire a neural network whose topology is predefined. To ensure the competence of the resultant model, the trainer may request training data from multiple parties. However, the data providers are concerned about privacy leakage despite their strong will to cooperate. Meanwhile, the trainer also worries that the system parameters may be exposed, infringing their intellectual property. Therefore, we should preserve the privacy of all participants and guarantee the functionality of machine learning at the same time. Moreover, taking the trained neural network as a service, a user may not only desire to delegate a classification/regression task to the server but also be anxious about data abuse.

Adversary Model.
Suppose that the trainer and all data providers are honest but curious during the whole process.
That is to say, they will completely follow the protocol to avoid unnecessary disputes but may be interested in the private information contained within the data. Furthermore, it is reasonable to assume that both the trainer and the data owners possess PPT (probabilistic polynomial time) computational power. However, since the trainer is always better equipped than the data providers, the hypothesis that the trainer has access to a quantum machine may also be valid. To define the success of a privacy violation, we exploit the concept of symmetric IND-CPA (indistinguishability under chosen-plaintext attack), as defined below [30].

Symmetric IND-CPA
The experiment Exp_{HE,A}^{CPA}(κ) proceeds as follows: the challenger generates k ← HE.KeyGen(1^κ) and a random bit b ∈ {0, 1}; the adversary A, after querying the oracle HE.Enc_k(·) a polynomial number of times, submits two messages (m_0, m_1) and receives the challenge ciphertext C* ← HE.Enc_k(m_b); finally, A outputs a guess b′, and the experiment outputs 1 if b′ = b. Thus, the adversary's advantage can be expressed by Adv_{HE,A}^{CPA}(κ) = |Pr[Exp_{HE,A}^{CPA}(κ) = 1] − 1/2|. Then the cryptosystem HE is IND-CPA-secure if Adv_{HE,A}^{CPA}(κ) < ϵ(κ), where ϵ(κ) stands for a negligible function in the security parameter κ.

Cryptographic Construction
Focusing on the security goals presented in the system model, we are now ready to construct our cryptographic building blocks. In this part, we first explore the homomorphism of the conjugate search problem to underpin the functionality of training over homogeneous (i.e., encrypted by the same key) ciphertexts. Then, we present a key switching technology that can convert a ciphertext encrypted under one key into one decryptable under another.
The conjugate search problem is a special form of the group factorization problem (GFP), defined as follows [31].

Conjugate Search Problem (CSP)
Given a group G and two elements x, y ∈ G satisfying y = z^(−1)xz for some unknown z ∈ G, the CSP is to find such a conjugator z. B. Evgeni [32] proved that the CSP is post-quantum secure over the general linear group GL_d(R) (R denotes the real number field) if d ≥ 4. Hence, to assure system security, we should code the message as a matrix with degree larger than 4.
To protect the privacy of data providers without affecting the accuracy of training, we resort to homomorphic encryption, which is capable of actualizing the forward-/backward-propagation processes covertly. Thereafter, we devise a way that makes CSP-based encryption semantically secure and homomorphic. It is worth noting that the conjugate search problem is resistant to quantum attacks, which dispels the privacy concern of data providers even if the trainer is extremely well equipped.
A typical homomorphic encryption algorithm can be noted as a tetrad HE = (HE.KeyGen, HE.Enc, HE.Dec, HE.Eval), standing for the functions of key generation, encryption, decryption, and evaluation, respectively.
For any datum m over the message space R, we first code it as an upper triangular matrix M ∈ R^(6×6) as follows.

3.2. Encoding.
Convert the message m into three pairs of random numbers (a_1, a_2), (a_3, a_4), and (a_5, a_6), satisfying a_1 + a_2 = m, a_3 + a_4 = a_5 + a_6 = r, and (a_3^2 − a_4^2)(a_5^2 − a_6^2) = 1, where r is a constant random number of the system. Thus, we can construct the 2×2 matrices M_i = [a_(2i−1), a_(2i); a_(2i), a_(2i−1)] for i = 1, 2, 3. Combining the above matrices, the message m is finally coded as

M = [M_1, R_1, R_2; 0, M_2, R_3; 0, 0, M_3] ∈ R^(6×6),

where 0 represents the 2×2 all-zero matrix and R_i (i = 1, 2, 3) stands for random matrices uniformly sampled from R^(2×2). For clarity, we denote the space of coded messages as Γ. It is interesting that Γ naturally constitutes a multiplicative cyclic group (excluding the elements whose determinants are zero) and R ∼ Γ (homomorphic). Furthermore, it is well known that all square matrices of the same dimension compose a ring.
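The constraint system is easy to satisfy in practice. The sketch below (our own illustrative construction, using exact rational arithmetic) picks the pairs so that a_1 + a_2 = m, a_3 + a_4 = a_5 + a_6 = r, and (a_3^2 − a_4^2)(a_5^2 − a_6^2) = 1, and checks that, assuming each pair fills a symmetric 2×2 block [[a, b], [b, a]], the first-row sum of a block product recovers the product of the encoded values:

```python
from fractions import Fraction as F
import random

def encode_pairs(m, r=F(3)):
    # Split m into (a1, a2) with a1 + a2 = m, and build (a3, a4), (a5, a6)
    # with a3 + a4 = a5 + a6 = r and (a3^2 - a4^2)(a5^2 - a6^2) = 1.
    a1 = F(random.randint(-50, 50)); a2 = F(m) - a1
    a3 = F(random.randint(1, 50)); a4 = r - a3
    d = a3 - a4                      # a3^2 - a4^2 = d * r
    assert d != 0 and r != 0
    e = 1 / (d * r * r)              # a5 - a6, so that (d*r) * (e*r) = 1
    a5 = (r + e) / 2; a6 = r - a5
    return (a1, a2), (a3, a4), (a5, a6)

def block(a, b):
    # Symmetric 2x2 block [[a, b], [b, a]]; such blocks commute pairwise.
    return [[a, b], [b, a]]

def block_mul(X, Y):
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

(p1, p2, p3) = encode_pairs(F(7))
# Constraint checks
assert p1[0] + p1[1] == 7
assert (p2[0]**2 - p2[1]**2) * (p3[0]**2 - p3[1]**2) == 1
# The row sum of a block product multiplies the encoded values: 7 * 5 = 35
(q1, _, _) = encode_pairs(F(5))
P = block_mul(block(*p1), block(*q1))
assert P[0][0] + P[0][1] == 35
```

The closed-form choice of a_5 via e = 1/(d·r^2) is one possible way to satisfy the determinant constraint; any sampling that meets the three equations works equally well.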
Though Γ ⊆ R^(6×6) and its elements are mutually commutative under multiplication, there is an overwhelming probability that a matrix P uniformly sampled from R^(6×6) is noncommutative with a coded message M. Thereupon, a CSP-based fully homomorphic encryption algorithm can be actualized as below.
3.3. Key Generation. HE.KeyGen(1^κ): uniformly sample a matrix P from R^(6×6), which can also be represented as a combination of nine 2×2 random blocks. The probability that P commutes with elements of Γ is negligible. Then, the algorithm takes k = P as the symmetric key.

3.4. Encryption. HE.Enc(P, M): output C = PMP^(−1) as the ciphertext of message m (coded as a matrix M).
3.5. Decryption. HE.Dec(P, C): compute M = P^(−1)CP; then, figure out m = a_1 + a_2 to recover the plaintext.
3.6. Evaluation. HE.Eval(f, C_1, . . . , C_l): we describe the very basic operations underpinning formulae (2)–(4) in advance. Suppose that C_1 and C_2 are ciphertexts corresponding to m_1 and m_2 under the same key; the additive and multiplicative arithmetic can simply be carried out by C_add = C_1 + C_2 and C_mul = C_1C_2. These two operations can be trivially assembled to realize the functions for backward propagation. However, since the exponential operation cannot be implemented directly via homomorphic addition and multiplication, some activation functions of forward propagation, such as the sigmoid, should be approximated in polynomial form. Thereby, we resort to a specific conversion [32][33][34] that replaces the sigmoid with a piecewise polynomial whose breakpoints are −1.5 and 1.5. Noting that the resulting formula is a piecewise function, to homomorphically decide which subfunction should be carried out, we can encrypt the numbers −1.5 and 1.5 and compare them with x for branching.
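A minimal sketch of the conjugation-based cryptosystem, shrunk to 2×2 matrices for readability (the actual scheme requires the 6×6 coding with random blocks R_i, and d ≥ 4 for CSP hardness, so this toy offers no security), illustrates why addition and multiplication commute with encryption:

```python
from fractions import Fraction as F
import random

def mat(rows):  # 2x2 matrix as nested lists of Fractions
    return [[F(x) for x in r] for r in rows]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def minv(A):
    d = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    assert d != 0
    return [[A[1][1]/d, -A[0][1]/d], [-A[1][0]/d, A[0][0]/d]]

def encode(m):
    # m = a1 + a2 with a1 random; blocks [[a1, a2], [a2, a1]] commute pairwise.
    a1 = F(random.randint(-20, 20)); a2 = F(m) - a1
    return mat([[a1, a2], [a2, a1]])

def decode(M):
    return M[0][0] + M[0][1]

def keygen():
    while True:
        P = mat([[random.randint(1, 9) for _ in range(2)] for _ in range(2)])
        if P[0][0]*P[1][1] != P[0][1]*P[1][0]:  # invertible
            return P

def enc(P, M):  # C = P M P^-1 (conjugation hides M behind the CSP)
    return mmul(mmul(P, M), minv(P))

def dec(P, C):  # M = P^-1 C P, then read off a1 + a2
    return decode(mmul(mmul(minv(P), C), P))

P = keygen()
C1, C2 = enc(P, encode(6)), enc(P, encode(7))
assert dec(P, madd(C1, C2)) == 13      # homomorphic addition
assert dec(P, mmul(C1, C2)) == 42      # homomorphic multiplication
```

Since C_1 + C_2 = P(M_1 + M_2)P^(−1) and C_1C_2 = P(M_1M_2)P^(−1), any polynomial in the ciphertexts decrypts to the same polynomial in the plaintexts, with no noise growth to manage.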
To evaluate a piecewise function, J. Li et al. [28] presented a homomorphic algorithm that covertly compares two ciphertexts. Though our scheme is similar to that of [28], we argue that their cryptosystem is not semantically secure, because a_(2i−1) + a_(2i) = m and a_(2i−1) is always bigger than a_(2i) for i = 1, 2, 3 in their construction.

Security Analysis of the Scheme in [28]
Given the ciphertext C′ of a chosen message m′, whose coding uses random matrices R_i and R*_i (i = 1, 2, 3) uniformly sampled from R^(2×2), the adversary carries out a chosen-plaintext attack such as the following.
Exp_{HE,A}^{CPA}(κ): after k ← HE.KeyGen(κ), the adversary submits (m_0, m_1), receives the challenge ciphertext C* of m*, and evaluates det(C* − TC′) for the known ciphertext C′. Since a_(2i−1) > a_(2i) is guaranteed throughout Li's scheme [28], the corresponding difference terms (a*_(2i−1) − a*_(2i)) and (a′_(2i−1) − a′_(2i)) must be positive. It is obvious that det(P)det(P^(−1)) = 1; hence, the adversary can easily determine whether m* = m_0 or m* = m_1 by checking the sign of det(C* − TC′). That is to say, the adversary's advantage is non-negligible. It seems that the conflict between piecewise-function evaluation and IND-CPA security is infeasible to address. However, we can introduce a specific form of ciphertext which can be used to encrypt a designated number and compare it with any other normal ciphertext. Our construction is given below. The data provider randomly chooses a nonzero number k ∈ R∖{0} and encrypts m′ in a special form determined by k. To compare C′ with a general cipher C* without decryption, the evaluator computes a quantity Δ from det(C* − C′), whose sign decides the comparison.
3.8. Correctness. The correctness of the encryption and decryption algorithms is straightforward, so we only focus on the homomorphism of evaluation.
Homomorphic addition: since C_1 + C_2 = PM_1P^(−1) + PM_2P^(−1) = P(M_1 + M_2)P^(−1), we can decrypt it as m_1 + m_2. Homomorphic multiplication: because C_1C_2 = PM_1P^(−1) · PM_2P^(−1) = P(M_1M_2)P^(−1), we can deduce that the decryption yields m_1m_2. Homomorphic comparison: on the premise of det(P)det(P^(−1)) = 1, it can be seen that det(C* − C′) = det(M* − M′). Expanding this determinant and applying the condition (a_3^2 − a_4^2)(a_5^2 − a_6^2) = 1, the expression reduces to a quantity Δ equal to k^2(m* − m′) times a positive factor. It is obvious that the signs of Δ and m* − m′ are exactly the same, since k^2 > 0, which determines the relationship between m* and m′ without decryption.
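Both the attack on [28] and the comparison mechanism hinge on the fact that conjugation preserves determinants, so any plaintext property visible in det(M) survives encryption unless the coding randomizes it. A quick check with toy 2×2 matrices (illustrative values of our own choosing):

```python
from fractions import Fraction as F

def det2(A):
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def mmul2(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def minv2(A):
    d = det2(A)
    return [[A[1][1]/d, -A[0][1]/d], [-A[1][0]/d, A[0][0]/d]]

P = [[F(2), F(1)], [F(7), F(4)]]          # key (det = 1, invertible)
M = [[F(5), F(3)], [F(3), F(5)]]          # coded plaintext block
C = mmul2(mmul2(P, M), minv2(P))          # ciphertext C = P M P^-1

# Conjugation preserves the determinant, so det(C) leaks det(M):
assert det2(C) == det2(M) == 16
```

This is precisely why the evaluator can judge the sign of det(C* − C′) without the key, and why the comparable form must be issued deliberately by the data provider rather than being a property of every ciphertext.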

Security.
Thanks to the hardness of the conjugate search problem, an adversary must find P such that P^(−1)CP = M to recover the plaintext. As for the semantic security of our scheme, it can be seen that the difference terms (a*_(2i−1) − a*_(2i)) + (a′_(2i−1) − a′_(2i)) in the comparison expression are not always positive, due to the arbitrary relationship between a_(2i−1) and a_(2i). Therefore, when an adversary executes a chosen-plaintext attack as mentioned before, their advantage is negligible. Noting that any normal ciphertext can only be compared with specifically encrypted messages without decryption, the data provider has full control over their privacy and permits exact comparisons only when necessary.
After each round of training, the neural network coefficients are concealed by the key of the current data provider. When multiple data providers take part in the training process, those semimanufactured parameters should be re-encrypted under the key of the subsequent data holder for homomorphic computation. Therefore, in consideration of the trainer's property rights, we devised a way to decrypt and re-encrypt the machine coefficients without exposing them to the data providers. Our key switching scheme is based on the hardness of the Discrete Logarithm Problem (DLP).

Discrete Logarithm Problem.
Given a cyclic group G, a generator g ∈ G, and a random element h ∈ G, it is difficult to find the discrete logarithm a such that g^a = h.
Accordingly, if an adversary has obtained a ciphertext y = h^b = g^(ab) ∈ G, it is hard for them to recover h because of the confidentiality of ab [35]. However, in light of the Lagrange theorem [36], we can exploit a trapdoor to reverse y back to h.

Lagrange Theorem.
Denote H as a subgroup of a finite group G; then |H| divides |G|, where |H| and |G| are the orders of groups H and G.
Since any h ∈ G generates a subgroup H ⊆ G via H = {h^a | a ∈ Z}, we can conclude from the Lagrange theorem that h^(|G|) = e, where e is the identity of group G.
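A scalar sketch of this trapdoor in Python: in Z_p* (p prime, so the group order is p − 1), choosing s = b^(−1) mod (p − 1) reverses the blinding exponent b, exactly because h^(p−1) = e by the Lagrange theorem. The paper instead works modulo 2^l with subgroup order 2^l − 1; the prime modulus here is our simplification:

```python
p = 1009                      # a prime modulus; the group Z_p* has order p - 1
b = 5                         # blinding exponent, coprime to p - 1
s = pow(b, -1, p - 1)         # s * b = 1 (mod p - 1)

for h in (2, 123, 1008):
    # Lagrange: h^(p-1) = 1, hence h^(s*b) = h^(1 + k(p-1)) = h (mod p)
    assert pow(h, p - 1, p) == 1
    assert pow(pow(h, b, p), s, p) == h
```

An observer who sees only h^b learns nothing useful about h (by DLP), yet the holder of s can undo the exponentiation without ever computing a discrete logarithm.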
Based on the aforementioned mathematical tools, we are now ready to construct our key switching scheme as a triad KS = (KS.KeyGen, KS.CSPtoDLP, KS.DLPtoCSP). Without loss of generality, we denote k_t = (b, s), k_A = P_A, and k_B = P_B as the secret keys belonging to the trainer T and two data providers A and B, respectively. Then, KS.KeyGen generates the encryption/decryption key pair for the trainer, KS.CSPtoDLP converts a ciphertext C_A encrypted by k_A into one decryptable by k_t, and KS.DLPtoCSP modifies C_t (encrypted under k_t) into C_B, whose corresponding key is k_B.
3.12. Key Generation. KS.KeyGen(1^κ): as mentioned before, we denote the space of coded messages as Γ. Suppose that the precision of matrix elements in HE is l bits, of which the integer part takes m bits and the decimal part n bits. We can multiply any coded plaintext M by 2^n to lift it over Z_(2^l)^(6×6). Accordingly, the message space is changed to a cyclic group Γ′ with |Γ′| = 2^(12l)(2^l − 1)^3. Moreover, each 2^n·M_i composes a group Γ′_i satisfying |Γ′_i| = 2^l − 1. Thereby, we uniformly sample an odd number b ∈ Z_(2^l−1) and compute s ∈ Z_(2^l−1) such that s · b = 1 mod (2^l − 1). Output k_t = (b, s) as the key to the trainer.

3.13. Switching C_A to C_t: KS.CSPtoDLP(C_A).
The trainer T changes the encrypted model parameters C_A into C′_A = 2^n·C_A mod 2^l ∈ Z_(2^l)^(6×6) and sends C_At = (C′_A)^b mod 2^l to data provider A. On receiving C_At, A computes C_t = P_A^(−1)·C_At·P_A mod 2^l as their response.
3.14. Switching C_t to C_B: KS.DLPtoCSP(C_t). On receiving C_t from the trainer T, the data provider B computes their response as C_tB = P_B·C_t·P_B^(−1) mod 2^l. Therefore, the trainer T can reverse C_tB back to a ciphertext C_B = P_B·M·P_B^(−1), purely encrypted under k_B, via C′_B = (C_tB)^s mod 2^l and then right-shifting its elements by n bits.
3.15. Correctness. Since s · b = 1 mod (2^l − 1), we can write s · b = 1 + k_i(2^l − 1), where the k_i are integers for i = 1, 2, 3. During the encoding process in HE, it is easy to choose a_(2i−1) such that 2^n·a_(2i−1) ≠ 0 mod 2^l. Considering that a_1 + a_2 = m and a_(2i−1) + a_(2i) = r for i = 2, 3, the space of 2^n·M_i must be a cyclic group Γ′_i with |Γ′_i| = 2^l − 1. According to the Lagrange theorem, it can be seen that (2^n·M_i)^(1+k_i(2^l−1)) = 2^n·M_i mod 2^l; thus, C′_B = P_B(2^n·M)P_B^(−1) mod 2^l. By right-shifting the elements of C′_B by n bits, we obtain C_B = P_B·M·P_B^(−1).
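The whole switching round can be simulated end to end on toy 2×2 matrices over a prime field (our simplification: the paper works modulo 2^l with s · b = 1 mod (2^l − 1), whereas here we invert b modulo p(p^2 − 1), an exponent bound for invertible 2×2 matrices mod p). Note that provider A only ever sees the blinded power M^b:

```python
p = 101
N = p * (p * p - 1)           # exponent bound for invertible 2x2 matrices mod p
b = 7                         # trainer's blinding exponent, coprime to N
s = pow(b, -1, N)             # s * b = 1 (mod N)

def mmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2)] for i in range(2)]

def mpow(A, e):
    R = [[1, 0], [0, 1]]
    while e:
        if e & 1:
            R = mmul(R, A)
        A = mmul(A, A)
        e >>= 1
    return R

def minv(A):
    d = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % p
    di = pow(d, -1, p)
    return [[(A[1][1] * di) % p, (-A[0][1] * di) % p],
            [(-A[1][0] * di) % p, (A[0][0] * di) % p]]

M   = [[2, 3], [1, 4]]                       # model parameters (invertible mod p)
P_A = [[1, 2], [3, 5]]                       # provider A's key
P_B = [[2, 1], [1, 1]]                       # provider B's key

C_A  = mmul(mmul(P_A, M), minv(P_A))         # parameters encrypted under k_A
C_At = mpow(C_A, b)                          # trainer blinds: C_A^b = P_A M^b P_A^-1
C_t  = mmul(mmul(minv(P_A), C_At), P_A)      # A unwraps, seeing only M^b
C_tB = mmul(mmul(P_B, C_t), minv(P_B))       # B wraps with its own key
C_B  = mpow(C_tB, s)                         # trainer unblinds: P_B M^(b*s) P_B^-1

assert C_t != M                              # A never sees the plain parameters
assert C_B == mmul(mmul(P_B, M), minv(P_B))  # result is M encrypted under k_B
```

The correctness rests on the same Lagrange argument as in the scheme: b·s ≡ 1 modulo the group exponent, so M^(bs) = M, and conjugation commutes with exponentiation.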
3.16. Security. Note that, after receiving C_t, the trainer can trivially compute M = (C_t^s mod 2^l)/2^n to recover the message. Nevertheless, since the model parameters are the trainer's intellectual property, such an operation does not conflict with our security goal.
As for the data providers, they can only witness an exponential form of the plaintext (i.e., C_t = (2^n·M)^b mod 2^l). According to the hardness of DLP, no information about the message M will be exposed.

Privacy-Preserving Machine Learning with Multiple Data Providers
To preserve privacy for machine learning, many cryptographic training and classification/regression methods have been proposed in the setting of a single data provider. In most cases, however, data should be sourced from multiple providers to guarantee the generality of training. Therefore, we present a secure machine learning mechanism with the capacity for training as well as classification/regression, in consideration of both data and parameter privacy. As for training, the cloud is supposed to obtain model parameters with the help of labeled data. During the initialization phase, the trainer T computes k_t ← KS.KeyGen(1^κ) for key switching, and each data provider i generates k_i ← HE.KeyGen(1^κ) for homomorphic training.
Denote the encoded training data owned by provider i as M_i and the system parameters as M_t. The server first encrypts the initialized system coefficients (which may contain private intellectual property) as C_t = (2^n·M_t)^b mod 2^l and sends them to the first data provider, who executes C_1 ← HE.Enc(k_1, M_1) and C′_1 ← KS.DLPtoCSP(C_t) as their response. On the encrypted data C_1 and parameters C′_1, both corresponding to the same key k_1, the cloud can thus compute C′_1 ← HE.Eval(f_training, C_1, C′_1), i.e., updated system parameters decryptable by k_1.
For clarity, we describe the above processes as shown in Table 1.
Note that KS.DLPtoCSP(·) is a protocol that should be carried out by both the data provider and the cloud.
To make the updated coefficients homomorphically computable with data encrypted by the subsequent providers, we can exploit the key switching scheme to re-encrypt them. Without loss of generality, the updated parameters under key k_i are represented as C′_i. By means of C_t ← KS.CSPtoDLP(C′_i) and C′_(i+1) ← KS.DLPtoCSP(C_t), the cloud obtains the re-encrypted coefficients C′_(i+1) with the help of successive providers. After receiving the data C_(i+1) from the next provider, the cloud computes C′_(i+1) ← HE.Eval(f_training, C_(i+1), C′_(i+1)), since both ciphertexts are encrypted by k_(i+1). For the final parameters C′_N, the cloud executes C_t ← KS.CSPtoDLP(C′_N) with the last provider and then computes M = (C_t^s mod 2^l)/2^n to restore the plain parameters. The subsequent training and recovering processes are presented in Table 2.
The classification/regression process is straightforward: on the encrypted data C_u ← HE.Enc(k_u, M_u) and system parameters C′_u ← KS.DLPtoCSP(C_t), where C_t = (2^n·M_t)^b mod 2^l, the cloud can homomorphically compute C″_u ← HE.Eval(f_cla/reg, C_u, C′_u). By decrypting the received C″_u, the user obtains the classification/regression result M_cla/reg ← HE.Dec(k_u, C″_u). This process can be found in Table 3.

Experiment Analysis
We drew on the power load data of Chongqing Tongnan Electric Power Co., Ltd., dating from May 4 to May 10, 2015, to verify the effectiveness of our training method. A short-term electrical load prediction model is also tested using 96 historical data pieces sampled during 4 consecutive days. The original machine learning model is exactly the same as that of [29], which does not consider privacy at all. Our experiment environment is shown in Table 4.

To simulate the scenario of multiparty machine learning, we divide the data into three parts and carry out the training process with 3 different keys in HE. To show that our method does not harm the accuracy of the trained network, we compared the prediction result achieved via the original model (without privacy preservation) with that of our privacy-preserving scheme, as shown in Figure 1. Figure 1 illustrates that the two results are completely consistent.
The experimental results are shown in Table 5; our scheme can perform encrypted training and prediction for multiple data providers in general machine learning. As for the efficiency of training and prediction, our scheme is 73578 and 12000 times slower, respectively, than its plain version. Nevertheless, since the server always has powerful computational capacity and the data providers only have to carry out trivial multiplications over R^(6×6), our scheme is practical in cloud environments. Moreover, if reduced accuracy is tolerable, we can shorten the ciphertext to make it more efficient.
In terms of communication overheads, encrypted data for training or prediction are 18 times larger than the plain messages. In each iteration, the cloud should also exchange the ciphertexts of system parameters with two successive data providers, which are likewise 18 times the size of the original coefficients. Considering that the expansion rate is small and the system parameters are quite limited, the communication burden causes only slight performance degradation.

Conclusions
We presented a privacy-preserving machine learning method that works over multiple data providers. Thanks to the hardness of the conjugate search problem, data can be homomorphically processed for training or classification/regression under the same key. It is worth mentioning that we solved the intrinsic conflict between IND-CPA security and homomorphic comparison (without decryption) by specifically encoding the data that are allowed to be compared. To support training among multiple data providers, a key switching technology is also proposed based on the difficulty of the discrete logarithm problem and the Lagrange theorem, which evades the necessity of multikey homomorphic computation. Experiments illustrate that the accuracy of machine learning is not affected by the privacy capability of our scheme. The expansion rate of computation/communication complexity is small enough to make the scheme practical in cloud environments.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.