Practical m-k-Anonymization for Collaborative Data Publishing without Trusted Third Party

1 State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
2 Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
3 Department of Information Systems and Cyber Security, The University of Texas at San Antonio, San Antonio, TX 78249-0631, USA
4 School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018, China
5 Key Laboratory of Complex Systems Modeling and Simulation, Ministry of Education, China, Hangzhou Dianzi University, Hangzhou 310018, China


Introduction
In today's interconnected society, our sensitive personal data are increasingly stored in databases belonging to different online service providers. Although online service providers have the duty and a vested interest to ensure the security and privacy of user data, there are instances where user data are shared or compromised. For instance, a small to medium sized online service provider may wish to mine user purchasing patterns in order to fine-tune its marketing strategy and improve sales. Such a data mining task is likely to be outsourced to a third-party marketing company; thus, the records in the online service provider's database will be shared with the third party. In such a scenario, the online service provider requires a privacy-preserving data publishing (i.e., sharing) approach to ensure that the data are shared without breaching user privacy.
If the records to be published are owned by a single provider, the provider can easily run algorithms such as [1,2], which implement k-anonymity [3] (a widely used privacy protection mechanism), to anonymize the data prior to publishing. k-anonymity was proposed to solve the following problem: when a data owner or provider wants to publish parts of its data that relate to specific persons, how can it guarantee that these persons cannot be reidentified while the remaining parts of the data are still practically useful? Consider a table in which each row represents a record relating to a specific object and each column represents an attribute of that object. Some nonprivate attributes can be considered a quasi identifier (QI). A table satisfies k-anonymity if the QI value of each tuple contained in the table appears at least $k$ times [3].
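The k-anonymity condition above can be checked mechanically by counting how often each QI value combination occurs. The following is a minimal sketch under stated assumptions: records are modelled as Python dictionaries, and the function name `is_k_anonymous` and the column names in the usage note are hypothetical, chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, qi_columns, k):
    """Return True if every QI value combination appears at least k times.

    rows       -- list of dicts, one per record (illustrative data model)
    qi_columns -- names of the attributes that form the quasi identifier
    k          -- required minimum number of occurrences
    """
    # Project every record onto its QI attributes and count the combinations.
    counts = Counter(tuple(row[c] for c in qi_columns) for row in rows)
    return all(n >= k for n in counts.values())
```

For example, a table whose generalized QI values fall into one group of three records and one group of two records passes the check for k = 2 but fails it for k = 3.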
However, data in the real world are unlikely to originate from a single provider. Solutions seeking to address such a scenario are known as collaborative privacy-preserving data publishing (CPPDP). CPPDP has received considerable attention in recent years (e.g., [4][5][6][7][8][9][10][11]). A straightforward solution is for all providers to outsource their data to a trusted third party (TTP), who will assume control of the data as if the TTP were publishing its own data. An alternative approach uses Secure Multiparty Computation (SMC) [12], which allows providers to collaboratively compute preferred functions upon the complete dataset without revealing private data [4,5].
Although these schemes can guarantee that the anonymized data satisfy k-anonymity against outsider attackers, malicious providers (i.e., insider attackers) or vendors who have access to the providers' systems may collude to invalidate k-anonymity by excluding their own data. Consider the example described in [13] and shown in Table 1, which contains the databases of four hospitals. In this example, the four hospitals wish to collaboratively publish their dataset without revealing their patients' private information (e.g., medical diagnosis). A subset of the nonsensitive attributes can be considered a quasi identifier (QI). A QI is a set of attributes that are not unique identifiers in themselves but can be combined to uniquely identify most tuples in the dataset [3]. One anonymized dataset in the example satisfies 2-anonymity. Each QI group includes 3 records; therefore, each QI tuple appears at least 2 times in every group. From the QI attributes alone, we cannot infer the disease information of different patients. However, if hospital 1 is an adversary, it can remove all its own data from the anonymized dataset. The only record remaining in the first QI group is then the one provided by hospital 3, and the remaining data no longer satisfy 2-anonymity. Therefore, the disease attribute can be obtained easily. For instance, hospital 1 can link the remaining part of the first QI group with other datasets, such as a voter list or another dataset that contains the same quasi-identifying attributes. This would enable it to infer the disease of Sara by linking the two datasets together. In reality, there could be more than one adversary: $m$ providers, for instance, might collude to remove their data to infer records contributed by other providers. This is known as the m-adversary problem in the literature.
Seeking to address the m-adversary problem, Goryczka et al. [13] introduce the concept of m-privacy, which is the focus of this paper. More specifically, we focus on m-privacy with respect to k-anonymity, which is referred to as m-k-anonymity in the remainder of this paper. Suppose that the total number of providers participating in the collaborative data publishing is $n$. The published data after collaborative anonymization satisfy m-k-anonymity if, and only if, the subdata from any $n - m$ providers satisfy k-anonymity. That is, when $m$ adversaries remove all their data, the remaining data in the table still satisfy k-anonymity. The corresponding anonymization process is called m-k-anonymization. As an example, a second anonymized dataset in the example satisfies 1-2-anonymity. From the table, we can see that each of its QI groups contains 3 records. Even if one of the providers is an adversary (say, hospital 1) and removes all its data from the first QI group, the remaining two records in that group still satisfy 2-anonymity. Goryczka et al. present a TTP-dependent CPPDP scheme that achieves m-k-anonymity. However, a TTP does not always exist in the real world [14,15]. This is especially true in the aftermath of Edward Snowden's revelation that the US Government has been conducting large-scale surveillance (http://masssurveillance.info/). Goryczka et al. then present an SMC variant of this scheme based on a series of cryptographic protocols (e.g., secure sum, secure comparison, and secure size of set union) to remove the need for a TTP. However, this variant is not specially designed for m-k-anonymity, and the constituent cryptographic protocols are too time consuming to be practical for real-world deployment.
In this paper, we propose a TTP-independent CPPDP scheme designed to achieve m-k-anonymity in a more efficient manner. We observe that the process of k-anonymization involves no sensitive attributes, and hence, we divide our scheme into two phases. Firstly, we use a centralized server, which is not required to be trusted, to aggregate the nonsensitive components (i.e., QI attributes) of records from each provider and anonymize these components to ensure m-k-anonymity. Secondly, we design a distributed privacy-preserving method to aggregate the values of sensitive attributes for each equivalence group without breaching m-k-anonymity. In other words, we present a practical two-phase CPPDP scheme without the need for a TTP and demonstrate that the proposed scheme achieves m-k-anonymity in the widely accepted semihonest model. In the event that providers are malicious and attempt to modify user data, users are unlikely to find out about the tampering. Therefore, we present an effective sampling-based defense strategy against such an attack. We then evaluate the time efficiency of the proposed scheme with a public dataset including 45222 records. Our evaluations demonstrate that the time overhead increases linearly with the number of providers, which is reasonable for an offline scheme.

Related Work
Recent trends in big data and cloud computing have partly contributed to renewed interest in privacy-preserving data publishing [16][17][18][19]. Existing literature on privacy-preserving data publishing can be broadly classified into the following categories.
Single Provider. Most existing research focuses on the scenario involving a single data owner wanting to publish its own data, with privacy models such as k-anonymity [3], l-diversity [20], and t-closeness [21]. Of these privacy models, k-anonymity has the longest history, and many efficient algorithms have been developed to implement it. Examples of a bottom-up generalization approach and a top-down specialization approach are Incognito [2] and Mondrian [1], respectively.
Collaborative Data Publishing. In the collaborative data publishing literature, the focus is on privacy-preserving algorithms for distributed setups. For example, Jiang and Clifton [6] propose a protocol implementing k-anonymity on a vertically partitioned dataset. The protocol presented in [7] is designed to extract anonymized data from a set of providers, which is then published to the miner. Jiang and Clifton [8] present an SMC framework for data sharing between two untrusted parties, and Jurczyk and Xiong [9] present several decentralized protocols to ensure the user's privacy during the querying of multiple databases. Mohammed et al. [10] seek to address the privacy-preserving problem in a specific application of data mashup in the web, which is a typical distributed scenario. The authors also present distributed algorithms to integrate healthcare data [11]. The protocol presented in [5] allows protocol participants to decide, prior to execution, whether its utility is acceptable.
Insider Attackers. Goryczka et al. [13] present the m-adversary problem, where data providers are considered as potential attackers. To address such a threat, they propose the m-privacy model and present an efficient and effective TTP-based anonymization scheme. A key limitation of the scheme is the need for a TTP, which is not always available in the real world. Therefore, an SMC-based variant which does not rely on a TTP is proposed by Goryczka et al. However, as noted by the authors, the SMC scheme is only a conceptual scheme that is not practical for real-world deployment due to the significant time overhead of the underlying protocols. This is the gap we seek to address in this paper, by presenting a practical SMC-based m-privacy implementation.

Problem Definition
Let $P = \{P_1, P_2, \dots, P_n\}$ be the set of all data providers, where provider $P_i$ owns a set of records $T_i$. $T$, defined as $T = \{T_1, T_2, \dots, T_n\}$, is the set of all records. Providers aim to collaboratively publish the dataset $T$ while preventing attackers from identifying records of individuals. In such a distributed environment, the providers may not trust each other. In other words, none of the providers can be considered as a TTP, and the publisher (considered as a separate party) is also not a trusted party.
k-Anonymity can prevent external adversaries from inferring sensitive attributes with QIs. However, k-anonymity is unable to address the m-adversary problem, as malicious providers (i.e., insiders) may collude to remove their own data to violate the k-anonymity of the remaining data. Targeting this problem, we define a new privacy model, which we term m-k-anonymity. The m-k-anonymity model is adapted from the m-privacy model of Goryczka et al. [13]. We denote the set of records owned by all adversaries $A = \{A_1, A_2, \dots, A_m\}$ as $T_A$; that is, $T_A = \bigcup_{A_j \in A} T_{A_j}$. Accordingly, $T_A(QI_i)$ is the set of those records in group $QI_i$ that are owned by any adversary.
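Definition 2 (m-k-anonymity). Given a set of $n$ providers, $P = \{P_1, P_2, \dots, P_n\}$, and $m$ adversaries, an anonymized dataset satisfies m-k-anonymity if, for every equivalence group $QI_i$ and every choice of $m$ adversaries, $|QI_i| - |T_A(QI_i)| \ge k$, where $T_A(QI_i)$ denotes the records in $QI_i$ owned by those adversaries. It means that, for every equivalence group in the anonymized dataset, its size excluding the number of records owned by any $m$ providers must be at least $k$. In other words, after $m$ colluding providers have removed their records, each group must still contain at least $k$ records.
Design Goal. To design an efficient and practical CPPDP scheme that provides m-k-anonymity without involving a TTP.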
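To make the definition concrete, the sketch below tests the condition literally by trying every choice of colluding providers within each equivalence group. It is an illustration under stated assumptions, not the authors' implementation: each group is modelled as a list holding one provider identifier per record, and the name `satisfies_mk_anonymity` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def satisfies_mk_anonymity(groups, m, k):
    """Check that every group keeps at least k records after any m providers leave.

    groups -- list of equivalence groups; each group lists the owner of each record
    m      -- maximum number of colluding providers
    k      -- anonymity parameter
    """
    for owners in groups:
        counts = Counter(owners)
        providers = list(counts)
        # Providers with no record in the group remove nothing, so it is enough
        # to try subsets of the providers that actually appear in the group.
        size = min(m, len(providers))
        for adversaries in combinations(providers, size):
            remaining = len(owners) - sum(counts[a] for a in adversaries)
            if remaining < k:
                return False
    return True
```

A group in which one provider owns two of three records passes for m = 0 and k = 2 but fails for m = 1, whereas a group with three records from three different providers passes for m = 1 and k = 2. Enumerating subsets mirrors the wording of the definition; in practice it is equivalent to removing the m providers that own the most records in the group.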

[Figure 1: Illustration of the two-phase scheme, including Phase II (secure aggregation of sensitive attributes).]
In our scheme, we have two key assumptions (Assumptions 3 and 4).
Assumption 3. The adversaries are semihonest (i.e., honest but curious): they will faithfully follow the protocol but will also try to infer user privacy from the protocol interactions.
Assumption 4. There are at most $m$ colluding providers.
4.2. Two-Phase Scheme. Our scheme, based on Observation 5, consists of two phases. In the first phase, all providers transmit only data with no private attributes to the untrusted publisher, who runs an algorithm implementing m-k-anonymity on the received data (see Section 4.2.1). In the second phase, $m + 1$ randomly chosen providers collaboratively aggregate the data with private attributes using a suitable cryptographic system (see Section 4.2.2). An illustration of the scheme is depicted in Figure 1.
Observation 5. The anonymization process for k-anonymity does not involve private attributes.

Phase I: 𝑘-Anonymization with Insensitive Attributes.
In Phase I, the providers remove private attributes from the data prior to sending them to a third-party publisher that they may not fully trust. The publisher then runs a modified Mondrian algorithm on the received data to achieve k-anonymity. Mondrian [1], one of the most efficient algorithms implementing k-anonymity, models the dataset as a multidimensional space in which each attribute contributes a dimension. k-Anonymization in this multidimensional space recursively partitions each subspace into two smaller, nonoverlapping subspaces until the stop condition is satisfied. Each subspace represents a QI group. Building on this, we modify Mondrian so that the algorithm stops partitioning a subspace when a further partition would cause the number of records owned by the $m$ providers with the most records to exceed the total number of records minus $k$. The output of this algorithm is a set of QI groups that satisfy m-k-anonymity. More details are given in Algorithm 1. Each iteration of Algorithm 1 builds frequency sets according to dimension and values and then computes a split value for each frequency set. According to these split values, the attributes are divided into subspaces, one more than the number of split values. Each subspace is a QI group. The algorithm returns QIs that consist of $QI_1, QI_2, \dots$

Phase II: Private Data Aggregation.
In order to ensure the security of the private data, Phase II of our scheme uses a secure public-key cryptographic algorithm of the implementer's choice, such as RSA. Every provider generates its public and private key pair, publishes the public key, and protects the private key.
The publisher first sends the QIs obtained in Phase I to every provider. At the same time, the publisher randomly selects $m + 1$ decryption providers, indexed $0, 1, \dots, m$, from the set of providers and sends their addresses to each provider. The remaining $n - m - 1$ providers are indexed by $j$ ($0 \le j \le n - m - 2$). On receiving the QIs, every provider assigns the private attributes of each of its records to the group whose QI contains the record's value for every nonprivate attribute. The providers then encrypt the private data and their group information using the $m + 1$ decryption providers' encryption keys one at a time, in the reverse order of the addresses they received. Note that the encryption scheme must be probabilistic (a basic requirement of a secure encryption scheme); that is, it introduces randomness so that encrypting the same message produces a different cipher text each time.
As illustrated in Figure 2, the chain consists of $n$ providers. We choose decryption provider 0 as the first decryption provider in the chain. All providers (both the remaining providers and the decryption providers) encrypt their data and send the encrypted data to the first decryption provider. Upon receiving all encrypted data, the first decryption provider removes its layer of encryption and uniformly repermutes the partially decrypted data before sending them to the next decryption provider. All decryption providers in the chain repeat the same process sequentially. When the last decryption provider obtains the partially decrypted data, it performs the last decryption and submits the result to the publisher. The repermutation breaks the linkage between the data and their providers.
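The layered encryption and the decrypt-then-repermute chain described above can be illustrated with a small simulation. This is only a sketch under loud assumptions: to stay within the Python standard library, it replaces the public-key algorithm (e.g., RSA) with a toy hash-based stream cipher that shares a single secret key per decryption provider. That stand-in is neither secure nor what the scheme prescribes; it only reproduces the two properties the text relies on, namely probabilistic encryption and one removable layer per decryption provider. All function names are hypothetical.

```python
import hashlib
import random
import secrets


def _keystream(key, nonce, length):
    # Toy keystream: SHA-256 over key, nonce and a block counter (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key, message):
    # A fresh random nonce makes the encryption probabilistic.
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(message))
    return nonce + bytes(a ^ b for a, b in zip(message, stream))


def decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def wrap(message, keys):
    # Encrypt in reverse chain order so that decryption provider 0
    # holds the outermost layer and the last provider the innermost one.
    for key in reversed(keys):
        message = encrypt(key, message)
    return message


def run_chain(ciphertexts, keys, rng):
    # Each decryption provider removes its layer and repermutes the batch.
    batch = list(ciphertexts)
    for key in keys:
        batch = [decrypt(key, c) for c in batch]
        rng.shuffle(batch)
    return batch
```

With three keys (that is, m = 2), wrapping a few records and passing them through `run_chain` returns the same set of plaintext records in a permuted order, while two encryptions of the same record differ. In a real deployment each layer would use the corresponding decryption provider's public key, so that only that provider could remove it.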
In our proposal, providers have to perform decryption operations, which would inevitably bring additional communication and computation overheads.However, considering the fact that the data publishing is usually performed offline, we think higher overheads are affordable so long as they are still within reasonable bounds (e.g., a couple of hours or days).The detailed complexity analysis of our proposal is shown in Section 4.3.Our experiment results on real-world datasets in Section 6 also show that the time overheads are within the acceptable range.

Security Analysis.
We now present the security proof for our proposed scheme based on Theorem 7.
Theorem 7. The two-phase scheme can correctly implement m-k-anonymity in the semihonest model (described in Section 4.1). Proof.
Phase I. It is trivial to observe that data privacy will not be compromised, since the providers transmit only data with no private attributes.
Phase II. Since at most $m$ providers are adversaries and there are $m + 1$ or more decryption providers, there must exist at least one honest decryption provider in the decryption chain (exactly one in the worst-case scenario).
Case 1. Adversaries are in front of honest decryption providers in the decryption chain. Although these adversaries know which provider produced each cipher text, which could otherwise help in inferring a record's private attribute, they are not able to fully decrypt the cipher texts as they do not have all the decryption keys.
Case 2. Adversaries are after honest decryption providers in the decryption chain. In this case, the adversaries could collaborate to obtain the plain text of the private data, but they are unable to map the data to their owners due to the mix operations of the (one or more) honest provider(s).
Case 3. Adversaries are both before and after honest decryption providers in the decryption chain (i.e., adversaries and honest providers are interleaved in the chain). In this case, before the data pass the honest providers, adversaries will not be able to collaborate to fully decrypt the data as they do not have all the decryption keys. After the data have passed the honest provider(s), the adversaries could collaborate to decrypt the records. However, since the data have been repermuted by the honest provider(s), the adversaries will not be able to link the records to their owners. Therefore, the proposed scheme is still secure in this case.
Complexity Analysis. Phase I: the complexity of the modified Mondrian algorithm is $O(N \log N)$, similar to that of the original Mondrian algorithm, where $N$ is the total number of records. As the providers submit their insensitive data directly to the publisher, the communication complexity of Phase I is $O(N)$.
Phase II: the major computations are the encryption and decryption of private attributes; every record involves $m + 1$ encryptions and $m + 1$ decryptions. Thus, the computation complexity of Phase II is $O(mN)$. Since an encrypted record is transmitted up to $m + 2$ times, the communication complexity of Phase II is also $O(mN)$.
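The Phase I cost analysed above comes from Mondrian's recursive median splits. The sketch below shows one way the modified recursion with the m-k stopping rule might look. It is a simplified illustration rather than Algorithm 1 itself: it assumes numeric QI values, tries dimensions from widest to narrowest range, and splits at the median; the names `keeps_mk` and `partition` are hypothetical.

```python
from collections import Counter


def keeps_mk(records, m, k):
    """True if at least k records remain after the m largest providers leave.

    records -- list of (provider_id, qi_tuple) pairs
    """
    counts = Counter(provider for provider, _ in records)
    top_m = sum(c for _, c in counts.most_common(m))
    return len(records) - top_m >= k


def partition(records, m, k):
    """Recursively split records into QI groups that each satisfy the m-k rule."""
    dims = len(records[0][1])

    def spread(d):
        values = [qi[d] for _, qi in records]
        return max(values) - min(values)

    # Try the dimension with the widest range first, as Mondrian-style heuristics do.
    for d in sorted(range(dims), key=spread, reverse=True):
        values = sorted(qi[d] for _, qi in records)
        median = values[len(values) // 2]
        left = [r for r in records if r[1][d] < median]
        right = [r for r in records if r[1][d] >= median]
        # Only accept a cut if both halves still satisfy the m-k condition.
        if left and right and keeps_mk(left, m, k) and keeps_mk(right, m, k):
            return partition(left, m, k) + partition(right, m, k)
    # No allowable cut remains: this subspace becomes one QI group.
    return [records]
```

For eight records owned by four providers with two records each, setting m = 1 and k = 2 yields coarser groups than m = 0 and k = 2, because every group must keep at least two records even after its largest contributor withdraws.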

Discussion.
In this part, we discuss the reliability of our scheme and compare it with the TTP-based scheme proposed by Goryczka et al. [13].
Goryczka et al. propose an anonymization algorithm based on binary space partitioning. This algorithm can be implemented in a distributed environment with a trusted third party (TTP) and is considered a secure anonymization protocol. It consists of two subprotocols. The first is the provider-aware anonymization protocol, whose time complexity is determined by the number of records, the number of attributes, the number of providers, and the maximal number of fake values. The second is the secure fitness score protocol; we refer to [13] for the complexity expressions of both subprotocols.
However, it is not easy to find a trusted third party in the real world. Our scheme is therefore TTP-independent and can achieve m-k-anonymity in a more practical manner. Since we assume that there are at most $m$ adversaries and that the total number of providers is at least $m + 1$, there must exist at least one honest provider. We have proved the security of this proposal in Section 4.3. Our scheme is divided into two phases. The time complexity of Phase I is $O(N \log N)$ and that of Phase II is $O(mN)$, where $N$ is the total number of records. The time complexity of our scheme increases linearly with $n$, where $n$ is the number of providers, which is reasonable for an offline scheme. Accordingly, we believe that our scheme is more secure in the real world since it does not rely on any trusted third party. Moreover, the increased computation overheads due to encryptions and decryptions are within an acceptable range, as demonstrated by our experiments on real datasets.

Fully Malicious Model
The semihonest model, while widely accepted in the literature, may not be practical in the real world. More specifically, in the semihonest model, we trust providers not to misbehave. For example, Choo [22,23] remarked that "there are legitimate concerns about cloud service providers being compelled to hand over user data that reside in the cloud to government agencies without the user's knowledge or consent due to territorial jurisdiction by a foreign government." Similar concerns were raised in [24], which presented an extended proxy-assisted approach to address the need to trust the cloud server not to disclose users' proxy keys, a need inherent in proxy/mediator-assisted user revocation approaches. Therefore, in this section, we present a fully malicious model, which does not require an adversary to follow the protocol. In fact, the adversary's aim is to compromise user privacy. In the remainder of this paper, we focus on tampering attacks that can be undertaken by (malicious) decryption providers. More specifically, a malicious decryption provider can replace encrypted data belonging to one or more honest providers with fictitious or fabricated data. Consequently, in the published result, the adversaries can remove their original data and the fabricated data inserted by the decryption provider from a number of QI groups. Hence, these groups will contain fewer than $k$ records.

Sampling-Based Extension.
The sampling-based extension of our scheme is described as follows.
From Section 4.2.2, we know that every provider encrypts the private data and their group information using the $m + 1$ decryption providers' public keys one at a time. After encrypting the private data with the respective public keys, every provider $P_i$ generates $m + 1$ special strings, $SS_i^j$ ($0 \le j \le m$), and sends $SS_i^j$ to decryption provider $j$. These strings are special because they do not carry any group information. Then, each provider $P_i$ ($1 \le i \le n$) adds $x$ copies of every $SS_i^j$ ($0 \le j \le m$), encrypted with the public keys of decryption providers $j, j-1, \dots, 0$, respectively, and mixes them with the encrypted private data.
Once decryption provider $j$ has finished the necessary decryption, $n \times x$ special strings should have been fully decrypted. The decryption provider can easily distinguish these fully decrypted strings by comparing them with the special strings it received earlier, so they can be removed. Only when the number of copies of every $SS_i^j$ ($1 \le i \le n$) equals $x$ may the decryption provider send the remaining decrypted data to the next decryption provider; otherwise, it must discard all data and inform the publisher and the other providers. Suppose adversaries remove each record with probability $p$; then every provider's detection rate is $1 - (1 - p)^x$. For a specific dataset, different values of $x$ correspond to different detection rates. We can therefore set a threshold and choose, as the number of special strings to be added by each provider, a value of $x$ whose detection rate exceeds the threshold. For example, in Section 6 we choose $x = 300$, for which the detection rate is greater than 95%.
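The threshold rule above can be turned into a small calculation. The sketch below evaluates the detection rate $1 - (1 - p)^x$ and solves it for the smallest number of special strings that reaches a given threshold. The symbols follow the text ($p$ is the per-record removal probability and $x$ the number of special strings per provider), while the function names are hypothetical and the independence of removals is assumed, as in the formula.

```python
import math


def detection_rate(p, x):
    """Probability that at least one of x special strings is removed."""
    return 1.0 - (1.0 - p) ** x


def min_special_strings(p, threshold):
    """Smallest x whose detection rate is at least the threshold.

    Solves 1 - (1 - p)^x >= threshold for x, assuming independent removals.
    """
    if not 0.0 < p < 1.0 or not 0.0 < threshold < 1.0:
        raise ValueError("p and threshold must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - p))
```

With p = 0.01 and a 95% threshold, the bound gives 299 strings, which is consistent with the choice of 300 special strings reported in the experiments.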

Worst-Case Scenario Analysis.
In the worst case, there is only one honest decryption provider in the decryption chain. Adversaries appearing after the honest decryption provider cannot identify the data's owners even if they fully decrypt the cipher texts, because the honest provider has faithfully repermuted the data, which breaks the relation between the cipher texts and their providers. Adversaries appearing before the honest provider have access to the information that identifies the provider. However, these adversaries cannot fully decrypt any data except the special strings intended for their own validation. Therefore, regardless of the method used, adversaries who remove records are likely to also remove some special strings, which the honest decryption provider can detect. This simple method fulfills our design goal (i.e., an efficient and practical CPPDP scheme providing m-k-anonymity without involving a TTP) without compromising the quality of the result: because the additional data do not carry group information, they can be easily distinguished and removed from the result.
Suppose that adversaries appearing before the honest provider in the decryption chain remove each private record of provider $P_i$ independently with probability $p$; then the detection rate for $P_i$ is $1 - (1 - p)^x$, where $x$ is the number of special strings it inserted. Inserting additional validation strings yields a higher detection rate but also a longer processing time.

Experiment Setup and Findings
We now describe our experimental setup and findings.
6.1. Setup. We performed the experiments on several machines, each with a 2.4 GHz Xeon E5 CPU and 2 GB of RAM. The operating system was Ubuntu 12.04, and the implementation was built and run in Java 2 Platform Standard Edition 7.0. We used the Adult dataset (http://archive.ics.uci.edu/ml/datasets/Adult), which is a commonly used benchmark in the literature [25][26][27]. We combined the training and test sets in the Adult dataset and removed records with missing attributes. Thus, we ended up with a dataset of 45222 records with 14 attributes. We assigned {age, workclass, fnlwgt, education, education-num, marital-status, relationship, race, sex, capital-gain, capital-loss, hours-per-week, native-country, salary} as the QI and occupation as a private attribute. The complete dataset was uniformly distributed among $n$ data providers so that each of them was assigned a subset of similar size. We implemented RSA due to its popularity in commercial applications; each provider generates 1024-bit keys. We considered time as the key efficiency metric in our evaluations.

Findings.
We measured the time between the publisher executing the scheme and receiving the resulting dataset. Findings are illustrated in Figure 3. In Figure 3(a), $n \times 3000$ records were chosen from the original dataset and uniformly distributed among $n$ providers; thus, each provider had 3000 records on average. In Figure 3(b), all 45222 records were uniformly distributed among $n = 10$ providers.
According to the definition of m-k-anonymity, there must exist at least one honest provider among the $n$ providers. Thus, $n$ is always greater than or equal to $m + 1$. It can be seen from Figure 3(a) that the execution time is approximately linear in $n$, which determines the size of the complete dataset ($n \times 3000$ records). This result is consistent with our expectation. Even with 15 providers, the time cost is below 20 minutes when $m = 2$. When $n = 15$ and $m = 5$, the time cost is below 35 minutes. This is sufficiently efficient since data publishing is usually performed offline. In addition, from Figure 3(b), we can see that the execution time increases a little faster with $m$. This is easy to understand, since more colluding providers imply more encryptions and decryptions for each record, and these cryptographic operations are very time consuming. Fortunately, the rate of increase is still acceptable since the total number of encryptions and decryptions increases linearly with $m$. If we assume that fewer than half of the providers may collude, the time cost is around one hour, which is reasonable for an offline algorithm.
According to [13], the runtime of the TTP-scheme is very high: its computation time increases almost exponentially with $n$, the number of providers. Measuring the computation time by actually running the secure privacy anonymization subprotocol of the TTP-scheme is therefore unrealistic. Hence, we take the same approach as the authors of [13] to estimate the magnitude of the computation time on the same dataset. The result is shown in Figure 4. Figure 4(a) shows the estimated time for varying values of $n$; the computation time increases exponentially with $n$. Figure 4(b) shows the estimated time for varying values of $m$. Due to the provider-aware anonymization protocol in the TTP-scheme, the computation time decreases exponentially as the number of adversaries increases. Provider-aware means that providers will be aware if there exist one or more adversaries among all providers. Thus, as the number of adversaries increases, providers discover adversaries earlier, so increasing $m$ causes the anonymization process to end earlier. In the TTP-scheme implementation, different algorithms can be chosen for the secure protocols. The choice of algorithm does not adversely affect the resulting computation time.
We also remark that the security of the TTP-based scheme can be guaranteed, in the sense that the m-privacy anonymization protocol in the TTP-scheme is secure as long as the subprotocols in the scheme are secure. We refer the interested reader to [13] for the detailed security proof. Both the TTP-scheme and our TTP-independent scheme can achieve m-privacy with respect to k-anonymity. However, our algorithm is more efficient and practical. Due to the use of a strong encryption/decryption algorithm, our scheme is more secure in practice. The execution time of our scheme on the real dataset demonstrates that it is practical for deployment, especially as an offline algorithm.
The following experiments illustrate the number of special strings we need to insert in order to obtain an ideal detection rate. We ran our extended scheme 1000 times under different settings (e.g., different adversary discarding rates and numbers of special strings). Suppose that the adversaries appearing before the honest decryption provider in the decryption chain collaboratively remove each record independently with probability $p$. It can be seen in Figures 5(a) and 5(b) that, when adversaries remove each record independently with probability 1%, inserting 300 special strings yields a detection rate larger than 95%. For our experimental settings, 300 is about 1/15 of the size of each provider's dataset, and the extension will increase the execution time by about 1/15, as the total time is linear in the number of records. In other words, the number of validation strings to be inserted is determined by the desired detection rate and the discarding rate of the context.
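The repeated-run experiment described above can be approximated with a short Monte Carlo simulation. This sketch is not the authors' experimental code: it only models the removal of special strings, treats a run as detected when at least one special string is removed, and uses a fixed seed for reproducibility. The function name `simulate_detection` and its parameters are hypothetical.

```python
import random


def simulate_detection(p, x, trials, seed=0):
    """Estimate the detection rate by simulation.

    p      -- probability that the adversaries remove any given record
    x      -- number of special strings inserted by the provider
    trials -- number of independent runs (the text uses 1000)
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        # A run is detected as soon as one special string goes missing.
        if any(rng.random() < p for _ in range(x)):
            detected += 1
    return detected / trials
```

Calling `simulate_detection(0.01, 300, 1000)` should produce an estimate close to the analytic value $1 - (1 - 0.01)^{300}$, with the usual sampling noise of a 1000-run experiment.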

Conclusion
In this paper, we studied the m-adversary problem, where $m$ (geographically dispersed) providers could collude. Existing solutions either depend on a trusted third party (TTP) or have impractical time overheads. We proposed a two-phase scheme and demonstrated how it can be used to implement m-k-anonymity without the need for a TTP. We also proved the security of the scheme in a semihonest adversary model. We then explained how our scheme can be extended so that it is also secure in a stronger adversary model. Lastly, our experiments demonstrated that the scheme is practical to deploy in a real-world context.
Future research includes extending the scheme to provide m-privacy with respect to other privacy constraints and generalizing the scheme to implement m-privacy with respect to k-anonymity on distributed incremental datasets or collaborative data republishing.
Figure 5(a) shows the detection rate under different $x$ (the number of inserted validation strings) when the discarding probability is $p = 0.01$, and Figure 5(b) shows the detection rate under different discarding rates when $x = 300$.
Figure 4(b): Estimated execution time versus $m$ ($m$ is the number of adversaries).