To determine whether images on the crowdsourcing server meet the mobile user’s requirement, an auditing protocol is needed to check these images. However, the mobile user typically cannot download the images for checking before paying for them. Moreover, since mobile devices are usually low-power and the crowdsourcing server has to handle a large number of mobile users, the auditing protocol should be lightweight. To address these security and efficiency issues, we propose a novel noninteractive lightweight privacy-preserving auditing protocol on images in mobile crowdsourcing networks, called NLPAS. Because NLPAS allows the mobile user to check images on the crowdsourcing server without downloading them, it protects the privacy of these images. NLPAS uses a binary convolutional neural network to extract features from images and a novel privacy-preserving Hamming distance computation algorithm to determine whether the images on the crowdsourcing server meet the mobile user’s requirement. Since both techniques are lightweight, NLPAS can audit images on the crowdsourcing server in a privacy-preserving manner while remaining highly efficient. Experimental results show that NLPAS is feasible for real-world applications.

Recently, mobile crowdsourcing systems have been widely deployed all over the world, which collect and process data through widely available mobile devices [

Regardless of the technology implemented, a typical “noninteractive lightweight privacy-preserving auditing system (NLPAS)” includes three entities: the “crowdsourcing server (CS)” which stores images, the “mobile user (MU)” who audits images stored on the crowdsourcing server before downloading them, and the worker who collects images and uploads them to the CS. In practice, these entities are involved in two processes (i.e., the uploading process and the auditing process). During the uploading process, the worker collects images and uploads them to the CS. During the auditing process, the MU audits images stored on the CS and then determines whether to download them.

Security is vitally important for NLPAS. To avoid economic loss, the CS is not willing to send images to the MU before the latter pays for them. On the other hand, the MU is not willing to pay for images before making sure that they really meet the requirement. To resolve this dilemma, it is reasonable to design a privacy-preserving auditing protocol, which allows the MU to check whether the images meet the requirement before downloading them. Unfortunately, the current security protocols for crowdsourcing systems (i.e., [

Efficiency is another serious concern for NLPAS. Due to limited resources of mobile devices, the MU is seriously concerned about the high computation cost arising from running the auditing protocol. At the same time, the CS will have to handle a lot of auditing requests from multiple MUs, and it is seriously concerned about the computation cost too. So, the newly designed auditing protocol should be lightweight. Taking both security and efficiency into account, we aim to design a noninteractive privacy-preserving lightweight auditing protocol on images in the mobile crowdsourcing system, which extracts features from images and then determines whether these features meet the MU’s requirement. An auditing protocol for the mobile crowdsourcing system should fulfill the following requirements:

Obviously, designing an auditing protocol for NLPAS is a nontrivial task, as the MU has to determine whether the images stored on the CS meet the requirement without downloading them. Recently, security protocols for mobile crowdsourcing systems have focused on image-checking techniques run by the CS. However, there is no protocol considering the image-checking technique established by the MU. Moreover, when focusing on this topic, we notice that there is no security scheme which can be directly used for satisfying all the above requirements. We will present the detailed analysis for arriving at this conclusion in the next section. This becomes a more serious issue, since more and more mobile crowdsourcing systems are being deployed. Motivated by this observation, in this paper, we mainly make three contributions:

We present a comprehensive set of requirements for auditing protocols in mobile crowdsourcing systems and show some security and efficiency problems of current data checking protocols in mobile crowdsourcing systems.

We propose a novel auditing protocol called NLPAS, which can check whether the images stored on the CS meet the MU’s requirement. Unlike current data checking protocols for mobile crowdsourcing systems, NLPAS allows the MU instead of the CS to check images in a privacy-preserving manner. By doing so, the MU can determine whether the images meet the requirement before paying for them. At the same time, the CS can make sure that those images will not be leaked. To fulfill all these requirements, we will first introduce the “binary convolutional neural network (BCNN)” technique [

We analyze the security of NLPAS, showing it satisfies requirements (1), (2), and (3) in Section

Data trust is an essential security problem in mobile crowdsourcing systems. Due to openness, workers in mobile crowdsourcing systems may have different security abilities, resulting in low data trust [

This sort of scheme takes observed results with the most observers as the true data [

This sort of scheme uses context information such as location information to determine whether the data are true or not. For example, in [

This sort of scheme [

This sort of scheme uses a gold data set for checking data uploaded by workers [

This sort of scheme aims to address data redundancy issues [

From the above analysis, it can be seen that the existing schemes mainly focus on data checking performed by the CS and workers, and they mainly consider whether the data are correct. However, existing schemes do not allow mobile users to check data before downloading them. Moreover, because requirements vary from one mobile user to another, checking correctness alone does not tell whether the data match a particular MU’s requirement. This leads to a serious issue: the mobile user may waste money on nonmatching data. Therefore, an auditing scheme performed by the MU before data downloading is desired.

Data privacy is another important problem in mobile crowdsourcing systems. First, if the data uploaded by workers are leaked, the CS and workers may lose money. More importantly, if the leaked data contain private information about workers, the adversary may cause them harm. Second, if the data of mobile users are leaked, the adversary may deduce valuable information about mobile users [

This sort of scheme aims to encrypt data before uploading them [

This sort of scheme adds perturbation to data [

This sort of scheme aims to protect location information of workers [

This sort of scheme aims to protect personal information of workers and mobile users [

Recently, several new techniques such as blockchain and fog computing have been introduced to mobile crowdsourcing networks [

Features of existing schemes.

Scheme | Data trust | Privacy preserving | Noninteractive |
---|---|---|---|
[ | Voting-based data checking | ╳ | ╳ |
[ | Context information-based data checking | ╳ | ╳ |
[ | Statistics-based data checking | ╳ | ╳ |
[ | Gold data set-based data checking | ╳ | ╳ |
[ | Data redundancy checking | ╳ | ╳ |
[ | ╳ | Encryption | ╳ |
[ | ╳ | Differential privacy | ╳ |
[ | ╳ | Location privacy | ╳ |
[ | ╳ | Personal information privacy | ╳ |
[ | Blockchain-based authorization | ╳ | ╳ |
[ | ╳ | Blockchain-based data privacy | ╳ |
[ | ╳ | Fog-based data privacy | ╳ |

From Table

Furthermore, for images, this dilemma becomes more serious since the data volume of images is much larger than that of traditional text. To handle this dilemma, it is desirable to design a noninteractive lightweight privacy-preserving auditing protocol on images in mobile crowdsourcing networks, which allows the MU to efficiently determine whether the images meet the requirements without learning anything about these images.

A convolutional neural network [

Similarly, each

For bitwise

Moreover, since the
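
As a rough illustration of why bitwise operations make binary networks cheap, the sketch below computes the dot product of two {-1, +1} vectors using XNOR and a population count. The bit encoding (+1 stored as bit 1, -1 stored as bit 0) and the helper name `binary_dot` are assumptions made for this example; they are not taken from the paper or from any particular BCNN library.

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors packed into integers.

    Assumed encoding: bit value 1 stands for +1, bit value 0 stands for -1.
    """
    mask = (1 << n) - 1
    # XNOR is 1 exactly where the two bits agree, i.e. where the product is +1.
    agree = ~(a_bits ^ b_bits) & mask
    matches = bin(agree).count("1")  # population count
    # Each match contributes +1 and each of the (n - matches) mismatches -1.
    return 2 * matches - n


if __name__ == "__main__":
    # a = (+1, -1, +1, +1), b = (+1, +1, -1, +1)
    print(binary_dot(0b1011, 0b1101, 4))
```

The multiply-accumulate loop of a full-precision layer is replaced here by one XOR, one negation, and one bit count, which is the source of the speed and memory savings usually attributed to binary networks.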

The Hamming distance [

Given two

The above definition shows that the Hamming distance is the total number of different bits between
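
The definition can be made concrete with a short plaintext sketch. The function names are hypothetical. The second function uses the standard identity that, for binary vectors, the Hamming distance equals the sum of both vectors minus twice their inner product; it is included only to show that the distance can be obtained with additions and multiplications, and it is not the paper's privacy-preserving construction.

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi ^ yi for xi, yi in zip(x, y))


def hamming_distance_arith(x, y):
    """Same value using only addition and multiplication.

    For bits, xi XOR yi == xi + yi - 2 * xi * yi.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(x) + sum(y) - 2 * sum(xi * yi for xi, yi in zip(x, y))


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0]
    y = [1, 1, 0, 1, 0]
    print(hamming_distance(x, y), hamming_distance_arith(x, y))
```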

The main purpose of NLPAS is to determine whether the image on the CS meets the requirement of the MU in a privacy-preserving manner.

The main idea of NLPAS can be divided into two parts, namely, the feature extracting part and the Hamming distance computation part. For the feature extracting part, the MU first defines a binary convolutional neural network and trains it using a data set according to the user’s requirement. Then, the MU extracts a binary vector (

For the Hamming distance computation part, the MU and the CS hide the two input vectors (

Based on the feature extracting and Hamming distance computation techniques, the MU compares the Hamming distance of
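
The final decision step can be sketched as a threshold test on the Hamming distance. This is a plaintext illustration only: in NLPAS the MU never sees the CS's vector directly, and the names `audit` and `threshold`, as well as the choice of a non-strict comparison, are assumptions made for this example.

```python
def audit(mu_vector, cs_vector, threshold):
    """Return True if the CS's feature vector is close enough to the MU's.

    Assumption: an image is accepted when the Hamming distance is at most
    `threshold`; the source does not show whether the comparison is strict.
    """
    if len(mu_vector) != len(cs_vector):
        raise ValueError("vectors must have the same length")
    distance = sum(a ^ b for a, b in zip(mu_vector, cs_vector))
    return distance <= threshold


if __name__ == "__main__":
    print(audit([1, 0, 1, 1], [1, 0, 0, 1], threshold=1))
    print(audit([1, 0, 1, 1], [0, 1, 0, 1], threshold=1))
```

A smaller threshold makes the audit stricter (fewer images accepted), while a larger one tolerates more variation between the MU's reference features and the features of the stored image.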

The system model of NLPAS is shown in Figure

System model of NLPAS.

Notations in this paper.

| Notation | Description |
|---|---|
| | Binary convolutional neural network model trained by the MU |
| | Private parameters of the MU |
| | Public parameters of NLPAS |
| | Feature vector of the MU |
| | Feature vector of the CS |
| | Length of the two binary vectors |
| | Security strength of NLPAS |
| | Ciphertexts generated by the MU |
| | The first set of random numbers that |
| | The second set of random numbers that |
| | Ciphertexts generated by the CS |
| | Hamming distance of |
| | The first set of random numbers that |
| | The second set of random numbers that |
| | Threshold for determining whether the image on the CS meets the MU’s requirement |
| | Prime number for counting different bits in |
| | Prime number used as a carrier |
| | The first set of random numbers for hiding |
| | The second set of random numbers for hiding |
| | The third set of random numbers for hiding |
| | The fourth set of random numbers for hiding |
| | The first set of bases for hiding vectors |
| | The second set of bases for hiding vectors |
| | Transitional values for extracting the Hamming distance |
| | Values that contain the Hamming distance |

During this phase, the MU defines a binary convolutional neural network model (

Then, the MU sends the public parameter (

After the initialization phase, the MU holds (

When the MU wants to compute the Hamming distance

Upon receiving the ciphertext (i.e.,

After the hiding phase, the MU gets

After receiving the updated ciphertext (i.e.,

After the extracting phase, the MU gets the Hamming distance of

In the above system model, the MU’s vector

In the above system model, the CS’s vector

The construction of NLPAS is a tuple (

Construction of NLPAS.

In the above construction, NLPAS uses only a few simple mathematical operations (i.e., addition, subtraction, multiplication, division, and modulo operations) instead of time-consuming cryptographic operations such as modular exponentiation. Therefore, it enjoys high efficiency. We will further evaluate the efficiency of NLPAS in Section

In this section, we first show that NLPAS is correct and then analyze the security of NLPAS according to the security requirements described in Section

In the construction in Section

In this section, we shall show that

We start analyzing the meaning of

Second, taking the value of

Third, taking the value of

Fourth, considering all the four conditions (i.e.,

Fifth, since the length of

Sixth, since the length of

Finally, we get

This is really the total number of bits where

From the above discussion, we can see that the main idea of privacy-preserving Hamming distance computation includes two points. First, we hide the information of

The privacy-preserving requirement is to ensure that the adversary cannot extract

We first consider the privacy-preserving requirement for

From the

Moreover, since the length of

Furthermore, if the length of

We then consider the privacy-preserving requirement for

From the

Moreover, assuming the MU is the adversary who wants to extract

From the above discussion, it can be seen that the adversary cannot extract

The content privacy requirement is to ensure that the adversary cannot extract content of the interested image stored on the CS from

The auditing requirement is to ensure that the MU can determine whether the content in the image stored on the CS meets the MU’s requirement. This is ensured by the Hamming distance. If the Hamming distance of

As shown in Section

To provide a benchmark of efficiency evaluation, we used the MNIST data set [

MNIST [

LeNet [

For implementation, we used the BMXnet [

Accuracy comparison.

Model | Accuracy | Model size |
---|---|---|
Binary LeNet | 0.97 | 0.2 MB |
Full-precision LeNet | 0.99 | 4.6 MB |

From Table

The accuracy of the binary LeNet is slightly lower than that of the full-precision LeNet. The accuracy reduced by using the binary LeNet is around

The model size of the binary LeNet is much smaller than that of the full-precision LeNet. The memory saved by the binary LeNet is around

In other words, by using the binary convolutional neural network instead of the traditional full-precision convolutional neural network, the accuracy is only slightly reduced, while memory consumption is greatly reduced. Therefore, the binary convolutional neural network is well suited to the mobile crowdsourcing network, where mobile devices have limited storage resources. The above evaluation shows that NLPAS fulfills the fifth requirement listed in Section
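
The trade-off can be quantified directly from the accuracy comparison table. The sketch below is only an arithmetic check on the table's values (0.97 and 0.2 MB for the binary LeNet, 0.99 and 4.6 MB for the full-precision LeNet); the function names are hypothetical, and the results are not guaranteed to match the exact figures quoted by the authors.

```python
def accuracy_drop(full_acc: float, binary_acc: float) -> float:
    """Absolute accuracy lost by switching to the binary model."""
    return full_acc - binary_acc


def memory_saved_fraction(full_mb: float, binary_mb: float) -> float:
    """Fraction of model size saved by the binary model."""
    return 1.0 - binary_mb / full_mb


if __name__ == "__main__":
    # Values taken from the accuracy comparison table.
    drop = accuracy_drop(0.99, 0.97)
    saved = memory_saved_fraction(4.6, 0.2)
    print(f"accuracy drop: {drop:.2f}")
    print(f"memory saved: {saved:.1%}")
    print(f"size ratio: {4.6 / 0.2:.0f}x")
```

By this calculation the binary model gives up about 0.02 in accuracy while being roughly 23 times smaller, which is about 96% less memory.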

The computation cost of NLPAS includes the time consumed by the binary LeNet model and the time consumed by mathematical operations. To measure these time costs, we conducted our experiment on a laptop with an Intel i7-4770HQ processor running Ubuntu 18.04. Then, we used OpenSSL [

For the binary LeNet, we take the features extracted by the last fully connected layer as the input vectors (i.e.,

After the initial settings, we can count the mathematical operations in the hiding and extracting phases as listed in Table

Number of mathematical operations in NLPAS.

84 | 166 | 0 | |

0 | 84 | 0 | |

0 | 0 | 2 | |

0 | 0 | 2 | |

0 | 0 | 2 | |

0 | 0 | 2 | |

0 | 0 | 2 |

Then, we measured the time consumed by these mathematical operations on the above laptop, and the average results of running each of them 1,000,000 times are shown in Table

Time costs of mathematical operations (unit:

0.22 | |
---|---|

1.69 | |

2.19 | |

0.98 | |

1.03 | |

0.16 | |

0.14 |

Taking the results in Table

Computation costs of algorithms in NLPAS (unit:

18.48 | 178.48 | 9.00 |
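
The per-algorithm costs above can be reproduced by multiplying each operation count by its average time and summing per column. The sketch below does this with the numbers from the two preceding tables. The row and column labels were lost in extraction, so the rows are kept in table order and are assumed to correspond one to one between the two tables; the names `OP_COUNTS`, `OP_TIMES`, and `algorithm_costs` are hypothetical.

```python
# Operation counts: one row per operation type, one column per algorithm.
OP_COUNTS = [
    [84, 166, 0],
    [0, 84, 0],
    [0, 0, 2],
    [0, 0, 2],
    [0, 0, 2],
    [0, 0, 2],
    [0, 0, 2],
]

# Average time per operation, in the same row order and in the table's unit.
OP_TIMES = [0.22, 1.69, 2.19, 0.98, 1.03, 0.16, 0.14]


def algorithm_costs(counts, times):
    """Total cost per algorithm: sum over operations of count * time."""
    n_algorithms = len(counts[0])
    return [
        round(sum(row[j] * t for row, t in zip(counts, times)), 2)
        for j in range(n_algorithms)
    ]


if __name__ == "__main__":
    print(algorithm_costs(OP_COUNTS, OP_TIMES))
```

The three totals agree with the values 18.48, 178.48, and 9.00 reported in the table, which supports the assumed row correspondence.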

The time costs of the binary LeNet and the full-precision LeNet are shown in Table

Computation costs of the binary LeNet and the full-precision LeNet (unit:

Binary LeNet | Full-precision LeNet |
---|---|
56.6 | 1435.2 |
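
To put the two measurements in proportion, the short sketch below computes the speedup of the binary LeNet over the full-precision LeNet from the table's values. The function name `speedup` is hypothetical, and the calculation is independent of the table's time unit because it is a ratio.

```python
def speedup(baseline_time: float, improved_time: float) -> float:
    """How many times faster the improved model is than the baseline."""
    if improved_time <= 0:
        raise ValueError("improved_time must be positive")
    return baseline_time / improved_time


if __name__ == "__main__":
    # Values taken from the table: full-precision LeNet vs. binary LeNet.
    print(f"{speedup(1435.2, 56.6):.1f}x")
```

By this calculation the binary LeNet runs roughly 25 times faster than the full-precision LeNet on the authors' test machine.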

The above evaluation shows that NLPAS fulfills the fourth requirement listed in Section

To verify that NLPAS works in practice, we implemented it. Our experimental environment consisted of one laptop and one computer, with the laptop acting as the MU and the computer acting as the CS. The results show that the total running time of the auditing protocol is approximately 0.3 ms. Therefore, NLPAS is feasible to deploy in the real world.

In this paper, we have proposed a noninteractive lightweight privacy-preserving auditing protocol on images in mobile crowdsourcing networks called NLPAS. NLPAS allows the mobile user to audit images stored on the crowdsourcing server without downloading them. Moreover, to achieve high efficiency, this paper introduced the binary convolutional neural network technique to the newly proposed auditing protocol and designed a novel privacy-preserving Hamming distance computation algorithm using basic mathematical operations. Experimental results show that NLPAS is feasible for real-world applications.

In this paper, we mainly focused on the privacy-preserving issue of the newly designed auditing protocol for mobile crowdsourcing networks. However, several issues remain to be addressed in the future. First, NLPAS does not consider the integrity of transmitted messages. Therefore, a new security protocol is needed to prevent these messages from being tampered with by adversaries. Second, NLPAS uses the binary convolutional neural network to extract a binary vector from images. However, in many scenarios, feature vectors may be extracted using full-precision neural networks, which are not binarized. Therefore, a new technique is needed to convert a full-precision feature vector into a binarized one. Future work is needed to address these issues.

The data used to support the findings of this study are available at

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This paper was supported by the NSFC (nos. 71402070 and 61101088), the NSF of Jiangsu Province (no. BK20161099), and the Jiangsu Provincial Key Laboratory of Computer Network Technology.