Remote Attestation on Behavioral Traces for Crowd Quality Control Based on Trusted Platform Module

Behavioral traces of workers have emerged as new evidence for checking the quality of their produced outputs in crowd computing. Whether this evidence is trustworthy is a key problem, and addressing it is challenging because the evidence comes from unknown or adversarial workers. In this study, we propose an approach that ensures trustworthy evidence through hardware-based remote attestation to bridge this gap. The integrity of the evidence is used as the trustworthiness criterion. Inspired by trusted computing, a Trusted Platform Module (TPM) is used as the trust anchor to guard against unreliable or malicious workers. The module records and stores workers' behavioral traces in a storage measurement log (SML). Each item in the log is extended, in the order of event occurrence, into a platform configuration register (PCR), a tamper-proof storage location inside the TPM. The PCR value, together with the SML, constitutes the evidence, which is sent to the crowdsourcing platform with a TPM signature. The platform checks the integrity of the evidence through a series of operations, such as validating the signature and recomputing the SML hash. This process is designed as a remote attestation protocol. The effectiveness, efficiency, and security of the protocol are verified theoretically and through experiments based on the open dataset WebCrowd25K and a custom dataset. Results show that the proposed method is a viable solution for ensuring the integrity of behavioral traces.


Introduction
In China, crowdsourcing has progressed rapidly in various fields in the past years. Zhubajie (http://www.zbj.com) has established itself as a crowdsourcing leader with more than 22 million active workers. It covers a range of online and offline services, including tutoring and logo and product design. Didi Chuxing (DiDi) (https://www.didiglobal.com) is another representative crowdsourcing platform. DiDi is China's leading mobile transportation platform, providing a full range of app-based transportation services, including taxi, express, premier, deluxe, bus, designated driving, enterprise solutions, bike sharing, e-bike sharing, automobile solutions, and food delivery. Tens of millions of drivers find flexible work opportunities on the DiDi platform.
This platform provides more than 10 billion passenger trips annually. Crowdsourcing has become a fast, convenient, and cost-effective mode of research and production for obtaining flexible and cheap resources. Many organizations flexibly outsource work, such as collaborative sensing [1,2] and human-powered online security [3,4], to a global pool of workers on a temporary basis. However, quality control remains a challenge in crowdsourcing systems, because the crowd is typically composed of people with unknown and extremely diverse abilities, skills, interests, personal objectives, and technological resources.
In crowdsourcing systems, tasks are posted to a crowdsourcing platform and are solved by a large group of workers with diverse characteristics. Amazon Mechanical Turk (MTurk) [5] is a successful crowdsourcing system that enables individuals or companies to harness collective intelligence from a global workforce to accomplish various tasks, known as human intelligence tasks (HITs). Employers (known as requesters) recruit employees (known as workers) to execute HITs, evaluate the outputs produced by the workers, and reward them depending on quality. The produced outputs may also be checked directly and automatically by the crowdsourcing platform when requesters delegate responsibility for quality control. Rewards may be monetary or nonmonetary, such as gifts and reputation.
Quality control is important and challenging in crowdsourcing because of the heterogeneous nature of workers [6,7]. Output quality depends on the profiles and abilities of workers, the description of tasks, the incentives provided, the processes implemented to detect malicious behavior or low-quality data, and the reputation of and collaboration with the requester. Therefore, output quality is related to many dimensions. In the current study, only the workers' dimension is considered. Although most workers perform tasks in good faith and are competent to deliver quality outputs, not all outputs are of high quality. Ensuring the quality of outputs remains challenging, and many worker-based algorithms have been proposed for quality control. Gold data are widely applied in this scenario. One approach is to adopt a qualification test, in which a worker is given questions whose answers are already known, to determine his or her performance. The well-known crowd platform CrowdFlower uses this method to estimate the quality of workers [8]. Another approach is to mix gold tasks into the tasks assigned to workers. Unlike in the former, workers do not know which tasks are gold tasks, and they do not take a test on their first arrival. Both approaches require the ground truth of a subset of tasks to be known in advance, and they raise several questions. How much better and faster can we estimate the quality of workers when we use gold data? How much does quality estimation improve as more gold data are added? Generating high-quality gold data is expensive, so these methods may be costly, because they add extra operations to crowdsourcing.
A recent tendency is to leverage workers' behavioral traces to estimate the quality of outputs. This method attempts to capture, as a series of events, the process that workers follow to perform tasks. The approach has advantages over the aforementioned methods. First, the monitoring program is almost invisible to the workers, because it runs in the background; therefore, fraudsters who attempt to evade anticheating checks will be foiled. Second, the time and cost of conducting experiments can be reduced, because the monitoring does not introduce extra or redundant questions. Existing methods based on workers' behavioral traces have mainly focused on which metrics should be collected and on detection methods based on those metrics. Limited research has been conducted on trustworthy behavioral traces, which are a precondition for making correct decisions.
In this study, we propose an approach that ensures trustworthy behavioral traces with trusted computing to bridge this gap. In particular, integrity is considered the trust criterion; in the current study, the terms integrity, trust, and authenticity are used interchangeably. A Trusted Platform Module (TPM) embedded on the motherboard of a computer is considered the trust anchor to avoid human factors. It faithfully records and reports workers' behavior. The behavioral traces are stored in a storage measurement log (SML). Each behavioral trace is extended, in the order of its occurrence, into a tamper-proof register called a platform configuration register (PCR). The PCR value combined with the SML is considered the evidence. When reporting the evidence, the TPM produces a signature using its private key to ensure the integrity of the behavioral traces. The crowdsourcing platform can check this assurance through a series of operations, such as validating the signature and recomputing the SML hash. This process is designed as a secure remote attestation protocol (SRAP) to ensure that the task is done according to the expected behavior.
This study investigates a fundamental problem: ensuring the integrity of the evidence in crowdsourcing quality control based on workers' behavioral traces. The principal contributions of this work are summarized as follows: (1) The problem is formulated as a remote attestation protocol (RAP) on the basis of an assumption, a threat model, and attack types. The security of the protocol is defined through an adversary experiment. (2) A specific SRAP, SRAP-I, is designed between a worker and a crowdsourcing platform. The protocol inventively uses one TPM bound to a physical computer as the trust anchor to resist attacks from malicious workers or malware. To the best of our knowledge, this paper is the first to address the issue of ensuring the integrity of workers' behavioral traces in crowdsourcing quality control. The remainder of this paper is organized as follows. Section 2 explores the related work. Section 3 formulates the problem. Section 4 describes the specific SRAP, SRAP-I, in detail and presents the mathematical proof of its security. Section 5 discusses the ability to resist attacks. Section 6 conducts the experimental evaluation. Section 7 provides the conclusions.

Related Work
The combination of humans and computers to accomplish tasks that neither can perform alone has attracted considerable attention from academic and industrial circles [9].
This idea dates back to the 1960s, with the publication of "Man-Computer Symbiosis" by Licklider [10]. Tim Berners-Lee proposed the concept of a social machine in 2009 and regarded the cooperation between machines and humans as the next direction of web application development [11]. The term "crowdsourcing" was coined by Jeff Howe in 2006 [12]. MTurk is a pioneering crowdsourcing system that provides on-demand access to a workforce for microtasks that are easy for humans but remain difficult for computers. The outputs produced by crowd workers contain much noise, for several reasons. First, workers might have different skill levels that are sometimes insufficient to complete the tasks. Second, they might have varied and biased interests and incentives. Finally, some malicious workers intentionally provide incorrect answers. Many other factors, such as task description and data quality, also affect the quality of outputs. Because of these features, quality control has been widely investigated as new technologies have emerged.
Substantial research efforts have been exerted to develop methods for detecting and correcting low-quality work to improve the overall quality of the resulting data. State-of-the-art quality control can be categorized into three classes, namely, individual, group, and computation-based, in accordance with the literature [6]. The current study mainly focuses on fingerprinting, which is a computation-based method. This method captures behavioral traces from workers during task execution and uses them to predict the quality, errors, and likelihood of cheating. Rzeszotarski and Kittur first proposed a method in which the interactions of a crowd worker with a task interface, such as clicks, scrolls, and key presses, are logged and then correlated with his or her accuracy [13]. They demonstrated the effectiveness of the approach in predicting output quality. As an extension, the authors in [14] proposed a system called CrowdScape that supports the human evaluation of complex crowd work through interactive visualization and mixed-initiative machine learning. Heymann and Garcia-Molina presented a novel analytic tool called "Turkalytics" for human computation systems. Turkalytics processes and reports logging events from workers in real time and has been shown to scale to more than 100,000 logging events per day [15]. Kazai and Zitouni collected behavioral data from crowd workers and experts through search relevance judging. They then trained a classifier to detect poor-quality work on the basis of the behavioral data and concluded that accuracy approximately doubles in several tasks when gold behavior data are used [16]. Dang et al. built a framework called MmmTurkey that tracks worker activity [17]. On the basis of their framework, they collected and shared a new MTurk dataset of behavioral signals in judging the relevance of search results [18], and they concluded that behavioral data can be effectively used to predict work quality using their prediction model.
Gadiraju et al. studied the relation between behavioral data and performance of workers in microtasks.
They demonstrated that behavioral traces are effective in selecting workers using a novel model [19]. Mok et al. applied a method based on the behavioral data of workers in Quality of Experience studies to detect low-quality workers [20]. Although trustworthy behavioral traces are a precondition for the abovementioned methods, this point is not explicitly stated, and to the best of our knowledge, no direct research has been reported on this issue.
To bridge this gap, this study focuses on the issue and proposes a solution inspired by trusted computing. Trusted computing aims to develop technologies that assure users about the behavior of the software running on their devices. In particular, a device can be trusted when it consistently behaves in the expected manner for the intended purpose [21]. Although software-based trusted computing architectures with interesting results have been proposed [22], they can only be used in limited settings and cannot provide the same security guarantees as hardware-based architectures. An important goal of trusted computing is to protect against attackers who gain full control over the system; that is, any application and operating system (OS) can be exploited. Hardware-based architectures protect applications from a malicious OS. No software-only solution can provide these guarantees, because an attacker can continuously manipulate the software when the OS is untrusted, whereas an attacker cannot modify hardware functionality when it is considered immutable. Therefore, a user's trust should be rooted in the hardware, and only hardware-based architectures are considered in this study.
Hardware-based remote attestation mainly depends on a secure chip, the TPM. It is a hardware cryptographic module consisting of an execution engine, volatile memory, and nonvolatile storage. The engine is designed for hash algorithms, Rivest-Shamir-Adleman (RSA) key generation, encryption, signing, and random number generation. The chip has a set of special registers called PCRs, which can be classified into two groups, namely, static and dynamic PCRs, in accordance with their initial value and the time at which they can be reset. Static PCRs, PCR 0-16, are reset to 0 on system reboot. Dynamic PCRs, PCR 17-23, are initialized to −1 at reboot and to 0 at run-time. Both kinds of PCR can only be updated through the extend function, which aggregates the current content of a PCR with new content, hashes them, and writes the result back to the PCR. This promising technique provides two important services, namely, secure storage and platform attestation. In recent years, we have conducted several studies [23][24][25][26] on platform attestation and its application. As another technical branch addressing the issue, a trusted environment can be isolated in the CPU [27,28].
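The extend function described above can be sketched as a short simulation. This is a minimal illustration using SHA-256 in plain Python; a real TPM performs the operation inside the chip and supports several hash banks, so the code is a conceptual model, not a TPM interface:

```python
import hashlib

def pcr_extend(pcr: bytes, event: bytes) -> bytes:
    """Extend a PCR: new_pcr = H(old_pcr || H(event)).

    The register never holds an event directly; it aggregates the
    whole event sequence into one digest, so order matters.
    """
    event_digest = hashlib.sha256(event).digest()
    return hashlib.sha256(pcr + event_digest).digest()

# A static PCR is reset to all zeros on system reboot.
pcr = bytes(32)
for event in [b"login", b"click", b"submit"]:
    pcr = pcr_extend(pcr, event)
```

Because each new value depends on the previous one, replaying the same events in a different order yields a different final digest, which is what lets a verifier detect reordering or omission of events.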
The abovementioned studies show that research on quality control has paid increasing attention to the behavioral traces of workers. However, limited research has been conducted on verifying the authenticity of behavioral traces. Although hardware-based remote attestation has achieved much in other areas, it has seen limited application in crowdsourcing, especially on the current issue. These results provide references and guidelines for the current research on trustworthy behavioral traces of crowd workers.

Problem Formulation
This study considers the issue of ensuring the authenticity of crowd workers' behavioral traces, which are used to identify low-quality answers in crowdsourcing quality control. In the crowdsourcing scenario, employers (called requesters) recruit employees (workers) who complete tasks and earn wages (rewards). The behavioral traces of crowd workers are considered the evidence for detecting low-quality outputs. In this section, this scenario is first modeled to clearly present the current work. Then, we formulate the problem as a RAP and discuss its security, because the final goal is to design a secure RAP for detecting fake behavioral evidence. We also present a threat model based on several assumptions. Figure 1 illustrates the basic structure of the proposed model, in which the behavioral traces of crowd workers are used to estimate the quality of outputs. In this scenario, requesters submit tasks to the crowdsourcing platform and receive the answers from the platform. The platform assigns tasks to workers using an allocation strategy. Workers perform the tasks and return the outputs to the platform. The behavioral traces are captured during the completion of tasks and returned to the platform, which needs them to estimate the output quality. However, the authenticity of the evidence must be ensured before making a decision; in this study, the authenticity of the evidence refers to its integrity. Three key links, namely, collecting, reporting, and verifying evidence, must be addressed to obtain authentic evidence. The entire process is formalized as a RAP on the basis of a challenge-response mechanism, as shown in Definition 1.

Problem Definition.
Definition 1. RAP. A RAP, a triple (Req, Res, Ver), consists of three polynomial-time algorithms: (1) The challenge-generation algorithm Req takes a random n as input and outputs a challenge c. We write this algorithm as c ← Req(n) because Req may be randomized. In the proposed model, the algorithm is executed by the crowdsourcing platform.
(2) The response-generation algorithm Res takes the target's state s and challenge c as inputs and outputs a response r. The algorithm includes the collection of the evidence and its report.
(3) The verification algorithm Ver takes the received response r as input and outputs an authentication token τ ∈ {0, 1}; τ = 1 when the target's state s corresponds to the expected value, and τ = 0 otherwise. We write this algorithm as τ := Ver(r) because Ver is deterministic. In the current scenario, this step is completed by the crowdsourcing platform, although requesters may also execute the algorithm.
Definition 2. Exp_Λ^Π. An adversary Λ submits one state s′ ≠ s or one challenge c′ ≠ c, accesses r ← Res(s, c), and outputs a response r′. The output of the experiment is defined as 1 when r′ = r; otherwise, it is 0. We write Exp_Λ^Π = 1 when the output is 1 and say that adversary Λ succeeds in this case.
Definition 3. SRAP. A RAP is secure when there exists a negligible function negl such that, for any polynomial-time adversary Λ and sufficiently large n, Pr[Exp_Λ^Π(n) = 1] ≤ negl(n).
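The (Req, Res, Ver) triple can be sketched as follows. This is a hypothetical simulation in which an HMAC key stands in for the TPM's attestation identity key; the function and variable names are illustrative assumptions, not the protocol's actual implementation:

```python
import hashlib
import hmac
import secrets

def req(n: int = 16) -> bytes:
    """Req: the challenger draws a fresh random challenge."""
    return secrets.token_bytes(n)

def res(state: bytes, challenge: bytes, key: bytes) -> tuple:
    """Res: bind the target's state to the challenge.

    The HMAC tag stands in for the TPM's AIK signature in this sketch.
    """
    tag = hmac.new(key, state + challenge, hashlib.sha256).digest()
    return (state, tag)

def ver(response: tuple, challenge: bytes, key: bytes,
        expected_state: bytes) -> int:
    """Ver: deterministic check; returns tau in {0, 1}."""
    state, tag = response
    good = hmac.new(key, state + challenge, hashlib.sha256).digest()
    if hmac.compare_digest(tag, good) and state == expected_state:
        return 1
    return 0
```

In terms of Definition 2, an adversary who submits a forged state or a different challenge cannot produce a valid tag without the key, so Ver outputs 0 except with negligible probability.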

Threat Model.
To design a secure RAP based on the trust anchor, that is, the TPM, we assume that adversaries have the following ability: (1) adversaries can eavesdrop on, copy, and replay messages transmitted on the channels. Two types of attack may be launched on this basis. In the first, adversaries simulate Res(s, c) and correctly compute its output r. In the second, adversaries produce an r that does not correctly reflect (s, c). In both cases, adversaries attempt to escape detection by the SRAP.

SRAP-I
In this section, a specific secure RAP called SRAP-I is designed using the TPM for the scenario of quality control based on workers' behavioral traces in crowd computing. We first describe the protocol in detail and then prove its security.

Protocol Description.
The protocol includes three phases, namely, integrity measurement, integrity report, and integrity verification.

Integrity Measurement.
Interactions occur during the performance of a crowd task and involve two participants, namely, the agent and the TPM. The agent is responsible for recording the worker's behavior, and the TPM is responsible for the secure storage of these behavioral data. The integrity evidence of the worker's behavior is collected and securely stored in the 20th PCR, PCR20, by the end of this phase. Figure 2 shows the interaction steps. The agent first sends the TPM command TPM2_PCR_Read to obtain the initial value of PCR20 and writes it to the SML. Second, whenever the worker performs an action, the agent writes an event, e_i, into the storage measurement log, SML, which is a plain text file.
Then, the agent sends the TPM command TPM2_PCR_Extend with event e_i to extend PCR20. These two operations are executed repeatedly until the crowd worker completes the task.
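The agent's measurement loop might look like the following sketch. The TPM commands TPM2_PCR_Read and TPM2_PCR_Extend are modeled here as plain Python operations, and the SML as an in-memory list; a real agent would issue the commands through a TPM software stack, so this is an assumed simplification:

```python
import hashlib

def extend(pcr: bytes, event: bytes) -> bytes:
    """Model of TPM2_PCR_Extend on a SHA-256 PCR bank."""
    return hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()

def measure(events):
    """Agent-side integrity measurement.

    1. Read the initial PCR20 value and record it in the SML.
    2. For each behavioral event: append it to the SML, then extend PCR20.
    """
    pcr20 = bytes(32)                      # model of TPM2_PCR_Read after reboot
    sml = [("initial_pcr", pcr20.hex())]
    for i, event in enumerate(events):
        sml.append((f"event_{i}", event.decode()))
        pcr20 = extend(pcr20, event)       # model of TPM2_PCR_Extend
    return sml, pcr20

sml, pcr20 = measure([b"mousedown", b"scroll", b"keypress"])
```

The SML keeps the readable event history, while PCR20 holds a single digest that commits to the entire event sequence and its order.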

Integrity Report.
This phase occurs whenever the challenger, the crowdsourcing platform, needs to evaluate the crowd worker. The process involves three participants, namely, the crowdsourcing platform, the agent, and the TPM; Figure 3 shows the interactions among them. The process is a challenge-response protocol. The crowdsourcing platform first sends a random challenge, nonce, to the agent. Then, the agent sends the TPM command TPM2_Quote with parameters nonce and PCR20 to the TPM to generate a cryptographic report, quote, of PCR20. Next, the TPM signs the message that includes the value of PCR20 and nonce using the private part of the attestation identity key, AIK_sk, and returns the result, quote, to the agent. Finally, the agent sends a response, r, that includes quote, nonce, SML, PCR20, and the public key certificate of the attestation identity key, AIK_pk, to the crowdsourcing platform.
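A hedged sketch of the report phase follows. An HMAC stands in for the RSA signature that TPM2_Quote produces with AIK_sk, and the dictionary field names are assumptions chosen for illustration:

```python
import hashlib
import hmac

AIK_SK = bytes.fromhex("aa" * 32)   # hypothetical AIK private part (symmetric stand-in)

def tpm2_quote(nonce: bytes, pcr20: bytes, aik_sk: bytes = AIK_SK) -> bytes:
    """Model of TPM2_Quote: sign (nonce || PCR20) with the AIK."""
    return hmac.new(aik_sk, nonce + pcr20, hashlib.sha256).digest()

def build_response(nonce: bytes, pcr20: bytes, sml: list) -> dict:
    """Agent-side assembly of the response r sent back to the platform."""
    return {
        "quote": tpm2_quote(nonce, pcr20),
        "nonce": nonce,
        "sml": sml,
        "pcr20": pcr20,
        "aik_pk_cert": "<AIK public key certificate>",
    }
```

Because the nonce is covered by the quote, a response is bound to one specific challenge; replaying it against a fresh challenge fails verification.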

Integrity Verification.
This process only involves the crowdsourcing platform. The platform first checks the signature of response r using the public key of the TPM, AIK_pk, to determine whether it comes from a genuine TPM. Next, it checks the nonce and the integrity of the SML by rehashing all its items and comparing the result with the value of PCR20.
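The platform-side checks can be sketched together as follows. This is a self-contained simulation in which an HMAC key stands in for the AIK signature and raw event bytes stand in for SML items; a real verifier would parse the TPM quote structure and verify an RSA signature against the AIK certificate:

```python
import hashlib
import hmac

def verify_response(resp: dict, expected_nonce: bytes, aik_key: bytes) -> int:
    """Platform-side integrity verification (simulated).

    Returns 1 only if all three checks pass:
      (a) the quote signature is valid,
      (b) the nonce is the fresh one the platform issued,
      (c) re-extending every SML item reproduces PCR20.
    """
    good_quote = hmac.new(aik_key, resp["nonce"] + resp["pcr20"],
                          hashlib.sha256).digest()
    if not hmac.compare_digest(resp["quote"], good_quote):
        return 0                                   # (a) fails
    if resp["nonce"] != expected_nonce:
        return 0                                   # (b) fails
    pcr = bytes(32)
    for event in resp["sml"]:
        pcr = hashlib.sha256(pcr + hashlib.sha256(event).digest()).digest()
    return 1 if pcr == resp["pcr20"] else 0        # (c)
```

Check (c) is what ties the human-readable SML to the tamper-proof register: any edit, deletion, or reordering of SML items produces a digest that no longer matches the signed PCR20.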

Theorem 1. SRAP-I is a TPM-based secure RAP with respect to SRAP.
Proof. As previously discussed, an adversary succeeds when it constructs a response r′ = r by taking s′ ≠ s or c′ ≠ c as input. Here, only the integrity report is discussed, because the two other phases occur on one computer and we assume that the TPM is reliable.
In the integrity report, c is the random nonce; s is the message {quote, PCR20, SML, AIK_pk certificate}; r corresponds to the message {quote, nonce, SML, PCR20, AIK_pk certificate}; the part quote is the message {nonce, PCR20} signed with AIK_sk; and AIK_sk is the private part of the attestation identity key of the TPM.
We consider two cases as follows: (1) c′ ≠ c. In this case, the adversary might generate a new random value or reuse an old one issued by the challenger, the crowdsourcing platform. The adversary obtains r′ = {quote′, nonce′, SML, PCR20, AIK_pk certificate}, and apparently r′ ≠ r. The adversary may attempt the attack by replacing nonce′ with nonce, because nonce′ is plaintext. However, the attack is then detected, because the message quote′ includes nonce′, and quote′ cannot be updated without the attestation identity key of the TPM. Similarly, the case of reusing an old random value cannot generate an r′ equal to r.
(2) s′ ≠ s. In this case, the adversary can edit PCR20 and the SML, because these two parts are unprotected. However, PCR20 is embedded in quote and cannot be updated without the attestation identity key of the TPM. Therefore, the adversary cannot construct a fresh s′ ≠ s that makes r′ = r true.

Security Analysis
In this section, we analyze the protocol's resistance to attacks, including the replay attack, masquerading attack, tampering attack, malicious agent, and software attack on the TPM.

Replay Attack.
In a replay attack, the attacking system replays old messages, and two cases are considered. In the first case, an adversary impersonates the crowdsourcing platform and replays old messages. In the second case, an adversary impersonates the worker and launches replay attacks. Figure 4 shows the interactions between the adversary and the worker. In this case, the adversary replays an old random value and obtains an old response, because the protocol does not authenticate the crowdsourcing platform. In other words, the adversary succeeds in performing the attack. However, the attack does not cause the genuine crowdsourcing platform to make an incorrect decision about the integrity of the behavioral evidence, and the adversary cannot benefit from it. Figure 5 shows the interactions between the adversary and the crowdsourcing platform. In this case, the adversary replays an old response, which is an old message {quote, nonce, SML, PCR20, AIK_pk certificate}. The crowdsourcing platform can detect this attack, because it checks the freshness of the response through the random nonce. Therefore, the adversary fails.

Masquerading Attack.
In a masquerading attack, the adversary masquerades as the crowdsourcing platform or the crowd worker. The adversary can interact correctly with the crowd worker when impersonating the crowdsourcing platform, because the crowd worker does not check the identity of the crowdsourcing platform before building the session, as shown in Figure 6. In this case, the adversary can obtain a fresh response, {quote, nonce, SML, PCR20, AIK_pk certificate}, which is meaningless to the adversary. Figure 7 shows the scenario in which the adversary plays the crowd worker and the interactions between the adversary and the platform. This case may occur because the adversary might be a legal crowd worker who also has a valid TPM. However, that TPM differs from the TPM of the crowd worker with whom the crowdsourcing platform wants to talk. The crowdsourcing platform can detect the attack by checking the certificate of the attestation identity key.

Tampering Attack.
In a tampering attack, the adversary tampers with the response, {quote, nonce, SML, PCR20, AIK_pk certificate}. In this case, quote cannot be tampered with, because the adversary does not have the corresponding private key. Thus, the adversary can only tamper with nonce, PCR20, or the SML. The crowd platform will detect tampering with nonce or PCR20 by checking quote, and it will detect tampering with the SML by rehashing all its items and comparing the result with PCR20.

Malicious Agent.
This attack occurs when the agent does not run with the expected behavior; in other words, the agent has been tampered with and cannot strictly follow the protocol steps. In this case, the adversary attempts to rewrite incorrect behavioral traces into correct-looking ones, because only a correct result is meaningful for him or her: the adversary wants to obtain the corresponding pay. However, three difficulties arise. First, predicting the correct behavioral traces for solving a task correctly is difficult. Second, the attack can be prevented by checking the platform configurations stored in the PCRs; the advantage of this method is that other malicious software can also be found, whereas the disadvantage is that it results in a high traffic load. Third, modifying the software is difficult without the corresponding source code. The adversary may prefer to solve the task rather than decompile the software.

Software Attack to TPM.
This attack involves resetting PCR20 without rebooting the system and pushing known good values into PCR20. The adversary must redo all operations in the known sequence to mount the attack successfully; therefore, mounting the attack is meaningless for the adversary. By contrast, the crowdsourcing platform expects the task to be performed according to the current sequence of operations. The adversary could break the current protocol if the private part of the attestation identity key were retrieved through a software attack on the TPM; the adversary would use the key to sign {nonce, PCR20} and could generate a correct response. However, this is extremely difficult, because the private key is stored inside the TPM. Thus, the current protocol can resist the replay attack, masquerading attack, tampering attack, malicious agent, and software attack on the TPM.

Performance Evaluation
In this section, we first introduce two datasets. The first is an open, real dataset of crowd workers' behavioral traces. The second is a custom dataset that records the behavioral traces of students playing an intellectual game. A prototype implementation of SRAP-I is then described, and we conduct experiments on the prototype using the selected datasets. The evaluation includes two sets of experiments. In the first set, we test the runtime of the three main phases, namely, integrity measurement, integrity report, and integrity verification, using the real dataset. In the second set, we study their scalability using the custom dataset with different file sizes. The experiments were conducted on a computer system whose hardware includes an Intel(R) Core(TM) i7-9700 (3.00 GHz) processor, 16.0 GB of RAM, and a TPM 2.0 device, and whose software includes the Windows 10 operating system and the TPM software stack.

Dataset of Behavioral Traces.
In the current study, two datasets are used to evaluate the current protocol. The first is an open, real dataset of crowd workers' behavioral traces during relevance judging of search results, covering 106 unique workers and 3,984 HITs. The dataset, WebCrowd25K, includes three related parts [29,30], among them the cases where the aggregated crowd judgment differs from the original TREC assessor judgment, with each disagreement classified in accordance with the disagreement taxonomy.
In the current study, the behavioral data contained in a JSON file are used. The file contains (key, value) pairs: each key is a mapping ID, and its value describes the behavioral data. The mapping ID must be matched to the "mapping" column of "crowd_judgements.csv", which records the crowd relevance responses for 25k judgments, to establish a mapping between HITs and their corresponding worker behavior data.
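The join between behavior records and HITs might be performed as in the following sketch. The file "crowd_judgements.csv" and its "mapping" column come from the dataset description above, while the function name and in-memory layout are illustrative assumptions:

```python
import csv
import json

def match_behavior(json_path: str, csv_path: str) -> dict:
    """Pair each HIT judgment row with its worker behavior record.

    The JSON file maps mapping IDs to behavior data; the CSV's
    "mapping" column carries the same IDs.
    """
    with open(json_path) as f:
        behavior = json.load(f)
    matched = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            mapping_id = row["mapping"]
            if mapping_id in behavior:
                matched[mapping_id] = (row, behavior[mapping_id])
    return matched
```

Each matched entry then carries both the relevance judgment and the behavioral trace for one HIT, which is the pairing needed for the integrity experiments.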
The second dataset was collected through an intellectual game that tests students' creativity. The game asks the subjects to solve problems embedded in game scenarios, and the program records their behavior while they solve the problems. The dataset includes 3,000 files, each corresponding to one student. Most of the files range from approximately 100 KB to 150 KB in size. We synthesized four files with sizes of 1, 10, 100, and 1,000 KB to satisfy the requirements of the scalability experiments.

Implementation of Prototype.
The prototype is implemented on the basis of Microsoft's TPM software stack implementation for the Java language. The library provides an abstraction for Windows or a simulator for TPM 2.0 devices. In practice, we use a real TPM 2.0 device to give readers a clear picture of the performance of the current protocol. The prototype can load behavioral traces from the JSON file. The main functions are contained in the Java class TrustAgent, which is implemented with the decorator pattern based on a Java class, Samples, that belongs to the library. TrustAgent provides three main functions, namely, integrity measurement, integrity report, and integrity verification. In the integrity measurement, the JSON behavioral traces are first loaded from an external file. Then, the traces are individually hashed and extended into the PCR by the TPM command TPM2_PCR_Extend, and the SML is generated. In the integrity report, the signed quote is obtained by the TPM command TPM2_Quote with the parameters RSA signing key, signing scheme, PCR index, and random challenge, nonce.
Then, we obtain the PCR value using the TPM command TPM2_PCR_Read. Finally, the response, comprising quote, the PCR value, the SML, and the public part of the RSA signing key, is generated and passed to the integrity verification. In the integrity verification, the nonce and the PCR value are first compared with the corresponding parts of the quote. Then, the signature is verified with the public key. Finally, we recompute the hash value of the items in the SML and compare it with the given PCR value.

Experimental Results.
In the first set of experiments, we observed the runtimes of integrity measurement, integrity report, and integrity verification on the real dataset. A total of 3,984 files of behavioral traces were used to test the runtime. Figure 8 presents the results for the runtime of integrity measurement. Figure 8(a) shows that the operation can be completed in 3 ms for most workers' behavioral traces, and Figure 8(b) shows that the measurement of 99.37% of workers' behavioral traces can be completed in 5 ms. The measurement of the first worker's behavioral traces takes considerably longer, 217 ms, because building the session in the TPM is time-consuming. Figure 9 presents the results for the runtime of integrity report. Figure 9(a) shows that the operation can be completed in 18 ms for most workers, and this holds for 97.54% of workers, as shown in Figure 9(b). Figure 10 shows that 99.87% of integrity verifications complete quickly; thus, the operation will not impose excessive overhead on the crowdsourcing platform. The integrity report is the most time-consuming of the three key operations.
In the second set of experiments, we observed the runtimes of integrity measurement, integrity report, and integrity verification as a function of file size using the custom dataset. This experiment mainly aims to test the scalability of the three main operations. Figure 11(a) presents the results, and Figure 11(b) captures the details of the runtimes of integrity report and integrity verification. Two observations are obtained. First, integrity measurement is the most time-consuming operation, and integrity verification takes the least time. Second, the runtime of integrity measurement increases continuously with file size, unlike the other two operations. Therefore, the scalability of the algorithm is mainly determined by integrity measurement.

Conclusion
Crowd behavioral traces have been considered as evidence for estimating work quality in crowd computing; therefore, the integrity of this evidence must be ensured. In this study, we propose a RAP that attests the integrity of the evidence to the crowdsourcing platform. We assume that crowd workers are economically motivated adversaries who misbehave to obtain higher pay from the crowdsourcing platform. The TPM, a hardware module embedded on the motherboard, is used as the trust anchor to guard against these adversaries. The protocol includes three phases, namely, integrity measurement, integrity report, and integrity verification. The protocol's security is proven and analyzed, and its performance is evaluated on the real and custom datasets. The experimental evaluation shows that the protocol is a viable solution for ensuring the integrity of the behavioral evidence. The protocol is secure with respect to the SRAP definition and resists attacks, including the replay attack, masquerading attack, tampering attack, malicious agent, and software attack on the TPM. The protocol is effective for real crowd scenarios. The conclusion that integrity measurement takes considerable time as file size increases is acceptable in practice, because the runtime is only 2 s for a 1 MB file. The long-term effects of the protocol in real crowdsourcing platforms will be evaluated in future studies.
Data Availability
The data underlying this article are included within the article.

Disclosure
Any opinions, findings, conclusions, and recommendations expressed in this publication are from the authors and do not necessarily reflect the views of the sponsors.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.