Novel Defense Schemes for Artificial Intelligence Deployed in Edge Computing Environment

The last few years have seen the great potential of artificial intelligence (AI) technology to efficiently and effectively deal with the deluge of data generated by Internet of Things (IoT) devices. If all of this data is transferred to the cloud for intelligent processing, it not only places considerable strain on network bandwidth but also fails to meet the needs of AI applications that require fast, real-time responses. To meet this requirement, mobile or multiaccess edge computing (MEC) is receiving a substantial amount of interest, and its importance is becoming more prominent. However, with the emergence of edge intelligence, AI also suffers from serious security threats to AI model training, AI model inference, and private data. This paper provides three novel defense strategies to tackle malicious attacks in these three aspects. First, we introduce a cloud-edge collaborative antiattack scheme that realizes reliable incremental updating of AI by ensuring the security of the data generated in the training phase. Furthermore, we propose an edge-enhanced defense strategy based on an adaptive traceability and punishment mechanism to effectively and radically solve the security problem in the inference stage of the AI model. Finally, we establish a system model based on chaotic encryption with the three-layer architecture of MEC to effectively guarantee the security and privacy of the data during the construction of AI models. The experimental results of these three countermeasures verify the correctness of the conclusion and the feasibility of the methods.


Introduction
In recent years, tens of billions of physical devices have been connected to the Internet by IoT technology, generating a deluge of data on the order of zettabytes [1]. To help humans perceive data from the physical world and make decisions, AI is introduced as an enabling technology to intelligently process this large-scale data and identify complex patterns in it [2]. Meanwhile, the continuously accumulated IoT data and greatly improved computing power have also played an important role in the further innovation of AI technology. Driven by this trend, AI has made substantial breakthroughs in a wide range of fields, such as computer vision [3-5], natural language processing [6,7], autonomous vehicles [8], and robotics [9].
In traditional AI systems, the data bulks from the IoT devices are collected and transferred to the cloud data center for processing and analyzing [10]. Although the reliability of the cloud-centric manner has been proven, large-scale data and long-distance transmission across the network can cause network node congestion and latency issues [11], which results in an inability to cope with time-sensitive AI applications. One promising paradigm of AI computing that mitigates these problems is MEC [12,13], which pushes cloud services from the network core to the network edges closer to IoT devices (data sources) and end-users. Therefore, integrating MEC and AI has become a great trend, which leads to the birth of edge intelligence or edge-enabled AI [14,15].
Edge intelligence allows some lightweight and time-sensitive AI computing tasks to be offloaded from the cloud to edge servers near IoT devices or end-users, while tedious and data-intensive computing tasks remain in the cloud data center [14]. Specifically, Figure 1 illustrates the training and inference/test process of the AI model in the edge environment (in this paper, the terms inference and test are used interchangeably). The original training datasets generated by IoT devices are stored in a cloud database for AI model training in the cloud. To keep the performance of the AI model high, it is essential to retrain and update the model on newly generated incremental data so that it adapts to changes in the data distribution [16]. Thus, the incremental data needs to be continuously transmitted from the IoT devices to the cloud and then aggregated into an incremental dataset for model training. Afterwards, the cloud sends the well-trained AI model to the edge closer to the end-user, where the inference is carried out. In this case, end-users can obtain real-time feedback from the edge. This edge-enabled AI system greatly reduces communication delays in the network and provides better response for end-users than the classic cloud paradigm. Avoiding sending all data to the cloud also reduces the expensive bandwidth required for the connection, which makes connectivity more reliable because the risk of network congestion is lower [12,17].
The advent of MEC may give the impression that edge-enabled AI is flawless. In fact, recent studies show that AI itself encounters several security issues and has proved to be fragile and vulnerable. In particular, malicious attacks on safety-critical AI systems in the physical world can be fatal. For instance, AI video surveillance systems can fail to detect monitored persons who carry simple printed patterns and intentionally attempt to hide themselves [18]. Moreover, autonomous vehicles can be maliciously manipulated by patches with a visually natural appearance [19]. In an automatic speech recognition model, injecting a small amount of noise can tamper with any speech waveform so that it is transcribed as a completely different target phrase [20]. It therefore turns out that most AI systems inherently carry many security risks. At the same time, these AI systems are strongly associated with and empowered by MEC because of their time-sensitive requirements.
Specifically, the development of AI is driven by massive amounts of high-quality data. The quality and security of the collected data directly affect the performance of the AI model and, in turn, threaten the security of the AI applications mentioned above. Recent research has shown that attackers can incorporate specially processed fake data into the incremental dataset to undermine the integrity of the training data [21]. This situation is vividly described as a "poisoning attack." Note that the original data used for model training is less likely to be tampered with because of its confidentiality. For example, Shafahi et al. [22] carried out a clean-label poisoning attack by designing harmful images in the feature space around the target image with the aim of changing the decision boundary of the AI classifier. In addition, attackers can implant hidden backdoor samples into the training data, and the triggered backdoor will induce the model to make mistaken decisions; this is a "backdoor attack." Gu et al. [23] first found that a highly concealed backdoor, which is difficult for anyone except the backdoor generator to perceive, can be embedded in an AI model. It can be seen that the continuous accumulation of large amounts of data provides more opportunities for the injection of malicious data and makes such injection the most direct and effective way to launch attacks during model training, as shown in Figure 1.
In addition, studies have demonstrated that adding imperceptible disturbances to original test samples can deceive well-trained AI models into producing a false inference result with high confidence [24]. Such a sample is known as an adversarial example; it exploits the fact that the design of the AI model does not capture the causality of the data. In particular, the transferable nature of adversarial examples has garnered continued attention from both industry and academia [25]. Performing adversarial attacks by maliciously requesting inference (see Figure 1) can deal a fatal blow to the AI system. Regarding the privacy disclosure of AI models (see Figure 1), a study has shown that it is possible to infer and steal data used for model training even if little is known about the parameters and structure of the model [26]. The excessive collection of personal data by AI applications undoubtedly increases the risk of privacy disclosure.
To mitigate these security threats to the AI system, in this paper, we propose three defense countermeasures against the threats to AI model training, inference, and private data. First, to solve the security challenges that AI faces in the training phase, we implement a cloud-edge collaborative antiattack scheme with a three-step design. This scheme takes full advantage of the proximity of MEC to users to realize reliable incremental updating of AI by ensuring the security of the data generated in the training phase. Moreover, to effectively address the security problem in the inference stage of the AI model, we propose a defense strategy that includes two mechanisms, namely traceability and adaptive punishment. With the help of the edge, the purpose of the strategy is to eliminate attack intent in the inference phase more effectively and radically. Most importantly, to protect the security and privacy of the large amount of training data required for the construction of AI models, we propose a data transmission model based on chaotic encryption technology. This model aims at cutting off the transmission of raw (unencrypted) data and protecting the security and privacy of the data along the propagation path. At present, most existing works defend against only one of the aforementioned attacks; they are discussed separately in Section 2. In our work, a complete solution to the AI security challenge, comprising three schemes, is explored to deal with various potential attacks. Additionally, our motivation for using MEC to empower defense strategies stems from the various advantages that MEC brings to AI itself. Most existing AI security defense strategies have not considered the value of the edge layer in improving defense response speed. In particular, the existing cloud-based defense and data protection strategies are often time-consuming and highly complex.
Finally, it is worth noting that the three schemes we propose demonstrate their originality and effectiveness in system model construction, such as the chaotic encryption technology and the adaptive penalty mechanism-based game model. The main contributions of this paper are summarized as follows:
(1) We develop a cloud-edge collaborative antiattack scheme that includes an edge-filtering unit (EFU) running on the edge gateway, and a cloud data security center (DSC) and a cloud model training center (MTC) running on the cloud server. The experimental results show that, with the help of MEC, reliable incremental updates in the AI training phase can be achieved.
(2) We explore an edge-enhanced defense based on adaptive traceability and punishment against the security threat in the inference stage of the AI model. Its innovation lies in the establishment of an adaptive punishment mechanism based on evolutionary game theory, which can completely suppress the attack behavior and fundamentally eliminate the attack intention by setting reasonable penalty factors. Compared with traditional evolutionary games, the introduction of the adaptive penalty factor makes the outcome independent of the initial proportions of the strategies.
(3) We propose a three-tier architecture that pushes AI to the edge and hands tasks that must be completed by the cloud layer in traditional AI over to the edge layer for assistance. Such an architecture reduces the load on the cloud layer and on network bandwidth and reduces transmission and response delays, so that smart devices perform better. To effectively protect data privacy, a scheme for encrypting the transmitter based on chaotic synchronization is proposed. The scheme has strong applicability and high confidentiality and is an effective means of data privacy protection.

Wireless Communications and Mobile Computing
The rest of this paper is structured as follows. Section 2 goes over diverse defensive strategies against security threats to AI. Section 3, Section 4, and Section 5 describe the three defense systems, respectively, in detail. Section 6 reveals the experimental results corresponding to the three proposed defense systems. In Section 7, our conclusions are presented.

Related Work
In response to the above security issues, many researchers have proposed different solutions. The existing security strategies for poisoning attacks in the model training phase mainly focus on data detection and sanitization techniques. Laishram et al. [27] controlled the training dataset by detecting and filtering malicious training samples during retraining to defend against poisoning attacks. Paudice et al. [28] introduced an anomaly detection model based on a similar-distance filtering mechanism. Steinhardt et al. [29] proposed a defense framework that detects and removes outliers outside the feasible training dataset. As for backdoor attacks, countermeasures based on detector construction are still applicable. Wang et al. [30] demonstrated that their proposed detection and reconstruction system generalizes well against backdoor attacks faced by neural networks. Liu et al. [31] inhibited backdoor neurons by pruning some neurons in the model. Chen et al. [32] detected and removed neurons without the need for any training dataset to ensure that deep neural networks are protected from backdoor attacks. Although the above defense technologies have proved effective to some extent, they still have limitations. Because poisoning and backdoor attacks in the training phase are highly concealed and detection takes a long time, current defense methods face challenges. Furthermore, traditional cloud detector-based defense methods are executed centrally in the cloud and therefore inherit the limitations of the cloud-computing paradigm. Because the cloud lacks up-to-date threat data, its detectors respond too slowly to new types of attacks.
Defense methods against adversarial attacks in the inference phase mainly focus on adversarial training, gradient masking, and input transformation. Szegedy et al. [24] mixed adversarial samples with training data to enhance the robustness of the model while reducing its sensitivity to adversarial samples. However, this method struggles with large-scale training data. Kurakin et al. [33] and Tramer et al. [34] proposed more efficient adversarial training methods to deal with large-scale data. Ross et al. [35] also improved model robustness by smoothing and regularizing the gradient of the model. Meng et al. [36] detected adversarial examples and reformed them towards the manifold of normal examples before the samples entered the model. Existing methods cannot guarantee the removal of all adversarial disturbances in low-complexity tasks. In addition, because adversarial samples are diverse, multiple defense models need to be trained to detect or resist more types of adversarial attacks, so that the defense generalizes to as many adversarial samples as possible. Such an iterative training process greatly increases the time needed to generate a defensive strategy. To avoid these defects, we regard the adversarial attack as a competitive game between end-users who perform model inference. Game models are widely used to solve security-related problems. For example, Sun et al. [37] presented an evolutionary game model to protect the security of user cooperation in fog computing. We believe that the game model can also play an essential role in resisting the adversarial attacks suffered during AI inference.
Privacy protection methods for AI data are mainly based on differential privacy and homomorphic encryption, which are used to encrypt training data and output information [38,39]. Huang et al. [40] proposed a distributed learning algorithm based on the alternating direction method of multipliers, which can be applied to a wider range of distributed learning problems and balances practicability against privacy protection. However, differential privacy technology is less robust and less efficient, and it has difficulty resisting sophisticated attacks. Aono et al. [41] proposed a system that utilizes homomorphic encryption technology to protect data privacy and avoid leaking local data information to curious servers. Homomorphic encryption technology still has certain limitations; for example, it does not support noninteger data [42]. The chaotic model is highly nonlinear and extremely sensitive to initial values [43]. Therefore, chaotic encryption technology is widely used in many fields such as secure communication. To the best of our knowledge, chaotic encryption technology has not been used for data privacy protection in machine learning. Recently, chaotic encryption technology has also been applied to data transmission over the Internet. In particular, Hui et al. [44] proposed a new data transmission scheme that uses the synchronization of fractional-order chaotic models and effectively guarantees the security of data transmission in the industrial IoT. Inspired by this, we believe that chaotic encryption technology can be an effective method for data privacy protection in artificial intelligence.
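The sensitivity to initial values mentioned above can be illustrated with a minimal sketch. It uses the logistic map, a textbook chaotic system chosen here only for illustration; it is not the chaotic model of [43] or [44], and the function name `logistic_orbit`, the parameter `r = 4.0`, and the starting values are assumptions made for this example.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit

# Two starting points that differ by only 1e-10 (illustrative values).
orbit_a = logistic_orbit(0.4, 80)
orbit_b = logistic_orbit(0.4 + 1e-10, 80)

# Absolute gap between the two orbits at every iteration.
gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

print("gap after 5 steps:", gaps[5])
print("largest gap in steps 60-80:", max(gaps[60:]))
```

The two orbits stay close for the first few iterations and then separate completely, which is the property that makes a receiver without the exact initial value unable to reproduce the sequence.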

The Cloud-Edge Collaborative Antiattack Scheme
3.1. Security Issues in the Training Phase.
Training is the core process in the generation of AI models, whose performance is consequently highly dependent on the quality of the training data. The security threats that AI faces during the training phase are mainly attacks on training data. The original data used for model training is extremely confidential and hard to attack. However, the AI is constantly updated incrementally as the application scenario changes. The incremental data used for model updates is vulnerable to attacks, mainly poisoning and backdoor attacks. In poisoning attacks, attackers create malicious samples through malicious tampering and label inversion. As shown in the left half of Figure 2, the label of a normal "cat" sample is inverted to "dog." If these toxic samples are eventually applied to incremental updates, the AI will suffer from a poisoning attack. As a result, the model can easily mistake a "cat" picture for a "dog" during the inference phase. Attackers can thus maliciously interfere with the recognition results of "cat" photos, and the security of the AI model is breached.
Attackers implement hidden damage to the AI model through backdoor implantation in backdoor attacks. As shown in the right half of Figure 2, a row of white pixels is implanted as a backdoor at the bottom of a normal "cat" sample with the label unchanged. The result is that any photos containing the backdoor pixels, such as the "dog" in Figure 2, will be easily recognized as "cat" while the recognition results of normal "cat" and "dog" pictures are not affected. The attacker has implemented a targeted attack on all photos through the backdoor attack.
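As an illustration only, the two sample manipulations described for Figure 2 can be sketched on toy arrays. The class indices `CAT` and `DOG`, the 8 x 8 grayscale stand-in images, and the helper names `flip_label` and `implant_backdoor` are assumptions made for this sketch and do not come from the paper's experiments.

```python
import numpy as np

CAT, DOG = 0, 1  # hypothetical class indices


def flip_label(image, label):
    """Poisoning by label inversion: the pixels are untouched, the label is swapped."""
    return image.copy(), (DOG if label == CAT else CAT)


def implant_backdoor(image, label, target_label=None):
    """Backdoor implantation: overwrite the bottom pixel row with white.

    The label is kept unless the attacker supplies a target label,
    as when a stamped "dog" image is submitted as "cat".
    """
    stamped = image.copy()
    stamped[-1, :] = 1.0  # one row of white pixels acts as the trigger
    return stamped, (label if target_label is None else target_label)


rng = np.random.default_rng(0)
cat_image = rng.random((8, 8)) * 0.5  # stand-in images with values in [0, 0.5)
dog_image = rng.random((8, 8)) * 0.5

poisoned = flip_label(cat_image, CAT)                  # "cat" relabeled as "dog"
backdoor_train = implant_backdoor(cat_image, CAT)      # stamped "cat", label unchanged
backdoor_trigger = implant_backdoor(dog_image, DOG, target_label=CAT)
```

In both cases the original arrays are copied rather than modified, so clean and manipulated versions can be compared side by side.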
To cope with the security threats that AI faces during the training phase, this paper proposes a cloud-edge collaborative antiattack scheme that detects training attacks by setting traps. The security scheme mainly includes an EFU, a DSC, and an MTC. Unknown samples are added to the incremental data by default, and a pending model is incrementally updated with them. The pending model is deployed to the edge to identify and defend against malicious attacks by comparing its results with those of the old model.

System Model.
We propose a cloud-edge collaborative antiattack scheme in this section, as shown in Figure 3. The cloud and the edge cooperate to defend against the backdoor and poisoning attacks that AI faces in the training phase. The cloud consists of two modules, a data security center and a model training center. Each edge server is deployed with an edge-filtering unit. First, the edge-filtering unit detects known attacks and screens for suspicious unknown threats, which are uploaded to the data security center to be identified. Then, newly identified attacks are used to retrain the cloud detector, which in turn updates the edge-filtering unit. Finally, the cloud model training center trains the secure AI model and the pending AI model.

The Edge-Filtering Unit.
The EFU is set at the edge of the IoT as the first line of defense against AI training attacks. It is used for preliminary filtering of poisoning and backdoor attacks. Because the EFU is near the attack site, it can filter out known threats at the initial data source, which reduces unnecessary bandwidth consumption and relieves the defense pressure on the cloud security center. More importantly, it can also perform a preliminary screening of unknown threats to further improve the antiattack ability of the AI defense system. The EFU includes two modules, the known hostile data detection (KHDD) and the unknown hostile data filtering (UHDD). The ample computing power of MEC enables the KHDD to filter known threats in the edge area based on detectors deployed from the cloud DSC. The core of the UHDD consists of the active and the pending AI models running simultaneously. The principle is to screen suspicious threats based on the difference between the two models' judgments of the same unknown data. The pending model may have been attacked because it is trained on all the data uploaded from the edge during the pending period, unknown hostile data included. The active model is not fresh but safe because it is trained on trusted data from before the pending period and officially released by the cloud. The recognition results of the AI models are divided into high-confidence and low-confidence according to a given confidence threshold. An unknown sample is judged to be a new suspicious threat if its output is a high-confidence error in the active model and a low-confidence correct result in the pending model. New suspicious threat samples are then uploaded to the cloud DSC for the final threat identification. An unknown sample is judged to be temporary trusted data if its output is a low-confidence error in the active model and a high-confidence correct result in the pending model. Temporary trusted samples are then uploaded to the cloud MTC for the training of pending models.
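The UHDD screening rule can be written down as a small decision function. This is a minimal sketch under stated assumptions: the `Prediction` type, the function name `screen_unknown_sample`, the result strings, and the threshold value 0.8 are all hypothetical, and "error" and "correct" are taken relative to the label that arrives with the sample. Combinations that the text does not cover are returned as `"unspecified"` rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    label: str         # class predicted by a model
    confidence: float  # confidence of that prediction, in [0, 1]


def screen_unknown_sample(sample_label, active, pending, threshold=0.8):
    """Classify an unknown sample from the outputs of the active and pending models."""
    active_right = active.label == sample_label
    pending_right = pending.label == sample_label
    active_high = active.confidence >= threshold
    pending_high = pending.confidence >= threshold

    # High-confidence error (active) + low-confidence correct (pending).
    if not active_right and active_high and pending_right and not pending_high:
        return "suspicious_threat"   # upload to the cloud DSC
    # Low-confidence error (active) + high-confidence correct (pending).
    if not active_right and not active_high and pending_right and pending_high:
        return "temporary_trusted"   # upload to the cloud MTC
    return "unspecified"             # case not described in the text


# A "cat" image whose label was inverted to "dog" (illustrative confidences).
poison_case = screen_unknown_sample(
    "dog", Prediction("cat", 0.95), Prediction("dog", 0.55)
)
# A sample the active model gets wrong with low confidence (illustrative).
drift_case = screen_unknown_sample(
    "cat", Prediction("dog", 0.40), Prediction("cat", 0.90)
)
print(poison_case, drift_case)
```

Keeping the threshold as a parameter reflects that the text only speaks of a "given confidence threshold" without fixing its value.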

The Cloud Data Security Center.
The cloud data security center has two main purposes: the final determination of suspicious threats and the generation of threat detectors. Drawing on its powerful computing resources, the cloud achieves authoritative identification of suspicious threats by combining multiple identification methods. If a new suspected threat sample is determined to be an attack, it is saved to the hostile database for updating the detector. The updated detector is first deployed to the EFU for the detection and prevention of known threats. In addition, it is used to refilter temporary trusted data to form a periodic secure dataset in the cloud MTC.

The Cloud Model Training Center.
The cloud model training center is used for the training of all AI models, including the secure model and the pending model. The pending model is trained on a temporary trusted database. This database is tested and filtered by the data security center over a period of time to form a periodic secure database, from which the secure model is generated.
For the poisoning attack shown in Figure 2, the actual content of the malicious image is a cat, and the old active model will recognize it as a "cat" with high confidence. Because this result does not match the sample label, it is classified as a high-confidence error. At the same time, the pending model, which has been trained on the same toxic sample, will output the result that matches the label, "dog." The pending phase covers only the initial stage of the attack, when attack samples are short-term and limited, so the confidence of this result from the pending model will not be very high. Therefore, the poisoned sample will be identified as suspicious attack data in the edge-filtering unit and cannot be used to update the final secure model. For the backdoor attack shown in Figure 2, the backdoor "cat" sample with the white-pixel stripe will be considered temporary trusted data for the training of the pending model in the early stage of the attack. After a period of backdoor implantation, the attacker submits the "dog" image (labeled as "cat") with the same white-pixel stripe to activate the backdoor. The old active model will recognize it as a "dog," which is a high-confidence error with respect to the label. The pending model will recognize it as a "cat," which is a low-confidence correct result. The backdoor sample will therefore be judged to be a suspicious threat and added to the hostile database. As the number of attacks increases, the detector will gradually become able to identify the backdoor (the white-pixel stripe). Attack data with this backdoor characteristic in the temporary trusted database will be filtered out in the incremental update of the periodic secure database.
Figure 3: The cloud-edge collaborative antiattack scheme in the training phase of AI.

An Edge-Enhanced Defense Based on Adaptive Traceability and Punishment
4.1. Security Issues in the Inference Phase.
Inference refers to the stage in which a well-trained AI model is used to classify or predict real-world samples; it occurs after training.
Although state-of-the-art AI inference models are available, they suffer from the security threat of adversarial attacks. Malicious attackers use adversarially tampered test samples to deceive the AI model into producing an erroneous output with high probability, causing the model to behave incorrectly and thereby achieving the purpose of attacking the AI system [24]. Such an adversarial example is generated by carefully perturbing a normal sample (e.g., by adding an adversarial perturbation or imperceptibly small noise) so that the human eye cannot discern the difference between the adversarial sample and the original sample, yet the model is deceived. As shown in Figure 4, the AI model correctly classified the cat image as "cat" with lower confidence, but after the carefully constructed noise was added, the AI model identified it as "dog" with high confidence, although both pictures clearly show a cat. Besides the tampering being hard to perceive, most existing AI models are vulnerable to adversarial examples [45,46]. There exist "universal perturbations" such that samples transformed by them transfer across models with different architectures or models trained on different training sets [25,47]. This "transferability" allows attackers to trick the AI system in a "black box attack" [48].
To cope with the security threats that AI faces during the inference phase, this paper proposes an edge-enhanced adversarial defense based on adaptive traceability and punishment (EADATP).

System Model.
In this section, taking full advantage of MEC, the EADATP traces test adversaries in a timely and accurate manner and imposes severe punishments, to achieve a comprehensive, high-efficiency, low-energy-consuming defense against inference attacks via adversarial examples. Because of its adaptive adjustment ability, the defense system is more practical in the real world. In the following, the implementation of this defense strategy is explained in detail in terms of the construction of the traceability technology and the punishment mechanism.
After a user sends a test request to the edge server, according to the testing process, the EADATP defense framework will respond in two stages, as shown in Figure 5.
Stage 1. Before the formal testing process, the user first needs to perform identity authentication, and a unique user ID is generated at the same time. After that, the user is asked to sign a contract with legal effect and either accepts or rejects its terms. If the contract is accepted, the test sample can enter the testing process; otherwise, the edge server will refuse to provide testing services for the user. The terms of the contract state that users must comply with the test security provisions and that users will be penalized if a security incident occurs. The punishment mechanism is formulated and published by the test platform and is explained in detail later.
Stage 2. When the sample enters the testing phase, the database deployed at the MEC layer will store the user's ID and authentication information which is used to trace the source of the malicious user after the accident. After the sample passes the test model, the user will receive the prediction result if TRUE is returned; otherwise, the traceability mechanism is required to take effect because the returned FALSE result indicates that an adversarial attack has occurred. After the traceability mechanism takes effect, the edge server sends a tracing instruction to the database, and the user can be accurately traced according to the authentication information. After that, the penalty clause in the contract is executed against the user, and the user must accept the penalty.
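The two-stage response can be sketched as a small request handler. This is a simplified illustration, not the EADATP implementation: the class name `EdgeInferencePlatform`, the callables `predict` and `looks_adversarial` (standing in for the test model and for whatever check yields the TRUE/FALSE result), the fixed `penalty` amount, and the dictionary-based database are all assumptions made for this sketch.

```python
import itertools


class EdgeInferencePlatform:
    def __init__(self, predict, looks_adversarial, penalty):
        self.predict = predict                      # hypothetical test model
        self.looks_adversarial = looks_adversarial  # hypothetical attack check
        self.penalty = penalty                      # hypothetical fixed penalty
        self.database = {}    # MEC-layer store: user ID -> authentication info
        self.penalties = {}   # user ID -> accumulated penalty
        self._next_id = itertools.count(1)

    def handle_request(self, auth_info, accepts_contract, sample):
        # Stage 1: authenticate, issue a unique ID, and check the contract.
        user_id = next(self._next_id)
        if not accepts_contract:
            return {"user_id": user_id, "status": "refused"}

        # Stage 2: store the ID and authentication info for later tracing.
        self.database[user_id] = auth_info
        if not self.looks_adversarial(sample):      # TRUE branch in the text
            return {"user_id": user_id, "status": "ok",
                    "prediction": self.predict(sample)}

        # FALSE branch: trace the user and execute the penalty clause.
        traced = self.database[user_id]
        self.penalties[user_id] = self.penalties.get(user_id, 0.0) + self.penalty
        return {"user_id": user_id, "status": "penalized", "traced": traced}


platform = EdgeInferencePlatform(
    predict=lambda sample: "cat",
    looks_adversarial=lambda sample: sample.get("tampered", False),
    penalty=10.0,
)
honest = platform.handle_request({"email": "a@example.com"}, True, {"tampered": False})
refused = platform.handle_request({"email": "b@example.com"}, False, {"tampered": False})
attacker = platform.handle_request({"email": "m@example.com"}, True, {"tampered": True})
```

The sketch issues a new ID on every request for brevity; a real deployment would presumably authenticate once and reuse the ID across requests.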
In the following, we consider two enabling technologies mentioned above with respect to implementing a defense scheme.

Authentication-Enhanced Traceability Technology.
Users who forge or hide their IP addresses, for example by using a virtual private network (VPN), are difficult to trace with traditional traceability techniques. Therefore, we introduce an authentication-enhanced traceability technology: users accessing the edge platform for the first time need to bind their mailbox, mobile phone number, and bank account as authentication information. Even if a malicious user logs in to the test platform with a hidden IP address, the technology can still find the attacker based on the path provided by the authentication information.

Adaptive Evolutionary-Based Punishment Mechanism.
To complete the penalty clause in the contract, an adaptive punishment mechanism based on evolutionary game theory [49] is established. It is formulated by the edge-based inference platform so that the benefits of a user with attack behavior, after being punished, are less than those of a user with normal behavior. This is a dynamic game process in which users constantly adjust their strategies according to their current interests in pursuit of higher returns. Furthermore, the test platform adaptively adjusts the punishment intensity to achieve a more reasonable punishment mechanism design. Ultimately, this punishment strategy aims to reduce the motivation for the attack and fundamentally force the user to stop attacking.
Figure 5: An edge-enhanced defense based on adaptive traceability and punishment.
The premise of evolutionary game theory is that participants have limited rationality rather than complete rationality. This gives the mathematically based theory practical significance, because in an actual test case, users' decisions on whether to attack are affected by irrational impulses, emotions, and other aspects of limited rationality, rather than being restricted to a few feasible strategy options formulated in advance. Based on the assumption of limited rationality, and taking the Evolutionary Stable Strategy (ESS) [50] as the basic equilibrium concept and the Replicator Dynamics (RD) [51] as the core, we study the effects of different punishment conditions on the user's choice of behavior.
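For readers unfamiliar with replicator dynamics, the following minimal sketch integrates the standard two-strategy replicator equation dp/dt = p (U1 − Ū), with Ū = p U1 + (1 − p) U2, where p is the share of users choosing normal behavior. The payoffs are hypothetical constants chosen for illustration: `gain_normal`, `gain_attack`, and `penalty` are assumptions, the penalty is fixed rather than adaptive, and the sketch is not the payoff model developed in this paper.

```python
def replicator_step(p, payoff_normal, payoff_attack, dt=0.01):
    """One Euler step of dp/dt = p * (U1 - U_avg) for the share p of normal users."""
    average = p * payoff_normal + (1.0 - p) * payoff_attack
    return p + dt * p * (payoff_normal - average)


def simulate(p0, gain_normal, gain_attack, penalty, steps=5000, dt=0.01):
    """Evolve the share of normal users when attackers lose `penalty` from their gain."""
    p = p0
    for _ in range(steps):
        p = replicator_step(p, gain_normal, gain_attack - penalty, dt)
    return p


# Hypothetical payoffs: attacking pays 3.0, behaving normally pays 1.0.
without_penalty = simulate(p0=0.8, gain_normal=1.0, gain_attack=3.0, penalty=0.0)
with_penalty = simulate(p0=0.2, gain_normal=1.0, gain_attack=3.0, penalty=4.0)
print(without_penalty, with_penalty)
```

With these illustrative numbers, the share of normal users collapses when attacking is more profitable and grows towards 1 once the penalty pushes the attacker's net payoff below that of normal behavior.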
(1) Game Model Establishment. Based on the principle of evolutionary games, the basic framework of the model is given below (see Figure 6), which is based on the following assumptions.
(1) Game Players. The two parties participating in the game are denoted as user 1 and user 2, who will request the test service, as shown in Figure 6. We suppose that both parties have a relationship of interest and that individual users are equal and independent. In addition, users are affected by limited rationality, such that they are able to perform statistical analysis and to determine the benefits of different strategies. The strategies mentioned here are elaborated in Assumption 2.
(2) Strategy Selection. Since the behavior of each population can be regarded as a strategy, both user 1 and user 2 are supposed to have two strategies, namely normal behavior and attack behavior, recorded as Action 1 and Action 2, respectively, as illustrated in Figure 6. In the initial stage of the game between the two parties, the probability of the user choosing Action 1 is $p$ ($0 < p < 1$), and the probability of choosing Action 2 is $1 - p$. We assume that the proportion of users who choose a particular behavior equals the probability that a single user chooses this behavior, so that each individual of the entire user group chooses Action 1 and Action 2 with probability $p$ and $1 - p$, respectively.
(3) Consumption Cost of Users. Users requesting a test service from the edge server need to pay a certain test fee $M$, that is, the consumption cost of the user, which is independent of the behavior strategy selected by the user. The cost is determined by the size of the test sample, because the test process at the MEC requires a large amount of data communication and therefore consumes a large amount of bandwidth. The larger the test sample, the more bandwidth is consumed and the higher the data communication overhead, so the more expensive it is for users to use the platform for testing.
(4) Revenue Generated by Users. If the user adopts a proper test behavior, the edge server returns the correct test result to the user, which means that the AI system is successfully applied, and the user indirectly obtains the revenue $A$ from the outside. The outside refers to the subject that assigns the test task, with which the user has a contractual relationship. If the user takes an offensive action, he obtains improper benefits $E$ from the attack.
(5) Penalties and Rewards for Users. If the user attacks, the platform punishes the user by limiting the user's test service and bandwidth, which indirectly results in economic losses [37]. We suppose that the penalty term is linear in the user's test consumption cost $M$, that is, $D(M) = \alpha M$, where $\alpha$ is the penalty factor and is greater than 0, which is more realistic. The severity of the penalty is thus determined by the cost of the test. If the user chooses a proper behavior, more bandwidth is granted, which yields a financial reward $R$.
Based on the proposed model framework and related assumptions, the payoff matrix of two user populations is shown in Table 1. The connotation of each parameter in the payoff matrix is illustrated in Table 2.
(2) Equilibrium Analysis of the Evolutionary. Based on the above assumptions and the user's payoff matrix, referring to the ESS (see [50]), the total (average) expected returns $\mu$ of the population user 1 that adopts the two strategies can be mathematically represented as

$$\mu = p\mu_1 + (1 - p)\mu_2, \quad (1)$$

where $\mu_1$ is the expected profit when user 1 selects normal behavior and $\mu_2$ is the expected profit when user 1 performs an attack. The formulae for $\mu_1$ and $\mu_2$ follow from the payoff matrix in Table 1, and substituting them into Eq. (1) gives the total expected returns. Referring to the RD theorem (see [51]), the Replicator Dynamics equation of the proportion $p$ for user 1 is

$$F(p) = \frac{dp}{dt} = p(\mu_1 - \mu) = p(1 - p)(\mu_1 - \mu_2). \quad (4)$$

According to the conditions satisfied by the ESS, setting Eq. (4) equal to zero gives the possible equilibrium points of the RD system: $p_1^* = 0$, $p_2^* = 1$, and $p_3^* = (A + R - E + \alpha M)/(R - k\alpha M + \alpha M)$.

Figure 6: User behavior game model.
Since $p_1^*$ and $p_2^*$ are fixed and $p_3^*$ is uncertain, $p_3^*$ needs to be discussed case by case to determine the final evolutionary stable strategy. More importantly, the punishment strategy is the object of this discussion, because the goal is to formulate a reasonable punishment value so that all users reach the state of nonattack behavior as quickly as possible. It is worth noting that the profit from an action is the criterion that drives behavioral evolution. The following three cases need to be discussed.
Case 1. When $p_3^* < 0$, $0 < A + R < E - \alpha M$ can be deduced, and $0 < \alpha < (E - A - R)/M$ can be further obtained. These formulae show that the returns of a user who is punished for an attack are still greater than those of normal behavior, which means that the punishment is light. Since $F'(p = 0) < 0$ and $F'(p = 1) > 0$, $p_1^*$ is the user's evolutionary stable point.
Case 2. When $0 \le p_3^* \le 1$, $A + R \ge E - \alpha M$ and $E - k\alpha M \ge A$ can be deduced, and $(E - A - R)/M \le \alpha \le (E - A)/(kM)$ can be further obtained. These formulae show that, owing to the moderate penalties, the user's returns from an attack are smaller than those from normal behavior. Because $F'(p = 0) \ge 0$ and $F'(p = 1) \ge 0$, the evolution finally reaches a stable point at $p_3^*$.
Case 3. When $p_3^* > 1$, $A > E - k\alpha M$ can be deduced, and $\alpha > (E - A)/(kM)$ can be further obtained. This formula shows that the punishment imposed on a user who attacks is heavier than that in Case 2. Such users not only obtain small gains but also lose their costs. Because $F'(p = 0) > 0$ and $F'(p = 1) < 0$, the evolution finally reaches a stable point at $p_2^*$.
According to the above analysis, the punishment mechanism adopted by the test platform achieves the desired effect when the user behavior evolution game system reaches the stable point $p_2^*$, that is, when all users adopt normal test behavior. The penalty factor $\alpha$ provides an intuitive measure of the penalties that the platform should apply. It has three ranges of values, corresponding to the three cases above. As $\alpha$ increases, the user's revenue from attacks continues to decrease. Owing to this punishment mechanism, users with malicious behaviors will stop attacking through the process of evolutionary learning.
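The three cases reduce to two thresholds on the penalty factor. The following minimal Python sketch illustrates this under stated assumptions: the function names are hypothetical, the parameter values are purely illustrative, and $k$ is the coefficient that appears in the expression for $p_3^*$ but is not defined in this section, so the value 0.6 used here is an assumption.

```python
def penalty_thresholds(A, R, E, M, k):
    """Boundaries on the penalty factor alpha that separate Cases 1-3."""
    lower = (E - A - R) / M      # alpha below this value -> Case 1
    upper = (E - A) / (k * M)    # alpha above this value -> Case 3
    return lower, upper


def classify_penalty(alpha, A, R, E, M, k):
    """Return (case label, stable proportion of normal-behavior users)."""
    lower, upper = penalty_thresholds(A, R, E, M, k)
    if alpha < lower:
        return "Case 1", 0.0     # p1* = 0: all users end up attacking
    if alpha > upper:
        return "Case 3", 1.0     # p2* = 1: all users behave normally
    # Interior equilibrium p3*, using the expression given in the text.
    p3 = (A + R - E + alpha * M) / (R - k * alpha * M + alpha * M)
    return "Case 2", p3


if __name__ == "__main__":
    # Illustrative (assumed) parameter values, not taken from Table 2.
    for alpha in (0.5, 2.5, 5.0):
        print(alpha, classify_penalty(alpha, A=3, R=2, E=10, M=3, k=0.6))
```

A platform could use such a check to see which regime a candidate penalty factor falls into before applying it.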
Because prior knowledge of the penalty factor is lacking, it is difficult to determine a reasonable and appropriate value for it. Although the above analysis shows that a larger penalty factor is more conducive to the evolution of the entire game system toward the stable point $p_2^*$, an excessive penalty factor makes users' expected returns too low. Therefore, to choose a more reasonable penalty factor and avoid blindly adopting an excessively large one, the self-adaptive penalty factor $\alpha_s$ is introduced in place of $\alpha$ in Eq. (4). The self-adaptive penalty factor is formulated in Eq. (5), where $c$ ($0 < c < 1$) is a constant and $\beta = p(t_i)$, $i = 1, 2, 3, \cdots, n$.
Let $p(t)$ be the implicit function (solution) of Eq. (4), and let $p(t_i)$ be its value at $t = t_i$ ($i = 1, 2, 3, \cdots, n$). Then $\alpha_s$ is adaptively adjusted as the value of the implicit function changes over the time domain $T$. Additionally, in Eq. (5), let $L = \ln(\beta(e - 1) + 1)$. Referring to Taylor's theorem [52], $L$ can also be expanded as

$$L = \sum_{j=1}^{n} \frac{(-1)^{j+1}}{j}\,[(e - 1)\beta]^{j} + R_{n+1}[(e - 1)\beta], \quad (6)$$

where $R_{n+1}[(e - 1)\beta]$ is the Taylor remainder, which can be extremely close to zero as $n$ approaches infinity. Therefore, Eq. (5) can be transformed into Eq. (7), where the harmonic series $A$ based on Taylor's expansion is used to limit the value of $\alpha_s$ within a reasonable range. By setting different $n$, the constraint ability of $A$ can be adjusted to meet the needs of various evolutionary game models.
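To make the role of $L$ concrete, the sketch below evaluates $L = \ln(\beta(e - 1) + 1)$ and compares it with the truncated Taylor (Maclaurin) sum in $(e - 1)\beta$. It is only an illustration: the function names are hypothetical, and the full expression for $\alpha_s$ is not implemented because it is not reproduced in this section.

```python
import math


def log_map(beta):
    """L = ln(beta * (e - 1) + 1); maps beta in [0, 1] onto [0, 1]."""
    return math.log(beta * (math.e - 1.0) + 1.0)


def log_map_series(beta, n):
    """n-term Maclaurin partial sum of ln(1 + x) with x = (e - 1) * beta."""
    x = (math.e - 1.0) * beta
    return sum((-1) ** (j + 1) * x ** j / j for j in range(1, n + 1))


if __name__ == "__main__":
    beta = 0.3  # example proportion of normal-behavior users (assumed value)
    for n in (2, 5, 10, 40):
        # Truncation error shrinks as more terms are kept.
        print(n, abs(log_map(beta) - log_map_series(beta, n)))
```

Note that the Maclaurin series of $\ln(1 + x)$ converges only for $-1 < x \le 1$, so the comparison is made here for $\beta$ below $1/(e - 1) \approx 0.58$; the truncation order $n$ then controls how tightly the partial sum tracks $L$.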

Description of Problem.
While the era of big data brings great convenience to users, it also brings many challenges. For example, users face the risk of data privacy disclosure while enjoying the service [53]. The application of AI technology depends on big data, so the leakage of private data is also one of the significant challenges facing AI security. Homomorphic encryption technology is an effective way to protect the private data of users. In federated learning, users encrypt the key parameters of the data by using homomorphic encryption technology and send them to the cloud server to protect their private data. The cloud server only aggregates the encrypted data packets with an aggregation algorithm to update the model parameters and then delivers the new model to each edge node [54]. However, homomorphic encryption and deep learning both consume a lot of computing resources, so combining them greatly increases the time for network training and inference [55]. Weighing the robustness of privacy protection against the efficiency of deep learning, we choose chaotic encryption technology.

System Model.
In this subsection, we propose a data privacy protection scheme based on chaotic encryption technology in IoT, as shown in Figure 7. The user layer is the source of the data, which is collected in real time by a large number of devices. To protect the data privacy of users, the user equipment is equipped with a chaotic encryption device, which encrypts the original data and sends it to the edge device layer. The MEC is equipped with the corresponding chaotic decryption devices and has the right to decrypt user data. The cloud accepts the encrypted data packets forwarded by the MEC but has no authority to decrypt them. It can only aggregate the large number of encrypted packets with an aggregation algorithm to update the parameters of the model and then send the new model down to the MEC. The data trained in the cloud is encrypted data, which effectively prevents the leakage of private data. The cloud can obtain a large amount of data from different edge nodes, which allows it to fully optimize the parameters of the training model and make the model efficient and accurate.

Data Transmission Scheme Based on Chaotic Encryption.
We introduce two classical fractional-order chaotic models as the chaotic transmitter of the user layer and the chaotic receiver of the MEC layer, respectively. The fractional-order Liu's model is described by model (8) (see [56] and the references therein), where $\alpha$ is the derivative order. It has been shown that model (8) exhibits chaotic behavior when $\alpha > 0.916$. We suppose that the uncertainty $\Delta f_i$ and the external disturbance $d_i$ are bounded, that is, $|\Delta f_i| < \eta_i$ and $|d_i| < \kappa_i$, where $\eta_i$ and $\kappa_i$ are positive constants ($i = 1, 2, 3$). In this paper, the fractional-order Liu's model with the uncertainty $\Delta f_i$ and the external disturbance $d_i$ is described by model (9). The fractional-order Newton-Leipnik's model follows [56], with derivative order $\beta$, model parameters $a$ and $b$, and controllers $u_{ij}$ ($i = 1, 2, 3$, $j = 1, 2$). The uncontrolled Newton-Leipnik's model (10) displays chaotic behavior. Liu's model is used in the sending node to generate the chaotic signals $x_1(t)$, $x_2(t)$, and $x_3(t)$, and Newton-Leipnik's model is used in the receiving node to generate the chaotic signals $y_1(t)$, $y_2(t)$, and $y_3(t)$.

Synchronization between the Drive Model and the Response Model.
We first define the error model of drive model (9) and response model (11) and rewrite both models in vector form.
Assumption 1. It is assumed that the master model and slave model uncertainties $\Delta f(X)$ and $\Delta g(Y)$ and the external disturbances $d_{mi}(t)$ and $d_{si}(t)$ are bounded.
Based on the above assumption, we obtain the following theorem.
Theorem 1. Under Assumption 1 and the feedback controller $u(X, Y)$, the drive model (9) and the response model (11) are synchronized, where $u(X, Y) = u_1(X, Y) + u_2(X, Y)$ and $u_1(X, Y)$ is given by (18).

Proof. Taking the feedback controller $u_1(X, Y)$ of the response model (11) as in (18), we obtain the error model (21) of model (9) and model (11), where $u_2(X, Y) = (u_{12}, u_{22}, u_{32})^T$. The sliding surface for the error model is given by (22), where $\chi_i$ are positive constants, and its first-order derivative is given by (23). To estimate the unknown controller parameters, the update laws (24) are derived, where $\hat{q}_i$ and $\hat{\delta}_i$ are the actual values of $q_i$ and $\delta_i$, and $\gamma_i$ and $\omega_i$ are positive constants. Consider the positive definite Lyapunov function $V(t) = V_1(t) + V_2(t) + V_3(t)$, where $V_i(t)$ is given by (25). Taking the derivative of both sides of (25) with respect to time gives (26). Substituting (23) and (24) into (26), applying the control input $u_{12}(t)$, and using Assumption 1, we obtain the corresponding result for $V_1(t)$. Using the other controllers in the same way and repeating the same calculation steps, we obtain the analogous results for the remaining components. By the Lyapunov stability theory, the trajectories of the synchronization error (21) converge to the proposed sliding surface $s_i(t) = 0$.
Theorem 1 proves that the drive model (9) and the response model (11) are synchronized, which is a necessary condition for chaotic encryption. Note that model (9) and model (11) have different orders $\alpha$ and $\beta$, which is designed to enhance confidentiality. In particular, the initial conditions of model (9) and model (11), together with the orders $\alpha$ and $\beta$, can be used as keys for chaotic encryption schemes. In addition, the effects of uncertainties and external disturbances on the synchronization of model (9) and model (11) are considered. This means that when the models are subject to internal or external random interference with uncertain factors, the synchronization between the two models can still be achieved stably. Our scheme not only improves the security performance of the chaotic encrypted transmission scheme but also enhances its stability and anti-interference capability and expands its applicability.

Experimental Results
In this section, we verify the theoretical results by simulation. The experimental part contains three main scenarios. First, we verify the effectiveness of the antiattack scheme in the training phase against poisoning attacks and backdoor attacks, respectively. Second, considering the adjustment of different penalty factors and the introduction of adaptive penalty factors, we simulate how the penalty mechanism affects the choice of user behavior based on ESS to verify the defense against adversarial attacks in the inference phase. Finally, we show the chaotic behavior of the chaotic model (8) and the synchronization between the drive model (9) and the response model (11), and use a continuous data signal as an example to verify the validity of the encryption and decryption of the system model.
Next, we will analyze these defense issues in detail.
Scenario 1. Experiments on proposed defense strategies against threats in AI training.
In our proposed scheme, there are three core components, the EFU running on the edge gateway, the DSC and MTC running on the cloud server. We implement the edge gateway using a Raspberry Pi 3 Model B+ (1.4 GHz CPU and 1 GB RAM) running Raspbian Stretch with desktop. IoT devices can be connected to the edge gateway via a wired serial peripheral interface (SPI) with Modbus Protocol or wireless interface, including Bluetooth Low Energy (BLE) and WiFi. The edge storage is extended by a 1 TB network attached storage (NAS) also based on a Raspberry Pi 3 Model B+. The cloud server is equipped with NVIDIA TITAN X GPU (NVIDIA Digits Dev Box) and Intel i7-9700K CPU (3.8 GHz). Both the edge gateway and the cloud server are connected to the Internet. Edge gateway and edge storage are locally connected over a wireless LAN.
To verify the effectiveness of the antiattack scheme in the training phase against poisoning attacks, we adopted and segmented the Dogs vs. Cats Dataset [57]. The basic secure training set was composed of 10000 images and was used to generate an active basic model. The remaining 15000 images formed 30 incremental datasets for incremental training of the model. Then, poisoning samples were randomly added into the incremental datasets to generate toxic incremental datasets. The same method was used to make a toxic test dataset, and the original test dataset was retained for comparison.
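The data preparation described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it works on sample indices only, assumes the 30 incremental datasets are of equal size (500 images each), and uses a hypothetical poisoning rate of 10% and hypothetical function names, none of which are stated in the text.

```python
import random


def split_incremental(n_total=25000, n_base=10000, n_increments=30, seed=0):
    """Shuffle sample indices and split them into a base set plus equal increments."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    base, rest = indices[:n_base], indices[n_base:]
    size = len(rest) // n_increments
    increments = [rest[i * size:(i + 1) * size] for i in range(n_increments)]
    return base, increments


def inject_poison(increment, poison_ids, rate, rng):
    """Return a toxic copy of an increment with randomly chosen poisoning samples.

    rate is the number of added poison samples relative to the clean size
    (an assumed convention).
    """
    n_poison = int(len(increment) * rate)
    return increment + rng.sample(poison_ids, n_poison)


if __name__ == "__main__":
    base, increments = split_incremental()
    poison_pool = [("poison", i) for i in range(1000)]  # placeholder poison IDs
    rng = random.Random(1)
    toxic = [inject_poison(inc, poison_pool, 0.1, rng) for inc in increments]
    print(len(base), len(increments), len(toxic[0]))
```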
We compare the experimental results of the antiattack scheme proposed in this work with those of the nondefense scheme, as shown in Figure 8. Under the nondefense situation, the model continuously learns from toxic samples and starts to adapt to poisoning attacks as the number of incremental iterations increases, so that the test error on the toxic test set decreases from 67% to 4.1%. At the same time, the ability of the attacked model to recognize the original data decreases, and the test error on the original test set increases from 1.8% to 13.94%. This shows that the AI model was successfully attacked by poisoning attacks. In the experiments that adopted the antiattack scheme, the test error on the toxic test set also decreased gradually from 67% as the model learned from normal incremental data, but the final test error only fell to 18.4%, and the test results on the original test set were not affected. This shows that our antiattack scheme can effectively prevent poisoning attacks. Figure 9 shows that the poisoning attack success rate on the nondefense model is 85.45%, while it is only 2.6% on the antiattack model, which also supports this conclusion. During the antiattack process, the attack success rate initially rises slightly to 17.4%, because few attack samples are available in the initial stage and the scheme's knowledge of the poison is limited. As the scheme's ability to capture the attack improves, the attack success rate declines to 2.5%.
To verify the effectiveness of the antiattack scheme against backdoor attacks, we randomly added backdoor samples to the segmented incremental datasets and the test dataset. The experimental results of the antiattack and nondefense schemes against backdoor attacks are shown in Figure 10. The test error of the nondefense model on the backdoor test set decreases from 56% to 5.2% as the number of incremental updating iterations increases, whereas that of the antiattack model decreases from 56% to only 26.5%. The test errors of the two models on the original test set are not affected, because the backdoor attack is more concealed. This shows that the undefended AI model was successfully attacked by the backdoor attack and that the scheme proposed in this work effectively defends against it. Figure 11 shows that the success rate of backdoor attacks on the nondefense model is 91.4%, while that on the antiattack model is only 3.8%, which also supports this conclusion. Compared with the evolution of the poisoning attack success rate, poisoning attacks take effect sooner: the attacked model begins to show symptoms of being attacked early. The backdoor attack has a significant latency period (no obvious symptoms of being attacked were seen during the first 10 iterations), but the attack intensity after awakening is stronger.
Scenario 2. Experiments on proposed defense strategies against threats in AI inference.
In this scenario, we will carry out a more in-depth simulation analysis to verify the effectiveness and correctness of an adaptive evolutionary-based punishment mechanism against adversarial attacks in AI inference via ESS Analysis with diverse penalty factor α and adaptive penalty factor α s .
In Eq. (4), we set $A = 3$, $R = 2$, $E = 10$, and $M = 3$ as fixed values and adjust the value of the penalty factor $\alpha$. The initial proportion of users who adopt a nonattack strategy is 0.2, 0.4, 0.6, and 0.8, respectively. Figure 12 simulates the effects of different penalty factors on the stability of user behavior evolution. $\alpha$ has different value ranges, that is, $0 < \alpha < 1.67$ in Case 1 (light green curves), $1.67 \le \alpha \le 3.89$ in Case 2 (dark blue curves), and $\alpha > 3.89$ in Case 3 (red curves and a black curve), corresponding to lighter, moderate, and heavier punishment, respectively. When the punishment is light ($\alpha = 0.5$), after a period of evolution, the light green curves eventually stabilize at $p_1^*$, so that all test users eventually choose the attack behavior. With the increase of the penalty factor ($\alpha = 1, 1.5$), the evolution speed of the game is suppressed, and adversarial behavior can be combated to a certain extent. When moderate punishment is applied to users ($\alpha = 2, 2.5, 3.5$), although the evolution speed cannot be greatly improved as the penalty factor increases, more users are ultimately willing to choose normal test behavior, and the proportion of such users increases significantly. When greater penalties are imposed on users ($\alpha = 4, 5, 6, 10, 50$), nearly one hundred percent of users choose proper conduct, and as $\alpha$ increases, the stable point $p_2^*$ is approached faster and faster.
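The qualitative behavior described for Figure 12 can be reproduced with a short simulation. The sketch below is built on assumptions that should be read carefully: the closed form $F(p) = p(1 - p)[(A + R - E + \alpha M) - (R - k\alpha M + \alpha M)p]$ is inferred from the stated equilibrium point $p_3^*$ and the sign conditions of Cases 1 to 3 rather than quoted from Eq. (4), and $k = 0.6$ is assumed because it reproduces the stated Case 3 boundary $(E - A)/(kM) \approx 3.89$. The function names, the forward Euler scheme, the step size, and the time horizon are all hypothetical choices.

```python
def replicator_rhs(p, alpha, A, R, E, M, k):
    """Assumed form: F(p) = p(1-p)[(A+R-E+alpha*M) - (R-k*alpha*M+alpha*M)p]."""
    gain = (A + R - E + alpha * M) - (R - k * alpha * M + alpha * M) * p
    return p * (1.0 - p) * gain


def evolve(p0, alpha, A=3.0, R=2.0, E=10.0, M=3.0, k=0.6, dt=0.01, steps=2000):
    """Integrate dp/dt = F(p) with forward Euler and return the final proportion."""
    p = p0
    for _ in range(steps):
        p += dt * replicator_rhs(p, alpha, A, R, E, M, k)
        p = min(1.0, max(0.0, p))  # keep p a valid proportion
    return p


if __name__ == "__main__":
    for alpha in (0.5, 2.5, 5.0):          # light, moderate, heavy punishment
        finals = [round(evolve(p0, alpha), 3) for p0 in (0.2, 0.4, 0.6, 0.8)]
        print(alpha, finals)
```

Under these assumptions, light punishment drives the proportion of normal users toward 0, moderate punishment toward the interior point $p_3^*$, and heavy punishment toward 1, regardless of the initial proportion.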
When $p_0$ is small, a larger penalty factor is usually selected to bring the evolution curve closer to the stable point $p_2^*$, but excessive penalties greatly reduce the user's overall expected return. Taking $p_0 = 0.2, 0.4$ as an example, Figure 13 simulates the effect of introducing the adaptive penalty factor on the evolutionary trend of the game model. In Eq. (4), we set the original penalty factor $\alpha_{s0} = 4$. When $p_0$ is small, according to Eq. (7), the adaptive penalty factor $\alpha_s$ keeps the model evolution oscillating in the early stage, so that the evolution curve reaches a reasonable and stable $p$ value, that is, the model has a high percentage of "nonattack" users at an earlier stage. In addition, if the original penalty factor is set too high, the adaptive penalty factor reduces the model's evolution speed but greatly improves the user's overall expected return. The evolutionary process of $\alpha_s$ is shown in Figure 14. $\alpha_s$ hits a higher value instantly and then rapidly oscillates down to a reasonable and stable value. Therefore, the introduction of an adaptive penalty factor can automatically and more effectively guide the setting of the penalty intensity.
Scenario 3. Experiments on data privacy protection.
In this scenario, we will introduce numerical simulations to verify the correctness and applicability of our conclusions. First, we need to show the chaotic behavior of model (8). The core principle of chaotic encryption technology is to rely on the nonlinear nature of chaotic models that are extremely sensitive to initial conditions. This will facilitate the encryption of the main signal and has a high level of confidentiality.
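The sensitivity to initial conditions mentioned above can be illustrated with a much simpler system than model (8). The sketch below deliberately swaps in the logistic map, a textbook chaotic map, as a stand-in; it is not the fractional-order Liu's model, and the function names and numerical values are hypothetical.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1 - x), a textbook chaotic map."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


def separation(x0, delta, steps):
    """Pointwise distance between two orbits whose initial values differ by delta."""
    a = logistic_orbit(x0, steps)
    b = logistic_orbit(x0 + delta, steps)
    return [abs(u - v) for u, v in zip(a, b)]


if __name__ == "__main__":
    gaps = separation(0.2, 1e-10, 100)
    # The gap starts tiny and grows until the two orbits are unrelated.
    print(gaps[5], max(gaps[40:]))
```

Two orbits that start within $10^{-10}$ of each other become uncorrelated after a few dozen iterations, which is why a receiver without the exact initial conditions and parameters cannot reproduce the key signal.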
It has been shown that model (8) exhibits chaotic behavior when $\alpha = 0.95$, which is depicted in (a) and (b) of Figure 15. In fact, the chaotic characteristics of model (8) can be shown in different dimensions, namely in the three planes xOy, xOz, and yOz and in the space O-xyz. We only drew two pictures here, and the chaotic behavior of model (10) can be seen in [56].
In view of Theorem 1, the drive model (9) and the response model (11) are synchronized. As shown in Figure 7, the user needs to encrypt the signal $S(t)$ and send it to the MEC. However, the chaotic model used for user encryption differs from the chaotic model used for MEC decryption. The conclusion of Theorem 1 therefore ensures that two different chaotic models can carry out the encryption and decryption of the signal separately. To verify the effectiveness of the chaotic encryption scheme proposed in this paper, we take the continuous function $S(t) = 3\cos(0.5\pi t)$ as the main signal and encrypt it with the classic n-shift encryption method. The basic principle of n-shift encryption is to apply a nonlinear function $f[x(t), k(t)]$ (see [58,59]), given in (33), to the signal $S(t)$ to obtain the encrypted signal $c(t)$, where the parameter $h$ is chosen such that $S(t)$ and $k(t)$ lie within $(-h, h)$, and $k(t)$, the essential encryption key, is one of the three variables generated by the drive model (9). The decryption process uses the function $f[c(t), -\hat{k}(t)]$, which has the same analytical formula as (33), and $\hat{k}(t)$, the essential decryption key, is one of the three variables generated by the response model (11). The main signal $S(t)$, the encrypted signal $c(t)$, the decrypted signal $\hat{S}(t)$, and the synchronization between the main signal $S(t)$ and the decrypted signal $\hat{S}(t)$ are illustrated in Figures 16-19, respectively. The complete synchronization of the main signal and the decrypted signal verifies the correctness of the proposed encryption scheme. Data security and privacy protection have become one of the core issues in AI. We firmly believe that chaotic encryption technology will play an essential role in secure data transmission and privacy protection.
Figure 14: Adaptive penalty factor $\alpha_s$ changes over time.
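The encryption and decryption round trip can be sketched in a few lines. The following is an illustration under several assumptions, not the authors' implementation: the piecewise wrap-around step is the commonly cited form of the n-shift cipher and stands in for Eq. (33), which is not reproduced in this section; the key is produced by a logistic map as a simple stand-in for a state variable of the drive model (9); the receiver's key is assumed to equal the sender's key exactly (perfect synchronization); and $h = 4$, $n = 30$, the sampling step, and all function names are hypothetical.

```python
import math


def shift_once(x, k, h):
    """One shift step: add k to x and wrap the sum back into [-h, h]."""
    s = x + k
    if s >= h:
        return s - 2.0 * h
    if s <= -h:
        return s + 2.0 * h
    return s


def n_shift(x, k, h, n):
    """Apply the shift step n times with the same key sample k."""
    for _ in range(n):
        x = shift_once(x, k, h)
    return x


def key_stream(length, h, x0=0.37):
    """Stand-in key: logistic-map samples rescaled into (-h, h)."""
    keys, x = [], x0
    for _ in range(length):
        x = 4.0 * x * (1.0 - x)
        keys.append((2.0 * x - 1.0) * 0.99 * h)
    return keys


def demo(h=4.0, n=30, samples=200, dt=0.05):
    """Encrypt and decrypt samples of S(t) = 3cos(0.5*pi*t)."""
    ts = [i * dt for i in range(samples)]
    signal = [3.0 * math.cos(0.5 * math.pi * t) for t in ts]
    keys = key_stream(samples, h)
    cipher = [n_shift(s, k, h, n) for s, k in zip(signal, keys)]
    # Decryption applies the same function with the negated key, assuming the
    # receiver's synchronized key equals the sender's key exactly.
    recovered = [n_shift(c, -k, h, n) for c, k in zip(cipher, keys)]
    return signal, cipher, recovered


if __name__ == "__main__":
    signal, cipher, recovered = demo()
    print(max(abs(s - r) for s, r in zip(signal, recovered)))
```

Because each shift step is addition modulo $2h$ on the interval $[-h, h]$, applying the same number of steps with the negated key undoes the encryption, provided the signal stays strictly inside $(-h, h)$ and both sides hold the same key.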

Conclusions
The integration of MEC and AI is becoming closer owing to the agile response brought by MEC and the high performance of AI technology. However, this edge-enabled AI faces serious security threats. In this paper, we proposed corresponding defense solutions against three security threats in AI model training, AI model inference, and private data. More specifically, we mainly completed the following work:
(1) We developed a cloud-edge collaborative antiattack scheme, which takes full advantage of the proximity of MEC to users to defend against the security threats that AI faces in the training phase, including poisoning attacks and backdoor attacks. The scheme mainly includes the EFU, the DSC, and the MTC. The EFU leverages the computing power of IoT devices and deploys both a secure and a pending model on the edge to filter out suspicious threats based on their output differences. The DSC is used to identify suspicious threats for generating and updating detectors based on the hostile dataset. The MTC trains security models and pending models for deployment to the edge. The scheme designs a three-step model updating pipeline including the filtering stage, the pending stage, and the correction stage. It realizes a reliable incremental updating of AI by ensuring the security of the data generated in the training phase.
(2) We studied an edge-enhanced defense strategy based on an adaptive traceability and punishment mechanism to effectively solve the security problem in the inference stage of the AI model. This strategy makes full use of the advantages of MEC, so that the traceability and punishment mechanism can respond quickly and perform adversarial defense efficiently. In addition, an adaptive punishment mechanism based on evolutionary game theory has been established with the aim of completely suppressing attack behavior and radically eliminating attack intention by setting reasonable penalty factors. Compared with traditional evolutionary games, the introduction of the adaptive penalty factor makes the outcome insensitive to the initial proportions. Since multiple training processes are avoided, this strategy has a lower time consumption than existing defense methods. Therefore, our strategy is built on the fast-response edge layer with an ultrafast adjustment approach to achieve real-time autonomous defense.

(3) We proposed a system model based on chaotic encryption with the three-layer architecture of MEC to effectively guarantee the security and privacy of the data during the construction of AI models. In this model, users (IoT smart devices) deliver the encrypted data to the MEC node, which has the right to decrypt the data and perform training, so that AI can process data and respond quickly at the edge to realize edge intelligence. The MEC forwards the encrypted data to the cloud. The cloud has no right to decrypt the data; it can only perform aggregation training on encrypted data packets and send new models to the edge nodes to continuously update their models, making them more intelligent. Different from traditional homomorphic encryption and methods based on differential privacy, in the chaotic encryption scheme the synchronization between the drive system and the response system is the key to achieving the secure transmission of data.
In this paper, we chose two fractional-order chaotic models with uncertainty and external disturbance as the drive system and the response system to increase the security performance. Theorem 1 proved that the two models can be completely synchronized. The experiment took a continuous data signal as an example to verify the correctness of the theorem and the feasibility of the data encryption scheme. Based on the advantages of low cost, high confidentiality, and wide availability of chaotic encryption, we believe that chaotic encryption will become the most effective technical solution to the data security and privacy problems of AI.

Data Availability
The Dogs vs. Cats data used to support the findings of this study are available on Kaggle (see the hyperlink below).