A Task Scheduling Strategy in Edge-Cloud Collaborative Scenario Based on Deadline

Task scheduling plays a critical role in the performance of edge-cloud collaboration. Whether a task is executed in the cloud and how it is scheduled there are important issues. Subject to satisfying the delay requirement, this paper schedules tasks on edge devices or in the cloud and presents a task scheduling algorithm, based on the catastrophic genetic algorithm (CGA), for the tasks that need to be transferred to the cloud, with the aim of approaching the global optimum. The algorithm combines the total task completion time and a penalty factor into a fitness function. By improving the roulette selection strategy, optimizing the mutation and crossover operators, and introducing a cataclysm strategy, the search scope is expanded and the premature convergence problem of the evolutionary algorithm is effectively alleviated. The experimental results show that the algorithm can address the local optimum issue while significantly shortening the task completion time, subject to satisfying the task delays.


Introduction
With the rise of edge computing, the convergence of cloud computing and edge computing has become a major focus [1][2][3]. As we make great strides towards the digital era of the Internet of Everything, edge-cloud collaboration has become an important application in many scenarios such as CDN, the industrial Internet, energy, intelligent transportation, and security monitoring. Cloud computing and edge computing need to work closely together to better match the various demand scenarios, thus maximizing the value of their collaboration. Take an IoT scenario as an example. The devices in the Internet of Things generate a large amount of data, and uploading all of the data to the cloud for processing puts great pressure on the cloud. To share the pressure of the central cloud node, the edge computing node can be responsible for data calculation and storage within its own scope [4][5][6]. Cloud computing excels in global, non-real-time, long-cycle big data processing and analysis and has an advantage in long-term maintenance, business decision support, etc. Edge computing is more suitable for local, real-time, short-cycle data processing and analysis, and can better support real-time intelligent decision making and execution of local business.
There are some applications with high real-time requirements, such as industrial system detection applications, control applications, executive applications, and emerging VR/AR applications. Some scenarios require real-time performance within 10 ms or even lower [7,8]. If data analysis and processing are all implemented in the cloud, it is sometimes difficult to meet the real-time requirements of the service, which seriously affects the business experience of end customers. However, most studies consider only the offloading process and ignore the assignment of tasks after offloading.
Tasks can be scheduled to the edge or the remote cloud based on energy consumption and time delay. For the tasks that need to be processed in the cloud center, how to perform proper scheduling to achieve the goal is a worthwhile research question.
Task scheduling methods in the cloud center can be divided into heuristic algorithms (such as RR and SJF), metaheuristic algorithms (biologically inspired and swarm intelligence based), and hybrid task scheduling algorithms [9]. In the scheduling process, various performance indicators such as system utilization, execution time, load balance, network communication cost, and delay are used [10]. Heuristic task scheduling algorithms can schedule tasks easily and provide a solution quickly. However, they do not guarantee the best result and easily fall into a local optimum. The metaheuristic algorithm is an improvement on the heuristic algorithm that combines random algorithms with local search algorithms [11][12][13]. It enables both the exploration and the exploitation of the search space and handles a large amount of search space information. In addition, it can use learning strategies to acquire and master information in order to find approximately optimal solutions effectively. Among them, the genetic algorithm (GA), particle swarm optimization (PSO), and the ant colony algorithm (ACO) are the evolutionary algorithms most widely used in task scheduling in recent years [14]. However, these algorithms usually converge prematurely and are prone to local optima. When approaching the optimal solution, they may also oscillate around it, which slows the convergence [15]. In genetic algorithms, the crossover operator is the main operator because of its global search ability, and the mutation operator is the auxiliary operator because of its local search ability. Through these two operators, which cooperate with and monitor each other, genetic algorithms are able to balance global search against local search.
How to make the crossover and mutation operations cooperate effectively, converge faster, and jump out of local optima during the solution process is a valuable research topic for current genetic algorithms. This paper proposes a task scheduling strategy for edge-cloud collaborative computing based on the catastrophic genetic algorithm. Considering the meaningfulness of the crossover operation, the retention of the best individual, and the magnitude of the mutation probability in the evolutionary process, the three genetic operators of the genetic algorithm are improved to optimize its convergence. The objective function combines the execution time with a penalty factor based on the time delay. At the same time, a catastrophe strategy is introduced to simulate the phenomenon of disasters in biological evolution. During the first half of the iterations, premature convergence may occur, and the best chromosome may not improve over successive generations. In that case, we increase the mutation probability to break the monopoly of the original genes, bring individuals far from the current optimal solution into the population, increase the diversity of genes, and create new surviving individuals. The proposed algorithm can jump out of local optima and effectively alleviate the problem of premature convergence. The rest of this article is organized as follows. Section 2 introduces the related work. Section 3 introduces the task classification strategy. Section 4 introduces the task scheduling model in the cloud center. Section 5 introduces the CGA algorithm. Section 6 introduces the experimental and comparison results. Finally, Section 7 summarizes this paper.

Relevant Work
Research on edge-cloud collaboration is still in the initial stage, but many scholars at home and abroad have carried out related research and achieved results on the task scheduling problem at the edge or in the cloud. Ke et al. [16] proposed classifying tasks according to whether they meet the delay and energy consumption requirements. In the scheduling of tasks in the cloud, genetic algorithms are widely studied for their adaptability to various task scheduling problems.
Improvements to the genetic algorithm mainly target the genetic operators, with the purpose of improving the convergence speed and the performance of the classical genetic algorithm. Many excellent algorithms have been proposed and validated experimentally by scholars. Keshanchi et al. [17] proposed an improved heuristic-based genetic algorithm, called N-GA, which is used for static task scheduling in the cloud. Akbari et al. [18] improved the performance of the genetic algorithm by significantly changing the genetic operators to ensure sample diversity and reliable coverage of the entire search space. In [19], a hybrid metaheuristic algorithm is proposed that combines the HEFT (Heterogeneous Earliest Finish Time) algorithm with PSO and GA to improve performance. A Johnson's-rule-based genetic algorithm (JRGA) [20] was proposed for two-stage task scheduling in data centers. In [11], the authors proposed a task scheduling scheme for heterogeneous computing systems built on a genetic algorithm, which maps each task to a processor according to the assigned priority in order to shorten the makespan as much as possible. Goyal and Agrawal [21] proposed a model for scheduling a group of independent tasks on multiple machines and solved the problem by combining the GA with the electoral heuristic algorithm. The goal of this model is also to minimize the maximum completion time. Kumar et al. [22] put forward a new task scheduling method that integrates the min-min algorithm and the max-min algorithm into a genetic algorithm. The goal of the research is to shorten the generation time and execution time to the greatest extent [23].
However, the methods mentioned above may still fall into a local optimum when solving a multimodal problem [10]. Therefore, the algorithm needs some strategy to avoid this limitation. The literature [24][25][26][27][28] describes an integer genetic algorithm that uses a "catastrophe" operator, which is designed to help the search jump out of local extreme points. These works emphasize the bionic significance of the "catastrophe" operator and the improvements that the catastrophic genetic algorithm brings in solving the above problems. These operations can mitigate the phenomena of falling into a local optimum and premature convergence.
In addition, few studies minimize the total time subject to meeting the delay of each task. Therefore, building on research into genetic algorithms, this paper proposes a task scheduling algorithm called CGA based on the cataclysm strategy [29], which minimizes the total execution time while taking the time delay into account. The effectiveness of the proposed algorithm is verified by experiments.

Task Classification
In the system, we consider a set of tasks to be performed, each of which comes from an edge device; the set is denoted as $\mathcal{N} = \{1, 2, 3, \ldots, N\}$. The tasks include interactive gaming, natural language processing, image location, etc. [16]. Each task should be completed within its deadline. Each task is defined by three attributes, $\mathrm{Task}_i = (d_i, \mathrm{expT}_i, \mathrm{data}_i)$. For $\mathrm{Task}_i$, $d_i$ is the size of the input data for the computation, which may include program code, input files, etc. [16]; $\mathrm{expT}_i$ is the deadline for completion of the task; and $\mathrm{data}_i$ is the length of the task. We must first classify the tasks that need to be processed to determine whether to execute them in the cloud. The sensitivity of a task is determined by the ratio of the delay of the task to the length of the task. Finally, the tasks in the cloud are scheduled to reduce the total execution time.
Let $f_i^c$ represent the computing power assigned to $\mathrm{Task}_i$ by the edge device. Thus, the local execution time of $\mathrm{Task}_i$ is
$$T_i^{local} = \frac{\mathrm{data}_i}{f_i^c}. \quad (1)$$
The time to transfer the task to the cloud is defined as
$$T_i^{tran} = \frac{d_i}{Rate}, \quad (2)$$
where $Rate$ is the upload rate of tasks transferred to the cloud; here the upload rate is a fixed value.
To facilitate subsequent task scheduling in the cloud, tasks need to be sorted according to sensitivity. The task sensitivity can be defined as the ratio of the deadline of the task to its length:
$$S_i = \frac{\mathrm{expT}_i}{\mathrm{data}_i}. \quad (3)$$
The complete task classification process is illustrated in Algorithm 1.
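The classification step can be illustrated with a short Python sketch. Algorithm 1 is not reproduced in the text, so the decision rule used here (keep a task on the edge device when its local execution time meets the deadline, otherwise send it to the cloud) is an assumption made for illustration only, as are the names `Task`, `classify`, `local_time`, `transfer_time`, and `sensitivity`.

```python
from dataclasses import dataclass


@dataclass
class Task:
    d: float      # size of the input data, d_i
    exp_t: float  # deadline, expT_i
    data: float   # length of the task, data_i


def local_time(task, f_c):
    """Local execution time: task length over the edge computing power."""
    return task.data / f_c


def transfer_time(task, rate):
    """Upload time to the cloud at a fixed upload rate."""
    return task.d / rate


def sensitivity(task):
    """Ratio of the deadline to the task length."""
    return task.exp_t / task.data


def classify(tasks, f_c):
    """Assumed rule (not taken from Algorithm 1): a task stays on the edge
    if it can meet its deadline locally; otherwise it goes to the cloud.
    Cloud tasks are returned sorted by sensitivity, tightest first."""
    edge, cloud = [], []
    for task in tasks:
        if local_time(task, f_c) <= task.exp_t:
            edge.append(task)
        else:
            cloud.append(task)
    cloud.sort(key=sensitivity)
    return edge, cloud
```

Sorting in ascending order of sensitivity places the tasks with the least slack per unit of work first; the opposite order may be what the authors intended.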

Task Scheduling Model in the Cloud Center
The task scheduling problem in the cloud is how to reasonably assign each task to one of multiple virtual machines so that all tasks are completed in a shorter execution time while meeting the delay requirements as far as possible [10]. Here, the following assumptions are made:
(1) There is no interdependence between tasks.
(2) The size of each task and the computing speed of each virtual machine are known.
Definition 1. Virtual machines on physical machines:
$$PM_i = \{VM_1, VM_2, \ldots, VM_{Nvm}\},$$
where $PM_i$ represents the host machine, $Nvm$ represents the number of virtual machines, and $VM_k$ represents the kth virtual machine resource in the cloud environment.
Definition 2. Virtual machine resources:
$$VM_k = \{IDV_k, MIPS_k\},$$
where $IDV_k$ is the serial number of the virtual machine and $MIPS_k$ represents the computing power of the kth virtual machine.
Definition 3. Task sequence:
$$\mathrm{Task} = \{\mathrm{Task}_1, \mathrm{Task}_2, \ldots, \mathrm{Task}_{Ntsk}\},$$
where $Ntsk$ represents the number of tasks that need to be performed in the cloud and $\mathrm{Task}_i$ represents the ith task in the task sequence.
The execution time required for each task to run on a computing resource (virtual machine) is calculated as follows:
$$ECT_{i,j} = \frac{\mathrm{data}_i}{MIPS_j}.$$
Let a set of tasks be assigned to the kth virtual machine; then, the task completion time $RT(k)$ on the kth virtual machine is
$$RT(k) = \sum_{i=1}^{Ntsk} a_{i,k}\, ECT_{i,k}.$$
$AllNTime$ is the maximum completion time over all computing resources:
$$AllNTime = \max_{1 \le k \le Nvm} RT(k),$$
where $a_{i,j} \in \{0, 1\}$ indicates whether the task numbered i is executed on the virtual machine numbered j; a value of 1 means that it is.
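As a worked illustration of this model, the following Python sketch computes the per-VM completion times and the maximum completion time for a given assignment. The function names and the 0-based VM indices are choices made for this example and do not come from the paper.

```python
def ect(length, mips):
    """Execution time of a task of the given length on a VM of the given MIPS."""
    return length / mips


def completion_times(assignment, lengths, mips):
    """RT(k) for every VM k. assignment[i] is the (0-based) VM of task i."""
    rt = [0.0] * len(mips)
    for i, k in enumerate(assignment):
        rt[k] += ect(lengths[i], mips[k])
    return rt


def all_n_time(assignment, lengths, mips):
    """AllNTime: the largest completion time over all VMs."""
    return max(completion_times(assignment, lengths, mips))
```

For example, with task lengths `[100, 200, 300]`, VM speeds `[100, 50]`, and assignment `[0, 1, 0]`, both VMs finish after 4 time units, so the maximum completion time is 4.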

Algorithmic Thought.
The three genetic operations of the genetic algorithm affect its convergence speed. This paper mainly considers satisfying the delay and minimizing the total execution time, and it improves the selection, crossover, and mutation operations of the genetic algorithm. A new generation of the population is generated in each iteration, and the catastrophic phenomenon of biological evolution is simulated during the iterative process. This allows the algorithm to increase individual diversity without expanding the population size, making it easier to escape local optimum traps. The algorithm flow chart is shown in Figure 1.

Encoding.
In cloud computing scheduling problems, solutions are usually encoded with binary coding or real coding, where real coding is a many-to-one mapping pairing encoding. In this paper, tasks and virtual machines are coded by the mapping pairing method [2]. For example, if there are M VMs, $\{v_1, v_2, v_3, \ldots, v_M\}$, and N tasks, $\{\mathrm{Task}_1, \mathrm{Task}_2, \mathrm{Task}_3, \ldots, \mathrm{Task}_N\}$, the length of the code is N and each gene takes a value from 1 to M, as shown in Figure 2.
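A minimal Python sketch of this encoding follows. The helper names `random_chromosome` and `decode` are hypothetical; the only property taken from the text is that a chromosome has one gene per task and each gene is a VM number from 1 to M.

```python
import random


def random_chromosome(n_tasks, n_vms, rng):
    """One gene per task; gene i holds the VM number (1..M) of task i."""
    return [rng.randint(1, n_vms) for _ in range(n_tasks)]


def decode(chromosome):
    """Group task numbers (1-based) by the VM they are mapped to."""
    mapping = {}
    for task_no, vm in enumerate(chromosome, start=1):
        mapping.setdefault(vm, []).append(task_no)
    return mapping
```

For instance, the chromosome `[2, 1, 2, 3]` maps tasks 1 and 3 to VM 2, task 2 to VM 1, and task 4 to VM 3, which is the many-to-one pairing described above.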

Fitness Function.
The fitness function represents the degree of an individual's fitness in the evolutionary process. The greater the fitness, the more likely the individual is to be retained in the evolutionary process. The fitness function directly affects the performance of the algorithm and whether it can achieve the goal. In this paper, we need to consider the effect of both time delay and execution time on individual fitness. The difference between the execution time and the deadline of each task is
$$\sum_{i \in [1,i]} a_{i,j}\, ECT_{i,j} + T_i^{tran} - \mathrm{expT}_i. \quad (12)$$
A penalty factor is then applied based on whether the delay is satisfied. Because the goal is to minimize the total execution time of the task scheduling while meeting the deadlines of the tasks, the fitness function in this paper is designed to combine the total completion time with this penalty factor.
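The exact penalty factor and fitness formula are not recoverable from the text, so the following Python sketch shows only one plausible form, under clearly stated assumptions: tasks on the same VM run in chromosome order, a task is late when its accumulated execution time plus transfer time exceeds its deadline, and the fitness is the reciprocal of the maximum completion time scaled down by a hypothetical `penalty` weight for every missed deadline. VM indices are 0-based here for brevity.

```python
def lateness(chromosome, lengths, mips, t_tran, deadlines):
    """Per-task difference between finish time and deadline, plus the makespan.
    Assumes tasks mapped to the same VM execute in chromosome order."""
    vm_clock = [0.0] * len(mips)
    late = []
    for i, vm in enumerate(chromosome):
        vm_clock[vm] += lengths[i] / mips[vm]
        late.append(vm_clock[vm] + t_tran[i] - deadlines[i])
    return late, max(vm_clock)


def fitness(chromosome, lengths, mips, t_tran, deadlines, penalty=2.0):
    """Assumed form (not the paper's formula): shorter makespan and fewer
    missed deadlines both give a larger fitness value."""
    late, makespan = lateness(chromosome, lengths, mips, t_tran, deadlines)
    missed = sum(1 for x in late if x > 0)
    return 1.0 / (makespan * (1.0 + penalty * missed))
```

With this form, a schedule that meets every deadline is ranked purely by its completion time, while each violated deadline divides the fitness further. Other penalty shapes, such as one proportional to the amount of lateness, would fit the description equally well.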

Improved Roulette Selection.
The roulette selection method is also called the proportional selection method. The basic idea is that the greater an individual's fitness, the more likely it is to be selected. The traditional roulette method tends to select the best individual, but it cannot guarantee that the best individual survives into the next generation, and the subsequent crossover operation may destroy it. Therefore, this paper combines roulette with best-individual preservation: the individual with the greatest fitness in each generation is copied directly to the next generation and does not participate in the crossover or mutation operations. The remaining individuals use traditional roulette to select the progeny population. The probability $P_s(j)$ that individual j is selected in traditional roulette is
$$P_s(j) = \frac{f(j)}{\sum_{i=1}^{n} f(i)},$$
where $f(j)$ is the fitness of individual j and n is the population size.
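The selection scheme described above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: the function name `select` is hypothetical, and the choice to spin the wheel only over the non-elite individuals is one interpretation of "the remaining individuals".

```python
import random


def select(population, fitnesses, rng):
    """Return (elite, rest): the best individual is preserved unchanged,
    and the other slots are filled by fitness-proportional roulette."""
    n = len(population)
    best = max(range(n), key=lambda i: fitnesses[i])
    elite = list(population[best])
    others = [i for i in range(n) if i != best]
    if not others:
        return elite, []
    weights = [fitnesses[i] for i in others]
    picks = rng.choices(others, weights=weights, k=n - 1)
    # Copy each pick so that later crossover or mutation cannot alias parents.
    rest = [list(population[i]) for i in picks]
    return elite, rest
```

Only `rest` would be passed on to crossover and mutation; `elite` is appended to the next generation untouched, which guarantees that the best fitness never decreases from one generation to the next.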

Crossover.
The crossover operation of the traditional genetic algorithm selects the individuals to cross according to the crossover rate, generates a crossover point for each pair of crossing individuals using the random function rand(1, n), and exchanges the segments of the two chromosomes after that point. Traditional crossover operations often meet pairs whose crossed fragments are highly similar, in which case the crossover has little effect. To this end, this paper sets a crossover threshold on the proportion of similar genes in the total genes: a crossover is considered meaningful and is performed only when the similarity of the two parents does not exceed the threshold; otherwise, no crossover occurs. This operation is mainly based on the principle of preventing inbreeding and optimizing offspring in the process of human evolution. In this paper, we set the threshold to 0.8 and the crossover probability higher than 0.7, so that abandoning crossovers whose similarity is too high does not slow down the convergence. The specific crossover operation is shown in Figure 3.
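A minimal Python sketch of a threshold-guarded one-point crossover is given below. It assumes that similarity is measured as the fraction of positions at which the two parents carry the same gene; the function names are hypothetical.

```python
import random


def similarity(a, b):
    """Fraction of positions at which both chromosomes carry the same gene."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def crossover(a, b, rng, threshold=0.8):
    """One-point crossover, skipped when the parents are too similar.
    Chromosomes must have at least two genes."""
    if similarity(a, b) > threshold:
        return list(a), list(b)          # crossover would be meaningless
    point = rng.randint(1, len(a) - 1)   # cut strictly inside the chromosome
    return a[:point] + b[point:], b[:point] + a[point:]
```

Two parents that agree on nine of ten genes (similarity 0.9) are returned unchanged, while two parents with no gene in common always exchange their tails.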

Variation.
The mutation operator is a very important operation. There are two purposes for introducing mutation into genetic algorithms. The first is to give the genetic algorithm a local random search ability. When the crossover operator has brought the genetic algorithm close to the neighborhood of the optimal solution, the local random search ability of the mutation operator can accelerate the convergence to the optimal solution [30]. In this case, the mutation probability should take a smaller value. The second is to enable the genetic algorithm to maintain population diversity and so prevent premature convergence. In this case, the mutation probability should take a larger value. The mutation probability usually takes a small value and generally does not exceed 0.1.
In this paper, two mutation probability values are set. When the number of iterations reaches 2/3 of the maximum, the mutation probability is reduced by 0.02. The number of individuals to be mutated is determined from the mutation probability; for each of them, two positions on the chromosome are selected at random and the values of the two genes are exchanged. The gene values may not change after the mutation operation is executed (when the two selected genes are identical), which is equivalent to no mutation. Because mutation is already a low-probability event, the mutation operation is improved to ensure that it actually takes effect. If the two selected gene values are the same, 1 is added to the first random position, and the gene at the new position is exchanged with the other gene. If the values are still the same, the position is incremented by one again until the values differ, which ensures that the mutation takes effect (see Figure 4).
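The improved swap mutation and the two-level mutation probability can be sketched in Python as follows. Wrapping the incremented position around to the start of the chromosome is an assumption added here so that the search always terminates; the base probability of 0.03 is the value given in the parameter settings of the evaluation, and the function names are hypothetical.

```python
import random


def mutation_probability(t, max_gen, base=0.03, drop=0.02):
    """Mutation probability for generation t: reduced once t reaches 2/3 of max_gen."""
    return base - drop if 3 * t >= 2 * max_gen else base


def mutate(chromosome, rng):
    """Swap two genes; if they are equal, advance the first position
    (wrapping around, an assumption) until the two genes differ.
    Chromosomes must have at least two genes."""
    genes = list(chromosome)
    n = len(genes)
    i, j = rng.sample(range(n), 2)
    for _ in range(n):
        if genes[i] != genes[j]:
            break
        i = (i + 1) % n
    # If every gene is identical, this swap is a harmless no-op.
    genes[i], genes[j] = genes[j], genes[i]
    return genes
```

Because the first position visits every index at most once, the loop finds a differing gene whenever one exists, so a chromosome that is not uniform is always changed by the mutation.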

Catastrophe.
After many generations of evolution, the population may settle on a locally optimal solution. At this point, the population carries a large amount of information related to the local optimum, tending to premature convergence, with little possibility of jumping out through operators such as crossover and mutation. Introducing a "catastrophe" strategy makes it possible to obtain some useful global information and, with high probability, a solution far away from the original locality, so that greater diversity can be obtained with a smaller population size. This provides more opportunities to escape the original local optimal solution. However, the catastrophe cannot be applied throughout the whole evolution. We should avoid destroying the optimal solution and having to reoptimize in the later stage.
The genetic algorithm has the disadvantages of easily falling into a local optimum and converging prematurely [31]. Once it falls into a local optimum, it is difficult to jump out. For this reason, we add the catastrophic strategy mentioned in the literature [28] to this paper. By increasing the mutation probability, solutions that are far from the current optimal solution are included in the population, which allows the search to jump out of the local optimal solution. The catastrophic operation is shown in Algorithm 2.
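As a rough illustration of the catastrophe step, the Python sketch below keeps the best individual and re-randomizes the genes of every other individual with a high probability. The parameter `p_cat`, the function names, and the decision to protect only the single best individual are assumptions made for this example; the text states only that the mutation probability is increased so that solutions far from the current optimum enter the population.

```python
import random


def update_counter(cat, prev_best, curr_best):
    """Decrease the catastrophe counter when the best fitness has not changed."""
    return cat - 1 if curr_best == prev_best else cat


def catastrophe(population, best_index, n_vms, rng, p_cat=0.5):
    """Keep the best individual; mutate every gene of the others
    with the (assumed) high probability p_cat. Genes are VM numbers 1..n_vms."""
    new_pop = []
    for idx, chrom in enumerate(population):
        if idx == best_index:
            new_pop.append(list(chrom))
        else:
            new_pop.append([rng.randint(1, n_vms) if rng.random() < p_cat else g
                            for g in chrom])
    return new_pop
```

In a full run, `catastrophe` would be triggered only when the counter reaches zero during the first half of the iterations, so that the later generations can refine the best solution without disruption.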

Task Classification and Scheduling Description
Step 1: classify all tasks from different devices according to Algorithm 1.
Step 2: for tasks that need to be offloaded to the cloud, sort by sensitivity. The initial coding is optimized according to the computing power of the virtual machines.
Step 3: chromosome coding and initialization of parameters.
Step 6: judge whether the optimal individual fitness of the (t − 1)th generation is equal to that of the tth generation; if so, the catastrophe threshold is reduced by one; otherwise, continue.
Step 7: perform the selection, crossover, and mutation operations.
Step 8: generate the descendant population and determine whether the catastrophe threshold cat is equal to 0 (within the first half of the iterations). If it is equal to 0, carry out the catastrophe operation.
Step 9: if the number of iterations reaches the maximum, output the result; otherwise, return to step 4.
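The overall loop can be summarized by the compact Python sketch below. It is a simplified illustration rather than the authors' implementation: the fitness is the reciprocal of the maximum completion time only (no deadline penalty), the operators are the plain textbook versions, and `pop_size`, `cat_init`, and `p_cat` are hypothetical parameters. Steps 4 and 5 are not listed in the text, so this sketch assumes that they evaluate fitness and record the best individual.

```python
import random


def makespan(chrom, lengths, mips):
    """Maximum completion time over all VMs (0-based VM indices)."""
    load = [0.0] * len(mips)
    for i, vm in enumerate(chrom):
        load[vm] += lengths[i] / mips[vm]
    return max(load)


def run_cga(lengths, mips, pop_size=20, max_gen=200, cat_init=5,
            p_cross=0.8, p_mut=0.03, p_cat=0.5, seed=0):
    rng = random.Random(seed)
    n, m = len(lengths), len(mips)

    def fit(c):
        return 1.0 / makespan(c, lengths, mips)

    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    best = list(max(pop, key=fit))
    cat = cat_init
    for t in range(max_gen):
        # Selection: roulette for all slots except the one kept for the elite.
        weights = [fit(c) for c in pop]
        children = [list(c) for c in rng.choices(pop, weights=weights, k=pop_size - 1)]
        # One-point crossover on consecutive pairs.
        for a, b in zip(children[::2], children[1::2]):
            if rng.random() < p_cross:
                p = rng.randint(1, n - 1)
                a[p:], b[p:] = b[p:], a[p:]
        # Swap mutation.
        for c in children:
            if rng.random() < p_mut:
                i, j = rng.sample(range(n), 2)
                c[i], c[j] = c[j], c[i]
        pop = [list(best)] + children
        new_best = max(pop, key=fit)
        if fit(new_best) == fit(best):
            cat -= 1                      # best fitness stagnated
        best = list(new_best)
        # Catastrophe: only in the first half of the run, elite is protected.
        if cat == 0 and t < max_gen // 2:
            pop = [pop[0]] + [[rng.randrange(m) if rng.random() < p_cat else g
                               for g in c] for c in pop[1:]]
            cat = cat_init                # assumed reset after a catastrophe
    return best, makespan(best, lengths, mips)
```

A call such as `run_cga([10, 20, 30, 40, 50, 60], [10, 20], max_gen=50)` returns an assignment of the six tasks to the two VMs together with its maximum completion time. Because the elite is always carried over, the best fitness never gets worse from one generation to the next.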

Evaluation
In this experiment, for tasks that need to be processed in the cloud, we used CloudSim 3.0 to implement the algorithms. By adding the bindCloudletToVM method in the DatacenterBroker class, the CGA algorithm is added to carry out the simulation experiment. Data such as resource computing power and task computation amounts are randomly generated in MATLAB. We choose different numbers of tasks, and the experimental data for different numbers of iterations are analyzed and compared with the time-based differential evolution algorithm (TDE) and the simple genetic algorithm under the same data conditions. The TDE algorithm is a task scheduling algorithm based on differential evolution (DE) that minimizes the completion time. The differential evolution algorithm is also a population-based heuristic search algorithm. There is great similarity between the differential evolution algorithm and the genetic algorithm: both include mutation, crossover, and selection operations, but the specific definitions of these operations differ. The experimental results are shown in the figures.
Parameter setting: the crossover probability is 0.8, the maximum number of generations is 200, and the mutation probability is 0.03. To reduce random error as much as possible, each group of experiments is performed ten times, and the total task completion time is taken as the average of the ten runs.
When the number of tasks is small, the optimization effect is not obvious. However, the optimization achieved by the algorithm is more obvious when the number of tasks is large. The more tasks there are, the fewer tasks are offloaded to the cloud, because as the number of tasks increases, the tasks take longer to execute. As the number of generations increases, the proposed algorithm converges more quickly and saves more time. Figures 5 and 6 show the changes of total task execution time and fitness value of the CGA algorithm, the classical genetic algorithm, and the TDE algorithm under different numbers of iterations. It can be seen that the classical genetic algorithm performs worst. The CGA algorithm needs fewer generations than the other algorithms to reach a better average fitness. This paper also optimizes the initial population, so the CGA algorithm can find the optimal solution faster. The solution found by a genetic algorithm may not be optimal, but the experimental results show that the CGA algorithm is better than the other two algorithms and jumps out of local optima more easily. Figures 7 and 8 compare the CGA algorithm with the TDE algorithm. We can see that the CGA algorithm achieves the goal of this paper better. Figure 9 shows that the delay satisfaction rate in the experiment is above 95%, which can meet the demand, and that the CGA algorithm performs better than the TDE algorithm.

ALGORITHM 2: Catastrophic operation.
(1) Input: Catastrophe threshold cat;
(2) cat = a;
(3) For (t = 0; t < G/2; t++)
…
If (cat = 0)
(10) The first third variation;
(11) Else
(12) Continue circulation;
(13) }
In addition, the CGA algorithm is superior to the TDE algorithm in both the task completion time and the convergence speed of the evolutionary process, and its convergence speed is significantly better than that of the TDE algorithm and the traditional genetic algorithm. As the number of iterations increases, the CGA algorithm finds better solutions and converges faster. The mutation strategy called the cataclysm policy is designed to help the population jump out of local extreme points [27]. It can be seen that the catastrophic strategy in this paper neither slows down the convergence rate nor destroys the direction of optimization. Instead, it helps the operators to keep optimizing the population and makes the search less likely to fall into a local optimum.

Conclusion and Future Work
Whether a task is executed in the cloud and how it is scheduled in the cloud is an important issue. In the past, many heuristic and metaheuristic task scheduling strategies have been used in cloud computing or edge computing. Genetic algorithms have unique advantages that traditional methods lack in solving complex problems involving large search spaces, nonlinearity, and global optimization, and they have been used in more and more fields. In this paper, we proposed a task scheduling strategy under a deadline constraint, in which tasks on edge devices can be executed either in the cloud or on local devices, and the goal is to minimize the execution time of all tasks. The CGA algorithm is an alternative method for solving the task scheduling problem that adds a cataclysm strategy to the genetic algorithm. We have considered the time constraint [5] and optimized the task scheduling. The CGA algorithm was inspired by the extinction events of the Ice Age, and it is used as a global optimization algorithm [10]. The proposed CGA algorithm was simulated in the CloudSim environment, and the main objective was to minimize the execution time while meeting the delay.
The results are compared with those of existing heuristic methods such as the traditional genetic algorithm (GA) and the time-based differential evolution algorithm (TDE). From the experimental results, we can conclude that the proposed CGA can efficiently schedule the tasks to the VMs and achieve our goals.
In the future, we will consider improving the algorithm under conditions that are closer to the actual environment so that it can be applied to dynamic and real-time task scheduling in edge-cloud collaboration. We also want to build a multi-objective version of CGA for optimizing the task scheduling problem in the cloud. The study of workflow scheduling using CGA is another direction for future investigation, and we can also mine or forecast its potential relationships [32][33][34]. In addition, the task scheduling method can consider many other parameters, such as memory usage, peak demand, and overloads [10]. Finally, we can combine the Markov chain with a parallel computing framework and apply it in our model [35,36].

Data Availability
Because this paper deals only with time and static tasks, we used randomly generated data, exported as a dataset of task lengths.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.