Fault Tolerant PLBGSA: Precedence Level Based Genetic Scheduling Algorithm for P2P Grid

Due to monetary limitations, small organizations cannot afford high-end supercomputers to solve highly complex tasks. P2P (peer-to-peer) grid computing is now used to break a complex task into subtasks that are solved on different grid resources. Workflows are used to represent these complex tasks. Finishing such a task in a P2P grid requires scheduling the subtasks of the workflow in an optimized manner, and several factors play a part in scheduling decisions. The genetic algorithm is very useful in scheduling DAG (directed acyclic graph) based tasks because it takes multiple criteria into consideration while scheduling. In this paper, we propose a precedence level based genetic algorithm (PLBGSA), which yields schedules for workflows in a decentralized fashion. PLBGSA is compared with existing genetic algorithm based scheduling techniques. Fault tolerance is a desirable trait of a P2P grid scheduling algorithm due to the untrustworthy nature of grid resources. PLBGSA handles faults efficiently.


Introduction
As the complexity of computational problems increases continuously, efficient use of computational resources becomes vital. Complex tasks cause performance bottlenecks throughout the technical arena. Organizations around the world use high-end computational devices, servers, and supercomputers to handle complex tasks. However, not all organizations can purchase such devices because of budget constraints. Grid computing has emerged as a way to solve highly complex tasks [1,2]. A grid utilizes existing heterogeneous computational devices spread across multiple geographical locations [3]. This unification of computational resources yields a manifold increase in computational capabilities. Initially, researchers used central scheduler based scheduling algorithms to solve complex problems [4]. These techniques were effective in scheduling [5] complex tasks; however, they have many limitations. Failure of the central scheduler causes the entire grid to collapse [6]. The limited capabilities of the central scheduler give rise to scalability issues. Policies that vary from company to company, as well as political issues, also make a central scheduler problematic [6].
A metascheduler addresses the limitations of a central scheduler to some extent [7]. In metascheduling, every cluster has its own scheduler. DAG based tasks [8] are scheduled over the most capable cluster. A problem with global task scheduling arises when no cluster is capable of executing a complex computational task. The drawback of a metascheduler is that it cannot execute gigantic tasks using minuscule clusters and single computational resources spread across various geographical domains. P2P technologies [9] are effective enough to act as a decentralized grid scheduler. Decentralization makes the grid robust against grid node failures. Moreover, the structure of a P2P grid does not cause scalability issues and other bottlenecks. Further, complex problems are solved efficiently using genetic algorithms. P2P [10] grids also use genetic algorithms to obtain good results [11]. Initially, to get results quickly, researchers scheduled independent gigantic tasks [11], generated on a single grid node, over P2P grid resources [12]. Parallel execution of such tasks over various P2P resources produced results quickly. DAG based tasks [13] require extra precision when scheduled over a grid. Intertask dependencies make it difficult to schedule subtasks of a DAG based task as efficiently as independent tasks. Researchers in [14] have used a genetic algorithm to schedule subtasks of a DAG based task [15], finding the schedule for the whole DAG in one go. The probability of finding nearly optimal results decreases as tasks of the DAG are divided across various precedence levels.
Our approach applies the genetic algorithm to obtain the schedule for the subtasks of one precedence level at a time. If there is a single subtask at a precedence level, we schedule it on the P2P grid resource that returns results fastest. In this way, the subtasks of a DAG based task are scheduled over P2P grid resources from one precedence level to the next. The probability of finding a nearly optimal schedule is higher with the approach adopted in this paper.
The rest of the paper is organized as follows. The literature review is given in Section 2. The background of the genetic algorithm and of DAG based task scheduling using it is also explained in Section 2. We propose the fault tolerant precedence level based genetic scheduling algorithm for P2P grid in Section 3. In Section 4, we present and discuss simulation results. Conclusion and future scope of work are discussed in Section 5. The symbols used throughout the paper are presented in the Abbreviations.

Literature Review of Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Tasks
Holland first explained the genetic algorithm in 1975. In the last decade, genetic algorithms have been used by various researchers to schedule tasks over a grid. Both independent and interdependent tasks have been scheduled using genetic algorithms. Estimation of distribution algorithms (EDAs) are a new class of evolutionary algorithms. In EDAs, promising schedules are obtained by means of a probabilistic model. EDAs give better schedules than the evolutionary algorithms mentioned in [16]. In Section 2.1, we describe some remarkable decentralized scheduling techniques that use a genetic algorithm to schedule independent tasks. Section 2.2 highlights a decentralized scheduling algorithm that employs a genetic algorithm to schedule the subtasks of a workflow.

Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Independent Tasks. One of the eminent papers that applied a genetic algorithm to schedule independent tasks over heterogeneous resources in a fully decentralized fashion is [17], which proposed scheduling applications using genetic algorithms (SAGA). In SAGA, computational nodes can join and leave the system dynamically. The system utilizes lookup services to work as a decentralized scheduler. SAGA emphasizes splitting the task sets and heterogeneous resources into subparts, and the algorithm is run on each subpart. In SAGA, the user first puts forward a scheduling request. The grid monitoring service (MonALISA) [18] supplies monitoring data, which is combined with the scheduling request to obtain a near optimal schedule. Execution services then execute this near optimal schedule. A discovery service is provided in SAGA to handle failures and incorporate new computational resources. Schedule and task information of executed jobs is provided as feedback to the user. This algorithm decreases the number of generations a genetic algorithm requires to yield an efficient schedule. It was shown in [11] that decentralized grid scheduling is achievable using P2P techniques. In [11], after authorization and authentication checks, any grid node can issue a job submission query. Cyclone is recursively accessed to find  ×  nodes.  is a parameter of the algorithm. Nodes which decrease the optimization function give the first schedule. It is impossible to investigate all possible permutations of  nodes out of the total  × . Only in two cases ( is exceptionally small or  ≈ 1) can we investigate all permutations. Thus, a genetic algorithm is used for the selection process to obtain nearly optimal schedules. A limitation of this algorithm is that, like SAGA, it can only schedule independent tasks.

Decentralized Scheduling Techniques Using Genetic Algorithm to Schedule Subtasks of DAG Based Task. In a grid environment, a genetic algorithm [19] was used to schedule [20] a DAG based task in [14]. Scheduling the subtasks of a DAG based task over a grid is an NP-hard problem. Genetic algorithms and other stochastic search algorithms are therefore utilized to obtain a near optimal schedule for the subtasks of a DAG based task on grid nodes.
The DAG based workflow that we use to implement the approach of [14] is shown in Figure 1(a). DAG based task t is divided into 10 subtasks, which are grouped into four precedence levels. Subtasks at the same precedence level are executed in parallel on separate grid nodes. Once all subtasks of the previous precedence level have returned their results to the origin node, the subtasks at the next precedence level start executing in parallel. The virtual network topology that we use to simulate [14] is shown in Figure 1(b). DAG based task t is generated at the origin node (0). We used a genetic algorithm to schedule the subtasks of t over grid nodes 0, 1, 2, and 3.
Figure 2 shows how [14] uses a genetic algorithm to schedule the subtasks of DAG based task t. The initial population of schedules is produced in [14] by arbitrarily assigning every subtask to a grid node. Offspring for the next generation are chosen from the initial population using the roulette wheel selection method. The authors apply genetic operators, namely crossover and mutation, to these shortlisted schedules to obtain the rest of the population for the given generation. When the population stagnates, the probability of mutation is increased. From the population, the best DNA, representing a schedule for the subtasks of t, is chosen.
The subtasks of t are divided into precedence levels. Subtasks at the same precedence level can be executed in parallel on different grid nodes. A prerequisite for the subtasks at level l to start execution is that the subtasks at level l − 1 have finished and returned their results. This sequence is followed in the DNA representation of the schedule. The subtask at the first level is the first subtask at the top of the DNA. If more than one subtask is assigned to a node, the subtask that comes first in the DNA is executed first. Figure 3 explains the sequence in which subtasks of t assigned to the same node are executed. In Figure 3, subtasks at the same precedence level have the same color. Subtasks of level 3 are shown in orange. Level 3 subtasks start executing once the subtasks at level 2 have finished and delivered their results. Level 2 subtasks are shown as green boxes and execute in parallel on different nodes. If a grid node leaves the grid, all subtasks of t have to be rescheduled using [14].
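The DNA ordering rule described above can be illustrated with a minimal Python sketch. The DNA encoding as a list of (subtask, node) pairs, the subtask names, and the helper `per_node_order` are assumptions made for illustration; they are not taken from [14].

```python
from collections import defaultdict

# Hypothetical DNA: (subtask, node) pairs listed from the top of the DNA down,
# with subtasks of earlier precedence levels appearing first.
dna = [("t0", 0), ("t1", 1), ("t2", 0), ("t3", 2), ("t4", 1), ("t5", 0)]


def per_node_order(dna):
    """Return, for each node, its subtasks in the order they appear in the DNA."""
    queues = defaultdict(list)
    for subtask, node in dna:
        queues[node].append(subtask)
    return dict(queues)


if __name__ == "__main__":
    for node, subtasks in sorted(per_node_order(dna).items()):
        print(f"node {node}: {subtasks}")
```

A node that receives several subtasks simply runs them in DNA order, which is why placing earlier precedence levels first in the DNA preserves the level-by-level execution sequence.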
In a P2P grid, nodes can leave freely. Hence, our algorithm exploits the precedence level approach to handle node failures.
We propose a precedence level based genetic algorithm (PLBGSA) to schedule subtasks of DAG based task t. PLBGSA schedules one precedence level at a time and finally combines the efficient schedules of the subtasks at all precedence levels to obtain the finish time for task t. In a DAG based task, subtasks have intertask dependencies. In addition to scheduling subtasks over various grid nodes, we have to find the associated computation and communication cost [21]. We store on all P2P nodes the list mentioned in [22]. This list is modified whenever scheduling happens, and neighbors modify their lists accordingly. In [23], the authors put forward the concept of the workload (computing field) of a subtask over any P2P grid resource. It is given in (1), where Pr represents the total number of processing elements present in the P2P grid node and $MIPS_{Pr}$ gives the number of million instructions per second that a single processing element can process. The remaining term in (1) is the size, in million instructions, of each subtask waiting in the task queue on the grid node. Communication cost [22] is the time to send a subtask from one node to another and is given in (2), where Wsz is the window size and RT is the round trip time between the two nodes; the size of the subtask is expressed in Kb. The start of a subtask also depends on the time consumed to finish the subtasks at the previous precedence level and to return their results to the origin node.
Previous algorithms have used a DNA containing details of all subtasks of DAG based task t. The genetic algorithm was applied using an initial population of randomly generated DNAs. Task t shown in Figure 1(a) is divided into precedence levels. In our proposed approach, when a precedence level contains only one subtask, we need not apply the genetic algorithm to that subtask. Instead, we calculate the finish time of the subtask on all available P2P grid nodes and schedule it on the node that gives the fastest result. As shown in Figure 4, the single subtask 0 is scheduled without using the genetic algorithm. The scheduling value for subtask 0 is stored in the list. This value is taken as a prerequisite for scheduling the subtasks at the next level.
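The single-subtask rule above (compute the finish time on every node and pick the minimum) can be sketched as follows. Since equations (1) and (2) are not reproduced in this text, the formulas in `workload` and `transfer_time` are assumed forms derived only from the symbol definitions, and the rule that execution starts at the largest of the three delays is likewise an assumption. The class `Node` and all field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    pe_count: int            # Pr: number of processing elements
    mips_per_pe: float       # MIPS of a single processing element
    queue_mi: list = field(default_factory=list)  # sizes (MI) of waiting subtasks
    rtt_s: float = 0.0       # round trip time to the origin node, in seconds
    window_kb: float = 64.0  # window size, in Kb


def workload(node):
    # Assumed form of (1): queued instructions divided by the node's capacity.
    return sum(node.queue_mi) / (node.pe_count * node.mips_per_pe)


def transfer_time(node, size_kb):
    # Assumed form of (2): one round trip per window of data sent.
    return node.rtt_s * (size_kb / node.window_kb)


def finish_time(node, size_mi, size_kb, prev_level_done=0.0):
    # Assumption: the subtask starts once the node is free, the data has
    # arrived, and the previous precedence level has returned its results.
    start = max(workload(node), transfer_time(node, size_kb), prev_level_done)
    return start + size_mi / (node.pe_count * node.mips_per_pe)


def best_node(nodes, size_mi, size_kb, prev_level_done=0.0):
    """Pick the node with the minimum finish time for a single subtask."""
    return min(nodes, key=lambda n: finish_time(n, size_mi, size_kb, prev_level_done))


if __name__ == "__main__":
    nodes = [
        Node(name="N0", pe_count=2, mips_per_pe=500, queue_mi=[1000]),
        Node(name="N1", pe_count=4, mips_per_pe=500, rtt_s=0.05),
    ]
    chosen = best_node(nodes, size_mi=2000, size_kb=128)
    print(chosen.name, finish_time(chosen, 2000, 128))
```

In this toy setting the remote node wins despite its transfer delay because the origin node still has queued work; with different queue lengths the choice can flip.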
On the other hand, if a precedence level contains more than one subtask, we use a genetic algorithm to find a good schedule. As shown in Figure 4, five subtasks are present at precedence level 2. A schedule for these 5 subtasks is represented by a DNA. A fixed number of such DNAs are randomly generated for the initial generation. Subsequent generations are produced by applying genetic operators to the shortlisted DNAs of the previous generation. Crossover and mutation are the genetic operators used in this paper. The roulette wheel selection technique is used to shortlist DNAs from all DNAs present in a generation. Across all generations, we select the DNA that finishes the subtasks fastest and schedule according to it. Values are stored accordingly in the list. We then apply the genetic algorithm to the subtasks at level 3 and find a good schedule. Scheduling is performed according to this schedule and the list is updated. Level 4, like level 1, contains a single subtask, 9. Hence, 9 is scheduled on the node that gives the fastest result. In this way all subtasks of DAG based task t are scheduled.
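The per-level genetic search described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the fitness function (the busiest node's execution time within one level), single-point crossover, the parameter defaults, and all function names are hypothetical, and communication cost and existing workload are ignored for brevity.

```python
import random


def makespan(dna, sizes_mi, speeds_mips):
    """Finish time of one level: the busiest node's total execution time."""
    load = [0.0] * len(speeds_mips)
    for subtask, node in enumerate(dna):
        load[node] += sizes_mi[subtask] / speeds_mips[node]
    return max(load)


def roulette_select(population, fitness, k, rng):
    # A shorter makespan gets a proportionally larger slice of the wheel.
    weights = [1.0 / f for f in fitness]
    return rng.choices(population, weights=weights, k=k)


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(dna, n_nodes, rate, rng):
    return [rng.randrange(n_nodes) if rng.random() < rate else g for g in dna]


def schedule_level(sizes_mi, speeds_mips, pop_size=20, generations=30,
                   keep=10, rate=0.1, seed=0):
    """Search for a good node assignment for the subtasks of one level."""
    if len(sizes_mi) < 2:
        raise ValueError("use the direct minimum-finish-time rule for one subtask")
    rng = random.Random(seed)
    n_sub, n_nodes = len(sizes_mi), len(speeds_mips)

    def cost(d):
        return makespan(d, sizes_mi, speeds_mips)

    population = [[rng.randrange(n_nodes) for _ in range(n_sub)]
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):
        fitness = [cost(d) for d in population]
        parents = roulette_select(population, fitness, keep, rng)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in crossover(a, b, rng):
                children.append(mutate(child, n_nodes, rate, rng))
        population = (parents + children)[:pop_size]
        candidate = min(population, key=cost)
        if cost(candidate) < cost(best):
            best = candidate  # keep the best DNA seen in any generation
    return best, cost(best)


if __name__ == "__main__":
    dna, t = schedule_level([400, 300, 200, 100, 500], [100, 200, 100, 50])
    print(dna, t)
```

Each gene is the node index assigned to one subtask of the level, so the DNA stays short (five genes for level 2 of the example) compared with a DNA covering the whole DAG.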
The algorithm for PLBGSA is shown in Figure 5. In this algorithm, we first arrange all subtasks in a priority based task sequence.
When we schedule a subtask, the workload of the node on which it is scheduled also changes. The new workload is defined in terms of MEF, the most efficient factor, which is the greatest of three values. The first value is the old workload on the P2P grid node, $CF_{old}$. The second is the transport time $Trt_{ST}$ to send a task from one node to another. The third is $Rbk_{l-1}$, the time at which all subtasks at the previous level have finished and returned their results. The origin node, where task t is generated, uses these values to make a scheduling decision. However, entries in the list are changed only when we have found the best schedule using the genetic algorithm. We shortlist m schedules from the initial population by applying the roulette wheel selection method (RWS). In this way, the first m schedules of the second generation are obtained from their predecessors. We choose two schedules, $Sd_{1st}$ and $Sd_{2nd}$, from these m schedules and apply the genetic operators CxGO and MtGO, which yields two new schedules. In this manner the remaining schedules of the second generation are obtained. MtGO is a mutation operator that is applied more often if stagnation in the schedules occurs. We shortlist $Sd_{smallest}$ from $Rng_{Sd}$, schedule according to $Sd_{smallest}$, and update the list accordingly. Similarly, we calculate $Sd_{smallest}$ for all levels and update the list for every level. Finally, at the level equal to Max Level, $Sd_{smallest}$ represents the finish time for task t. The schedule obtained using this algorithm is better than that obtained by the algorithm presented in [14]. Our proposed algorithm is depicted in Table 1.
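The paragraph above states that the mutation operator is applied more often when the schedules stagnate, without giving the exact rule. The following Python sketch shows one possible way to do this; the stagnation test, the window length, and the rate values are all assumptions for illustration, and `mutation_rate` is a hypothetical helper.

```python
def mutation_rate(best_history, base=0.05, boost=2.0, window=3, cap=0.5):
    """Return a mutation probability that rises when the search stagnates.

    best_history holds the best (smallest) finish time found in each
    generation so far. Stagnation is assumed to mean that the last `window`
    generations did not improve on the best value seen before them.
    """
    if len(best_history) <= window:
        return base
    recent_best = min(best_history[-window:])
    earlier_best = min(best_history[:-window])
    stagnant = recent_best >= earlier_best
    return min(cap, base * boost) if stagnant else base


if __name__ == "__main__":
    print(mutation_rate([10, 9, 8, 7, 6]))   # still improving
    print(mutation_rate([10, 9, 9, 9, 9]))   # no improvement in the last 3
```

A rule of this kind could be called once per generation, with its result passed to the mutation operator as the per-gene mutation probability.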
In Table 1, the cost of a statement is the number of steps it takes to execute, and the statement is executed n times. The running time of the algorithm is the sum of the running times of all statements executed. In the worst case, this sum is a polynomial of degree three in n whose constant coefficients depend on the statement costs; the running time is thus a cubic function of n, that is, of order $n^3$.
The concept of fault tolerance is also introduced in our algorithm. The fault tolerance [24] mechanism used in this approach is a modified version of the fault tolerance [25,26] mechanism of our previous work [22]. Two components, a notification generator and a notification receiver, are present on all P2P grid nodes in order to handle failure situations. These components either send or receive three types of messages, including the heartbeat message and the task completion message.
Subtasks at level 2 are  1 . Hence the total number of ways of doing scheduling is  1 (here  1 =   1 ). Now the probability of finding best schedules in randomly selected  schedules is as follows: Pr.
Level 3 contains  2 subtasks; hence, the total number of ways of doing scheduling is  2 (here  2 =   2 ).Consider Pr.
Similarly, at level , Pr.
Hence, the probability of getting a better schedule using our approach is higher. Moreover, because we store the results after scheduling all subtasks at each precedence level, we can incorporate fault tolerance in our approach. If we schedule using [14] and some node fails, we have to reschedule all subtasks at all levels, as shown in Figure 17. PLBGSA, however, applies the genetic algorithm to the subtasks of each level separately; hence, we reschedule only the subtasks at the level where the node failure happened and the subtasks beyond that level. In this way we obtain results much faster, as shown in Figure 18.
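Two points of the argument above can be checked numerically with a short Python sketch. It counts the possible assignments of subtasks to nodes (each subtask can go to any node, so a group of s subtasks on N nodes has N to the power s assignments) and lists the levels that must be rescheduled after a failure. The level sizes 1, 5, 3, 1 follow the Figure 1(a) example of 10 subtasks on 4 nodes, where the size of level 3 is inferred from the other three; the function names are hypothetical.

```python
def whole_dag_search_space(level_sizes, n_nodes):
    """Assignments to consider when one DNA covers every subtask of the DAG."""
    return n_nodes ** sum(level_sizes)


def per_level_search_space(level_sizes, n_nodes):
    """Total assignments when each precedence level is searched separately."""
    return sum(n_nodes ** s for s in level_sizes)


def levels_to_reschedule(failed_level, max_level):
    """Levels redone after a node failure: the failed level and all later ones."""
    return list(range(failed_level, max_level + 1))


if __name__ == "__main__":
    level_sizes = [1, 5, 3, 1]  # subtasks at levels 1 to 4
    print(whole_dag_search_space(level_sizes, 4))   # 1048576
    print(per_level_search_space(level_sizes, 4))   # 1096
    print(levels_to_reschedule(3, 4))               # [3, 4]
```

With a fixed number of randomly generated schedules, a much smaller space per level means each sample covers a larger fraction of the candidates, which is the intuition behind the higher probability claimed for the level-by-level approach.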

Conclusion and Future Scope of Work
We have applied a genetic algorithm at every precedence level to schedule subtasks on P2P grid nodes. PLBGSA is more efficient than the algorithm proposed by Pop et al. [14]. The probability of finding a good schedule is higher than in previous works. P2P grid resources are utilized more uniformly with PLBGSA. Further, a fault detection and recovery mechanism is proposed in PLBGSA, and this fault tolerance mechanism yields good results. We obtain a near optimal schedule with a reduced number of generations in PLBGSA.
In future work, other optimization heuristics can be applied using precedence level based scheduling for P2P grid. We will also incorporate a task duplication technique before applying genetic scheduling at each precedence level in our future algorithm.
Figure 2: DNA representing scheduling for subtasks of task t and use of genetic algorithm to obtain good schedule.

Figure 9: Ten runs of old GA and PLBGSA, with  generations.