This paper addresses the problem of task allocation in real-time distributed systems with the goal of maximizing system reliability, a problem that has been shown to be NP-hard. We formulate the problem with the deadline constraint taken into account and then propose an algorithm called chaotic adaptive simulated annealing (XASA) to solve it. XASA first performs chaotic optimization, which takes a chaotic walk through the solution space and generates several local minima; it then improves the SA algorithm via several adaptive schemes and continues to search for the optimum based on the results of the chaotic optimization. The effectiveness of XASA is evaluated by comparison with the traditional SA algorithm and an improved SA algorithm. The results show that XASA achieves a satisfactory speedup without loss of solution quality.
1. Introduction
In many application domains (e.g., astronomy, genetic engineering, and military systems), increased complexity and scale have led to the need for more powerful computation resources; distributed systems (DS) have emerged as a powerful platform for addressing this need, as an alternative to traditional high-performance computing systems. A DS consists of a set of cooperating nodes (either homogeneous or heterogeneous) communicating over communication links. An application running in a DS can be divided into a number of tasks and executed concurrently on different nodes in the system; deciding this division is referred to as the task allocation problem (TAP). To improve the performance of DS, several studies have been devoted to the TAP, mainly concerning performance measures such as minimizing the execution and communication cost [1–3], minimizing the application turnaround time [4, 5], and achieving better fault tolerance [6, 7].
On the other hand, the real-time property is required in many DS (e.g., military systems). In such systems, the application should complete its work before its deadline, in addition to guaranteeing logical correctness. Meanwhile, the complexity of DS increases the potential for system failure, because in such large and complex systems, node and communication link failures are inevitable. Hence, reliability is a crucial requirement for DS, especially for real-time DS (RTDS).
Distributed system reliability (DSR) has been defined by Kumar et al. [8] as the probability of successful completion of distributed programs, which requires that all the allocated processors and involved communication links remain operational during the execution lifetime. Redundancy and diversity are the traditional techniques to attain better reliability [6, 7, 9–14]. They employ hardware and/or software redundancy and hence impose extra cost. Moreover, in many situations the system configuration is fixed, and there is no freedom to introduce redundancy. Task allocation is an alternative way to improve DS reliability, and it requires no additional resources, neither hardware nor software.
The TAP with the goal of maximizing the DSR is a typical combinatorial optimization problem; unfortunately, it has been shown to be NP-hard in the strong sense, and the computational complexity of optimal algorithms (e.g., the branch-and-bound technique) is exponential in nature. We cannot obtain optimal results in reasonable time for large-scale problems. Hence, several heuristic and metaheuristic algorithms have been applied, such as the genetic algorithm (GA) [7, 14, 15], simulated annealing (SA) [16], particle swarm optimization (PSO) [17], honeybee mating optimization (HMO) [18], cat swarm optimization (CSO) [19], and the iterated greedy algorithm (IG) [20]. These algorithms may obtain suboptimal results, but they sharply reduce the calculation time.
A common thread among these algorithms is that they all start from a randomly chosen initial solution (or a set of solutions) in the solution space and then repeat an exploration-decision procedure until convergence to a good enough (possibly suboptimal) solution [21]. Here, exploration means obtaining new solutions based on the current solution; in the SA algorithm, for instance, new solutions are chosen from the neighbors of the current solution. A decision is made after exploration: the new solution is either accepted or rejected according to some rule; if accepted, it becomes the new current solution, otherwise it is dropped and a new exploration starts. Over a series of exploration-decision procedures, the quality of the obtained solutions improves until the termination condition, often defined by some convergence criterion, is met. Hence, the convergence speed of such algorithms is affected by the choice of the initial solution and by the rules applied in the exploration-decision procedure.
Simulated annealing is one of the earliest and most widely used optimization approaches; it introduces a random factor into the search process and models the annealing of solids as a Metropolis process [22]. The SA algorithm accepts a "worse" solution with a probability related to the current temperature, so it can escape from local optima and find the globally optimal solution. The convergence speed of the SA algorithm depends on its initial solution and cooling schedule.
Chaos is a bounded unstable dynamic behavior that exhibits sensitive dependence on initial conditions and includes infinitely many unstable periodic motions [23]. Although it appears stochastic, it occurs in a deterministic nonlinear system under deterministic conditions. In recent years, the chaotic optimization algorithm (COA) has aroused intense interest due to its ergodicity, easy implementation, and ability to escape local optima [24]. However, COA lacks heuristic guidance and usually needs a large number of iterations to reach the global optimum, which means its convergence speed is slow.
In this paper, we propose a combined algorithm called XASA (chaotic adaptive simulated annealing), where X alludes to the Greek spelling of chaos (χαoς), to solve the TAP in RTDS with the goal of maximizing the DSR. We take into account several kinds of constraints, including deadlines. XASA starts with COA and obtains several optima via its ergodicity; the SA algorithm then operates on these optima in relatively smaller ranges to find the best solution. This method overcomes the slow convergence of both the SA algorithm and COA without loss of solution quality.
The rest of this paper is organized as follows. Section 2 presents the related work in the application of SA algorithm and Chaotic Optimization to the TAP, and the contributions of this paper are stated. Section 3 describes the formulation of the TAP with the goal of maximizing reliability, and the solution approach is presented in Section 4 with some details of implementation. Section 5 discusses the performance evaluation of proposed algorithm according to several experiments and analyses. And Section 6 concludes this work.
2. Related Works
The idea of simulated annealing was proposed by Metropolis et al. [22] in 1953 and was applied to optimization problems by Kirkpatrick et al. [25] in 1983. To our knowledge, the first application of the SA algorithm to the TAP was made by van Laarhoven et al. [26] in 1992, who applied it to a job shop scheduling problem. Since then, several works [27–29] have compared the SA algorithm with other optimization algorithms on problems related to the TAP. Attiya and Hamam applied the SA algorithm to the TAP and compared it with the branch-and-bound technique in terms of maximizing DS reliability in 2006 [16]; extending this work, Faragardi et al. proposed improved SA algorithms for this problem [30, 31], using a hybrid of SA and tabu search with a nonmonotonic cooling schedule in 2012 and adding a systematic neighborhood search and memory to the SA algorithm in 2013.
COA is often combined with other optimization algorithms to overcome its drawbacks and take advantage of its beneficial properties such as ergodicity, for example, chaotic simulated annealing (CSA) [32], chaotic particle swarm optimization (CPSO) [23], and the chaotic improved imperialist competitive algorithm (CICA) [33]. CSA was proposed by Chen and Aihara [32] to solve combinatorial optimization problems in 1995, using Hopfield neural networks. Several other papers have since expanded this work [34–37]. Mingjun and Huanwen proposed another version of CSA to solve the optimization of continuous functions [38]. Most of these works focus on continuous function optimization, while the method proposed by Chen and Aihara is actually based on an artificial neural network, not the SA algorithm. Ferens and Cook [39] adapted the CSA developed by Mingjun and Huanwen to the TAP in 2013, where chaos was infused into a solution by setting the number of perturbations according to the value of a chaotic variable. However, this method neither makes full use of the beneficial properties of COA nor improves the convergence speed.
The present paper differs from the above-mentioned research in that it combines the advantages of both algorithms. First, thanks to the ergodicity property, COA can capture the skeleton of the solution space by walking chaotically through it, thereby preventing the result from falling into local optima. Second, based on the results of COA, we can easily determine the cooling schedule of the SA algorithm, which is very important to its performance but hard to tune. Last, several adaptive schemes are used in the SA algorithm; all of these schemes, together with the preliminary COA results, increase the convergence speed significantly.
3. Problem Statement
We consider a heterogeneous DS that runs a real-time application. Each node of DS may have different processing speed, memory size, and failure rate. Moreover, the communication links may also have different bandwidth and failure rate. The state of nodes and communication links is either operational or failed, and the failure events are statistically independent. We also assume that the failure rates of nodes and communication links are constant.
There are M tasks to be executed on a DS with N nodes, with M>N in most cases. Tasks executed on a node require resources, including processing load and memory space. Additionally, two tasks executed on different nodes require communication bandwidth to communicate with each other. Figure 1 illustrates a simple case consisting of 5 tasks and 3 nodes. An application running on a DS can be represented by a task interaction graph G(V,E), as shown in Figure 1(b), where V represents the set of tasks and E represents the interactions between tasks. Each task \(i \in V\) is associated with two properties: \(\{q_{i1}, q_{i2}, \ldots, q_{iN}\}\) represents the execution time of the task at each node, and \([s_i; m_i]\) represents the processing load and memory requirements of the task. The label of each edge \((i,j) \in E\) represents the communication requirements between tasks.
Figure 1: (a) A distributed system; (b) a task graph.
The purpose of this work is to find a task assignment in which all M tasks are assigned to the N nodes (note that each task must be assigned to one and only one node, while one node can execute several tasks or none), such that the overall system reliability is maximized, the deadline and other requirements of the tasks are satisfied, and the capacities of the system resources are not violated.
3.1. Notations
The notations that are used to formulate the problem are listed in Abbreviation Section.
3.2. System Reliability
Reliability of a distributed system may be defined as the probability that the system can run the entire application successfully [7, 14, 40]. Due to the independence of node and link failures, the system reliability is the product of the component reliabilities, that is, the product of the reliabilities of all nodes and communication links.
We assume that all components of a node except the processor are perfect, which means the reliability of the node equals the reliability of its processor. The reliability of processor \(p_k\) at time \(t\) is \(e^{-\lambda_k t}\) [16]; under a task assignment \(X\), the total execution time of the tasks assigned to \(p_k\) is \(\sum_{i=1}^{M} x_{ik} e_{ik}\), so the reliability of node \(p_k\) is
(1) \(R_k(X) = e^{-\lambda_k \sum_{i=1}^{M} x_{ik} e_{ik}}.\)
Similarly, the reliability of the communication link \(l_{kb}\) at time \(t\) is \(e^{-\varphi_{kb} t}\); under a task assignment \(X\), the total communication time via \(l_{kb}\) is \(\sum_{i=1}^{M} \sum_{j \neq i} x_{ik} x_{jb} (c_{ij}/\omega_{kb})\), so the reliability of the communication link \(l_{kb}\) is
(2) \(R_{kb}(X) = e^{-\varphi_{kb} \sum_{i=1}^{M} \sum_{j \neq i} x_{ik} x_{jb} (c_{ij}/\omega_{kb})}.\)
Hence, we obtain the reliability of the system:
(3) \(R(X) = \prod_{k=1}^{N} R_k(X) \prod_{k=1}^{N-1} \prod_{b=k+1}^{N} R_{kb}(X) = e^{-Y(X)},\)
where
(4) \(Y(X) = \sum_{k=1}^{N} \sum_{i=1}^{M} \lambda_k x_{ik} e_{ik} + \sum_{k=1}^{N-1} \sum_{b=k+1}^{N} \sum_{i=1}^{M} \sum_{j \neq i} \varphi_{kb} x_{ik} x_{jb} \left(\frac{c_{ij}}{\omega_{kb}}\right).\)
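To make the reliability model concrete, the following sketch computes \(Y(X)\) and \(R(X) = e^{-Y(X)}\) directly from equations (1)–(4). It is a minimal Python illustration (the paper's experiments were done in Matlab); the array names are our own choices, not the paper's.

```python
import numpy as np

def system_reliability(X, e, lam, phi, c, omega):
    """Compute Y(X) and R(X) = exp(-Y(X)) for an M-task, N-node assignment.

    X     : (M, N) 0/1 assignment matrix, one 1 per row
    e     : (M, N) execution times e[i, k] of task i on node k
    lam   : (N,)   node failure rates lambda_k
    phi   : (N, N) link failure rates phi[k, b] (entries with k < b used)
    c     : (M, M) communication volumes c[i, j]
    omega : (N, N) link bandwidths omega[k, b]
    """
    M, N = X.shape
    # Node term: sum_k lambda_k * sum_i x_ik * e_ik
    Y = sum(lam[k] * np.dot(X[:, k], e[:, k]) for k in range(N))
    # Link term: sum_{k<b} phi_kb * sum_{i != j} x_ik * x_jb * c_ij / omega_kb
    for k in range(N - 1):
        for b in range(k + 1, N):
            comm = sum(c[i, j] * X[i, k] * X[j, b]
                       for i in range(M) for j in range(M) if j != i)
            Y += phi[k, b] * comm / omega[k, b]
    return Y, np.exp(-Y)
```

For example, with two tasks on two nodes the node term and the single link term add up exactly as in (4).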
3.3. Constraints
In order to achieve a satisfactory allocation, several basic constraints of the TAP in RTDS should be met. Traditionally, allocation constraints are devoted to not violating the availability of system resources, such as memory capacity. For the real-time property, the deadline requirement should be met as well.
(i) Memory Constraint. The total amount of memory requirements of tasks assigned to a node should not exceed the capacity of the node. That is,
(5) \(\sum_{i=1}^{M} m_i x_{ik} \le M_k, \quad \forall k \in [1, N].\)
(ii) Computation Resource Constraints. The total amount of computation resource requirements of tasks assigned to a node should not exceed the capacity of the node. That is,
(6) \(\sum_{i=1}^{M} s_i x_{ik} \le S_k, \quad \forall k \in [1, N].\)
(iii) Communication Resource Constraints. The total amount of communication resource requirements of tasks via a communication link should not exceed the capacity of the link. That is,
(7) \(\sum_{i=1}^{M} \sum_{j \neq i} c_{ij} x_{ik} x_{jb} \le C_{kb}, \quad 1 \le k < b \le N.\)
(iv) Deadline Constraints. All tasks should complete execution before their deadlines. Since tasks have no priorities, the tasks assigned to a node may execute in any order. We must account for the worst case, that is, assume that each task is executed last on its node. Hence, the deadline constraint is
(8) \(\sum_{k=1}^{N} x_{ik} \sum_{j=1}^{M} e_{jk} x_{jk} \le d_i, \quad \forall i \in [1, M].\)
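Constraints (5)–(8) can be checked directly for a candidate assignment. The following is a hedged Python sketch (argument names are illustrative assumptions); per the worst-case reasoning above, a task's completion time is taken as the total execution time of all tasks on its node.

```python
import numpy as np

def constraints_satisfied(X, m, Mcap, s, Scap, c, Ccap, e, d):
    """Check constraints (5)-(8) for assignment X (M tasks x N nodes).

    m, s : per-task memory / processing-load requirements, shape (M,)
    Mcap, Scap : per-node memory / processing capacities, shape (N,)
    c    : (M, M) inter-task communication volumes
    Ccap : (N, N) link load capacities (entries with k < b used)
    e    : (M, N) execution times; d : (M,) task deadlines
    """
    M, N = X.shape
    # (5) memory and (6) processing load per node
    if np.any(m @ X > Mcap) or np.any(s @ X > Scap):
        return False
    # (7) communication load per link (k < b)
    for k in range(N - 1):
        for b in range(k + 1, N):
            load = sum(c[i, j] * X[i, k] * X[j, b]
                       for i in range(M) for j in range(M) if j != i)
            if load > Ccap[k, b]:
                return False
    # (8) worst case: each task may run last, so the total execution
    # time on its node must not exceed its deadline
    node_time = np.einsum('ik,ik->k', X, e)  # sum_j e_jk x_jk per node
    finish = X @ node_time                   # node total for each task
    return bool(np.all(finish <= d))
```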
3.4. Problem Formulation
According to the above discussion, maximizing the reliability of an RTDS is equivalent to minimizing the objective function \(Y(X)\) under all of the constraints mentioned before. Hence, we can formulate the TAP as the following combinatorial optimization problem:
(9) \(\min Y(X)\) subject to (5)–(8).
4. Task Allocation Solution
This section first briefly describes the basic SA algorithm, with a discussion of the cooling schedule, which has a significant effect on its convergence speed; it then presents XASA and explains in detail how it is applied to the TAP as formulated in this paper.
4.1. Basic Simulated Annealing Algorithm
The SA algorithm starts from a randomly chosen initial solution and generates a series of Markov chains as the control parameter (i.e., the temperature) descends. In these Markov chains, a new solution is chosen by making a small random perturbation of the current solution; if the new solution is better, it is kept, and if it is worse, it is kept with a probability related to the current temperature and the difference between the new and the previous solution. Through a series of such iterations, an optimal solution is found. The SA algorithm applied to the TAP is listed as follows.
Step 1.
Choose an initial task arrangement (Xs) at random.
Step 2.
Calculate the cost (fs) of Xs.
Step 3.
Set the initial solution as the optimal f*←fs, X*←Xs.
Step 4.
Initialize the temperature (T=T0).
Step 5.
Select a neighbor (Xn) of Xs.
Step 6.
Calculate the cost (fn) of Xn.
Step 7.
If fn≤fs, then Xs←Xn and fs←fn, otherwise go to Step 9.
Step 8.
If fn<f*, then X*←Xn and f*←fn, go to Step 10.
Step 9.
If random(0,1) < e^{-(fn-fs)/T}, then Xs←Xn and fs←fn.
Step 10.
Repeat Step 5 to Step 9 for a given number of iterations.
Step 11.
Reduce the temperature via some cooling function T=f(T).
Step 12.
If the termination condition is satisfied (e.g., T≤Tf), then go to Step 13; otherwise go to Step 5.
Step 13.
Output the solution.
Note that the neighborhood defines the procedure to move from a solution point to another solution point [16]. In this paper, a neighbor is obtained by randomly choosing a task i among M tasks and replacing its current assigned node with another randomly selected one.
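The steps above can be sketched as follows: a minimal Python version of Steps 1–13 with the neighbor move just described (the study itself used Matlab; the parameter defaults here are placeholders, not the paper's settings).

```python
import math
import random

def simulated_annealing(f, M, N, T0, alpha=0.95, inner_iters=100, Tf=1e-3):
    """Basic SA for the TAP following Steps 1-13 above.

    f : cost (energy) function over an assignment vector A, where
        A[i] in {0, ..., N-1} is the node task i runs on.
    """
    A = [random.randrange(N) for _ in range(M)]          # Step 1
    fs = f(A)                                            # Step 2
    best, fbest = A[:], fs                               # Step 3
    T = T0                                               # Step 4
    while T > Tf:                                        # Step 12
        for _ in range(inner_iters):                     # Step 10
            i = random.randrange(M)                      # Step 5: move one
            An = A[:]                                    # task to another
            An[i] = random.choice([k for k in range(N) if k != A[i]])
            fn = f(An)                                   # Step 6
            # Steps 7 and 9: accept if better, or with Metropolis probability
            if fn <= fs or random.random() < math.exp(-(fn - fs) / T):
                A, fs = An, fn
                if fs < fbest:                           # Step 8
                    best, fbest = A[:], fs
        T *= alpha                                       # Step 11
    return best, fbest                                   # Step 13
```

For instance, minimizing the toy cost `sum(A)` drives every task to node 0.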
4.2. Cooling Schedule
It has been shown that the SA algorithm converges to the global optimum with probability 1 [41], which requires a sufficiently slow cooling schedule (i.e., a sufficiently high initial temperature, a sufficiently low final temperature, and a sufficiently slow cooling speed). However, such a slow cooling schedule may lead to an unacceptably long convergence time, which can be exponential.
The cooling schedule is the set of parameters that controls the procedure of the SA algorithm so that it converges asymptotically to a suboptimal solution in reasonable time. It is made up of the following parameters.
(i) The Initial Value of the Control Parameter (i.e., Temperature) T0. The initial temperature is one of the most important parameters of the SA algorithm. If the initial temperature is very high, convergence takes a very long time; on the other hand, poor solutions are obtained if it is too low. A basic principle for choosing the initial temperature is that the acceptance probability of worse solutions should be close to 1, which means that neighboring solutions should be exchanged almost freely at first. Hence, we can determine the initial temperature \(T_0\) from an initial acceptance probability \(P_0\) of worse solutions; in this paper, \(P_0\) is set to 0.9. From \(P_0 = e^{-\Delta/T_0}\), where \(\Delta\) is the difference between neighboring solutions, we get \(T_0 = -\Delta/\ln P_0\). We can use \(\bar{\Delta} = f_{\max} - f_{\min}\) as an estimate of \(\Delta\), where \(f_{\max}\) and \(f_{\min}\) are the maximum and minimum of the energy function among \(K\) randomly chosen solutions. So the initial temperature \(T_0\) is determined by the following formula:
(10) \(T_0 = \frac{f_{\min} - f_{\max}}{\ln P_0}.\)
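A small sketch of formula (10), assuming the K random solutions are produced by a caller-supplied generator (the function and argument names are illustrative):

```python
import math

def initial_temperature(f, random_solution, K=10, P0=0.9):
    """Estimate T0 from K random solutions so that worse moves are
    initially accepted with probability about P0 (formula (10))."""
    costs = [f(random_solution()) for _ in range(K)]
    # T0 = (f_min - f_max) / ln(P0); since ln(P0) < 0, T0 > 0
    return (min(costs) - max(costs)) / math.log(P0)
```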
(ii) The Cooling Function F(T). The cooling function defines how the temperature \(T\) decreases. A commonly used type is the exponential descent function \(F(T) = \alpha T\), where \(\alpha\) is a constant with \(\alpha < 1\). The cooling speed thus depends on the parameter \(\alpha\), which we call the cooling factor. A large value of \(\alpha\) represents slow cooling, which yields good solutions but is expensive in time. \(\alpha\) is set slightly less than 1 in most of the reported literature, and it is chosen to be 0.95 in this paper.
(iii) The Final Value of the Control Parameter Tf (in Other Words, the Termination Condition). The criterion for termination can be either a final temperature or a steady state of the system. The former controls the total calculation time but not the solution quality: a given final temperature may introduce redundant calculation for small-scale problems yet yield poor solutions for large-scale problems. The latter, on the other hand, can take both time and quality into account. In this paper, the SA algorithm terminates if the solution remains unchanged (neither upgraded nor downgraded) for a given number of iterations, chosen to be M×N. Furthermore, the final solution must also be valid, which will be discussed later.
(iv) The Length of the Markov Chain Lk. This is the number of inner-loop repetitions. It is chosen to be M×(N−1), the size of the solution neighborhood, because each task can be reassigned to any of the other N−1 nodes.
The cooling schedule has a significant effect on the results of the algorithm, especially on the convergence speed. Besides, the initial solution of SA algorithm can affect the convergence speed as well.
4.3. Chaotic Optimization Algorithm
The chaotic variables are produced by the following well-known one-dimensional logistic map:
(11) \(z_{k+1} = \mu z_k (1 - z_k), \quad k = 0, 1, \ldots,\)
where \(\mu = 4\) and \(z_k \in [0, 1]\). The logistic map has special characteristics such as ergodicity, stochasticity, and sensitive dependence on initial conditions. The chaotic optimization algorithm applied in this paper is listed as follows.
Step 1.
Initialize the chaotic vector Z at random; note that the value of a chaotic variable cannot be 0, 0.25, 0.5, 0.75, or 1.0, which are the fixed points of the logistic map, and all chaotic variables must differ from each other.
Step 2.
Generate the solution vector A via Z, then generate the task assignment X via A and calculate the cost function f.
Step 3.
Set the initial solution as the optimal f*=f, Z*=Z.
Step 4.
Calculate a new chaotic vector Z via formula (11).
Step 5.
Generate X as Step 2, calculate the cost function f.
Step 6.
If f<f*, then f*=f, Z*=Z.
Step 7.
Repeat Steps 4, 5, and 6, until f* remains unchanged for a given number of iterations.
Note that the iteration number in Step 7 is chosen to be the same as in the SA algorithm, as discussed before, while the validity of the solution is not required in COA, since COA is not heuristic and it would take a very long time to converge if there were few valid solutions in large-scale cases.
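Steps 1–7 can be sketched as follows. This Python version assumes a 0-based decoding of the chaotic variables into node indices, which is a simplification of the mapping \(A = [Z \times (N-1) + 1]\) given later in Section 4.4:

```python
import random

def chaotic_search(f, M, N, max_unchanged):
    """Chaotic optimization via the logistic map (11), following Steps 1-7.

    Each task i carries a chaotic variable z_i in (0, 1); here a z value
    is decoded to the 0-based node index floor(z * N) (an assumption)."""
    forbidden = {0.0, 0.25, 0.5, 0.75, 1.0}              # fixed points
    Z = []
    while len(Z) < M:                                    # Step 1
        z = random.random()
        if z not in forbidden and z not in Z:
            Z.append(z)
    decode = lambda Z: [min(int(z * N), N - 1) for z in Z]
    best_f, best_Z = f(decode(Z)), Z[:]                  # Steps 2-3
    unchanged = 0
    while unchanged < max_unchanged:                     # Step 7
        Z = [4.0 * z * (1.0 - z) for z in Z]             # Step 4: logistic map
        fn = f(decode(Z))                                # Step 5
        if fn < best_f:                                  # Step 6
            best_f, best_Z = fn, Z[:]
            unchanged = 0
        else:
            unchanged += 1
    return decode(best_Z), best_f
```

The chaotic walk visits the discrete assignments ergodically, so with a small instance it reliably finds the minimum of a simple cost.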
4.4. Simulated Annealing Algorithm Combined with Chaos
The basic idea of our proposed algorithm is simulated annealing combined with chaotic search and adding some adaptive schemes to the cooling schedule so that we can improve the convergence speed without loss of solution quality. The flowchart of XASA is shown in Figure 2.
Figure 2: The flowchart of XASA.
There are four schemes to speed up the convergence of the SA algorithm.
First, we apply COA to find K optimized solutions; this is a preliminary search of the solution space, and the solution distribution can be found thanks to the ergodicity of the chaotic system. Hence, the SA algorithm can search for the optimal solution in a relatively smaller range, starting from an optimized initial solution.
Second, the initial temperature \(T_0\) can also be smaller, based on the results of COA, because we can replace \(f_{\max}\) and \(f_{\min}\) with those of the K optimized solutions.
Third, the length of the Markov chain \(L_k\) is constant in the SA algorithm, while it is adaptive in XASA. The algorithm jumps out of the inner loop if the number of rejections of new solutions exceeds a threshold \(\theta\). The threshold is updated at each temperature by \(\theta = \min(L_k \times \delta_1, \theta \times \delta_2)\). Because \(P_0\) is set to 0.9, we set the initial value of \(\theta\) to \(\lceil L_k \times 0.05 \rceil\). \(\delta_1\) determines the maximum threshold, with \(\delta_1 < 1\); in this paper it is set to 0.6. \(\delta_2\) determines the growth speed of \(\theta\); this is similar to the cooling factor \(\alpha\), so it is set to 1.05.
Fourth, the cooling factor \(\alpha\) is adaptive in XASA as well: the more solutions are accepted (both better and worse ones) at the current temperature, the smaller the cooling factor, and vice versa. The rationale is that the high temperature at the beginning of the algorithm generates numerous solution acceptances, so a rapid reduction of temperature can be made, while fewer solutions are accepted as the temperature cools down, so a slower cooling speed should be applied, since a careful search is then needed. Hence, the cooling factor of XASA is \(\alpha' = \alpha \times e^{-\kappa/(\kappa + 4n)}\), where \(\kappa\) is the number of acceptances in the last inner loop and \(n\) is the actual length of the Markov chain. Note that \(\kappa \in [0, n]\), so \(e^{-\kappa/(\kappa + 4n)} \in [0.8187, 1]\) and \(\alpha' \in [0.7778, 0.95]\).
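The third and fourth schemes reduce to two small update rules; the following is a direct transcription of the formulas above (parameter defaults follow the paper):

```python
import math

def adaptive_cooling(alpha, kappa, n):
    """Adaptive cooling factor: alpha' = alpha * exp(-kappa / (kappa + 4n)),
    where kappa is the number of acceptances in the last inner loop and
    n is the actual Markov-chain length."""
    return alpha * math.exp(-kappa / (kappa + 4.0 * n))

def next_threshold(theta, Lk, delta1=0.6, delta2=1.05):
    """Early-exit threshold update: theta = min(Lk * delta1, theta * delta2)."""
    return min(Lk * delta1, theta * delta2)
```

With no acceptances the cooling factor stays at alpha; with every move accepted it drops to about 0.82 times alpha, matching the bounds quoted above.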
To implement the algorithm, some details should be presented as follows.
(1) Solution Representation. In this paper, a solution is represented by a vector \(A(M,1)\); each element represents a task, and its value, between 1 and \(N\), denotes the node to which the task is assigned. In order to apply COA, we use a chaotic vector \(Z\) related to \(A(M,1)\) by \(A = [Z \times (N-1) + 1]\). A task assignment \(X\) of the TAP is generated from \(A\) as follows:
(12) \(X(i,k) = \begin{cases} 1, & k = A(i), \\ 0, & k \neq A(i), \end{cases} \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, N.\)
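A sketch of this representation in Python: the rounding convention of \([\,\cdot\,]\) is not spelled out in the paper, so nearest-integer rounding is assumed here.

```python
import numpy as np

def decode(Z, N):
    """Map chaotic vector Z (values in (0, 1)) to node vector A = [Z*(N-1)+1],
    assuming [.] rounds to the nearest integer (1-based node indices)."""
    return np.rint(np.asarray(Z) * (N - 1) + 1).astype(int)

def to_assignment_matrix(A, N):
    """Build X per (12): X[i, k] = 1 iff task i is assigned to node A[i]."""
    M = len(A)
    X = np.zeros((M, N), dtype=int)
    X[np.arange(M), A - 1] = 1   # A uses 1-based node indices
    return X
```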
(2) Energy Function. We integrate the objective function \(Y(X)\) and all constraints into a cost function to fit the SA algorithm framework, and this cost function is used as the energy function.
All constraints are formulated as penalty functions as follows:
(13)
\(E_M = \sum_{k=1}^{N} \max\left(0, \sum_{i=1}^{M} m_i x_{ik} - M_k\right),\)
\(E_S = \sum_{k=1}^{N} \max\left(0, \sum_{i=1}^{M} s_i x_{ik} - S_k\right),\)
\(E_C = \sum_{k=1}^{N-1} \sum_{b=k+1}^{N} \max\left(0, \sum_{i=1}^{M} \sum_{j \neq i} c_{ij} x_{ik} x_{jb} - C_{kb}\right),\)
\(E_D = \sum_{i=1}^{M} \max\left(0, \sum_{k=1}^{N} x_{ik} \sum_{j=1}^{M} e_{jk} x_{jk} - d_i\right).\)
As all constraints are of the same importance, we use a common coefficient \(\gamma\) for all penalty functions. Hence, the energy function is
(14) \(f(X) = Y(X) + \gamma (E_M + E_S + E_C + E_D).\)
The criterion for choosing the penalty coefficient \(\gamma\) is that it should scale the values of the penalty functions to values comparable to those of the objective function \(Y(X)\), so that the algorithm proceeds in the direction of penalty avoidance; hence, a valid solution can be found with high probability. Note also that the validity of a solution can be expressed as \(f(X) = Y(X)\).
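Equations (13) and (14) can be sketched as follows, with \(Y\) taken as the precomputed objective value (the array names are our own; for a valid solution the result equals \(Y\)):

```python
import numpy as np

def energy(X, Y, m, Mcap, s, Scap, c, Ccap, e, d, gamma=1.0):
    """Energy (14): f(X) = Y(X) + gamma * (EM + ES + EC + ED), with the
    penalty terms of (13). Y is the objective value Y(X) computed elsewhere."""
    M, N = X.shape
    EM = np.sum(np.maximum(0.0, m @ X - Mcap))           # memory excess
    ES = np.sum(np.maximum(0.0, s @ X - Scap))           # processing excess
    EC = 0.0                                             # link load excess
    for k in range(N - 1):
        for b in range(k + 1, N):
            load = sum(c[i, j] * X[i, k] * X[j, b]
                       for i in range(M) for j in range(M) if j != i)
            EC += max(0.0, load - Ccap[k, b])
    node_time = np.einsum('ik,ik->k', X, e)              # per-node exec time
    ED = np.sum(np.maximum(0.0, X @ node_time - d))      # deadline excess
    return Y + gamma * (EM + ES + EC + ED)
```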
5. Performance Evaluation
To evaluate the performance of the proposed algorithm, both SA and XASA are coded in Matlab and tested on numerous randomly generated task sets allocated onto an RTDS. Two variations of the SA algorithm are implemented in this paper, the traditional one (SA1) and an improved one (SA2). SA2 applies the last two adaptive schemes of XASA, that is, the adaptive length of Markov chains and the adaptive cooling factor. All other components of SA2 are the same as in SA1, including the initial solution, initial temperature, and termination condition. The computation environment is Matlab 7.11.0, on an Intel Core i7-2600 @ 3.40 GHz with 16 GB main memory under Windows 7.
5.1. Experiment Parameters Settings
All DS parameters follow the former research [16, 17, 40]. The failure rates of processors and communication links are given in the ranges [0.00005–0.00010] and [0.00015–0.00030], respectively. The time to process a task at each processor is given in the range [15–25]. The memory requirement of each task and the node memory capacity are given in the ranges [1–10] and [100–200], respectively. The task processing load and node processing capacity are given in the ranges [1–50] and [100–300]. The volume of data to be communicated between tasks is given in the range [5–10]. The bandwidth and load capacity of communication links are given in the ranges [1–4] and [100–200]. The range of task deadline values is [10–200].
The network topology is a star; N is set to 12 and 25, with M set to [16, 18, 20, 22, 25] and [40, 45, 50, 55, 60] for the two cases, respectively. The penalty coefficient γ is set to 1. The number of randomly chosen solutions at the beginning of the algorithm is set to K=10, and the initial solution of the two SA algorithms is one of these K solutions, chosen at random. Because SA is a stochastic algorithm, each independent run on the same application may yield a different result; we thus run each of the three algorithms 10 times per application and report the average values.
5.2. Experiment Results
Table 1 summarizes the reliability and calculation time of all TAPs solved by XASA and the two SA algorithms. The column R with suffix avg represents the reliability averaged over 10 independent runs, and t represents time in seconds. \(\Delta t_1\) is the acceleration ratio of XASA versus SA1, where \(\Delta t_1 = (t_{SA1} - t_{XASA})/t_{SA1} \times 100\%\); \(\Delta t_2\) and \(\Delta t_3\) are the acceleration ratios of XASA versus SA2 and of SA2 versus SA1, respectively. \(\Delta R\) represents the average deviation in percentage between XASA and the SA algorithms in terms of reliability, where \(\Delta R_i = (R_{SAi} - R_{XASA})/R_{SAi} \times 100\%\), \(i = 1, 2\).
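For clarity, the two metrics can be written as small helper functions; applying them to the first row of Table 1 reproduces the reported percentages:

```python
def acceleration_ratio(t_ref, t_new):
    """Delta_t = (t_ref - t_new) / t_ref * 100%, as used for Delta_t1..t3."""
    return (t_ref - t_new) / t_ref * 100.0

def reliability_deviation(r_ref, r_new):
    """Delta_R = (r_ref - r_new) / r_ref * 100%."""
    return (r_ref - r_new) / r_ref * 100.0
```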
Table 1: Simulation results for the cases of 12 and 25 nodes.

| N | M | XASA Ravg | XASA tavg (s) | SA1 Ravg | SA1 tavg (s) | SA2 Ravg | SA2 tavg (s) | Δt1 | Δt2 | Δt3 | ΔR1 | ΔR2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 16 | 0.9349 | 3.336 | 0.9361 | 39.266 | 0.9353 | 10.409 | 91.50% | 67.95% | 73.49% | 0.13% | 0.04% |
| 12 | 18 | 0.9135 | 3.914 | 0.9152 | 50.188 | 0.9151 | 13.237 | 92.20% | 70.43% | 73.62% | 0.18% | 0.17% |
| 12 | 20 | 0.8948 | 4.014 | 0.8992 | 57.456 | 0.8968 | 14.454 | 93.01% | 72.23% | 74.84% | 0.49% | 0.22% |
| 12 | 22 | 0.8733 | 4.918 | 0.8781 | 55.815 | 0.8777 | 14.474 | 91.19% | 66.02% | 74.07% | 0.55% | 0.50% |
| 12 | 25 | 0.8299 | 5.325 | 0.8373 | 84.469 | 0.8367 | 22.463 | 93.70% | 76.30% | 73.41% | 0.89% | 0.81% |
| 25 | 40 | 0.6000 | 23.187 | 0.6048 | 377.104 | 0.6028 | 96.961 | 93.85% | 76.09% | 74.29% | 0.79% | 0.47% |
| 25 | 45 | 0.5320 | 36.523 | 0.5334 | 484.005 | 0.5329 | 123.667 | 92.45% | 70.47% | 74.45% | 0.25% | 0.16% |
| 25 | 50 | 0.4537 | 36.300 | 0.4553 | 606.927 | 0.4565 | 161.550 | 94.02% | 77.53% | 73.38% | 0.35% | 0.61% |
| 25 | 55 | 0.3732 | 57.156 | 0.3753 | 720.068 | 0.3726 | 195.631 | 92.06% | 70.78% | 72.83% | 0.58% | −0.14% |
| 25 | 60 | 0.2866 | 157.777 | 0.2886 | 795.438 | 0.2860 | 217.155 | 80.16% | 27.34% | 72.70% | 0.71% | −0.19% |
| Average | | | | | | | | 91.42% | 67.51% | 73.71% | 0.49% | 0.27% |
The comparative results in Table 1 show that XASA sharply reduces the convergence time relative to the other two SA algorithms, while the solution quality (i.e., reliability) is only slightly worse (less than 0.01 in value and 1% in percentage). Note that \(\Delta t_3\) is steady in the range of 72%–75% with small variation, which means that the third and fourth adaptive schemes have a consistent effect on the SA algorithm. Furthermore, \(\Delta t_1\) is steady in all cases except the last one (N=25, M=60), with an average value of 92.66% and standard deviation 0.1036. In our preliminary experiments, the value of the function \(Y(X)\) always stays below 2 during the whole algorithm, while the value of the penalty function can reach hundreds. Besides, under the experiment parameters set in this paper, there is no significant difference between nodes, nor between tasks. Hence, there are many local minima in the solution space when the validity of solutions is not considered. Constraints are easy to satisfy when the problem scale is small, so COA can obtain several valid solutions, and the initial temperature for the SA stage of XASA can be relatively small; therefore, convergence is fast. On the other hand, COA can hardly obtain valid solutions in the large-scale case (e.g., N=25, M=60 in this paper); if there are insufficient valid solutions (fewer than K), a large initial temperature is set because of the giant value of the penalty function, which affects the convergence speed significantly. Additionally, constraints are not so easy to satisfy when the problem scale is large, so there are far fewer local minima in the solution space, which slows down the convergence as well. All of these factors weaken the COA-based speedup, so the performance in the last case is not as good.
Table 2 shows some overall statistical characteristics of the three algorithms. The columns Rstd and tstd represent the standard deviations of reliability and calculation time; V1 represents the percentage of valid solutions found by the COA stage of XASA, and V2 represents that of inSA (inSA denotes the SA stage within XASA); the other two V columns have the same meaning. As we can see, XASA is the best in terms of the mean standard deviation of time, and there is no case in which SA1 beats XASA on this criterion. V2 also achieves the best result compared with the other two algorithms. Note that V1 shows poor results in the large cases; this is evidence for our analysis of the large-scale issue presented above.
Table 2: Overall statistical characteristics of the three algorithms.

| N | M | XASA Rstd | XASA tstd | V1 | V2 | SA1 Rstd | SA1 tstd | SA1 V | SA2 Rstd | SA2 tstd | SA2 V |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 16 | 0.0016 | 0.3035 | 91.10% | 100.00% | 0.0019 | 0.8582 | 99.99% | 0.0013 | 0.6208 | 100.00% |
| 12 | 18 | 0.0008 | 0.6041 | 85.57% | 100.00% | 0.0020 | 0.9552 | 98.76% | 0.0018 | 0.8322 | 99.21% |
| 12 | 20 | 0.0062 | 0.4114 | 72.39% | 100.00% | 0.0029 | 1.3870 | 98.12% | 0.0036 | 0.3914 | 97.63% |
| 12 | 22 | 0.0016 | 0.6186 | 54.30% | 100.00% | 0.0015 | 1.3489 | 93.23% | 0.0017 | 0.5886 | 94.07% |
| 12 | 25 | 0.0049 | 0.7613 | 29.27% | 99.82% | 0.0038 | 2.9824 | 85.99% | 0.0031 | 1.2888 | 86.88% |
| 25 | 40 | 0.0042 | 1.5797 | 21.67% | 99.87% | 0.0026 | 8.7429 | 89.62% | 0.0027 | 4.6770 | 91.39% |
| 25 | 45 | 0.0014 | 5.4829 | 9.85% | 99.67% | 0.0024 | 12.3919 | 76.51% | 0.0027 | 3.4313 | 81.95% |
| 25 | 50 | 0.0039 | 5.6073 | 2.06% | 99.30% | 0.0030 | 11.4135 | 71.62% | 0.0025 | 11.0605 | 79.76% |
| 25 | 55 | 0.0028 | 8.1346 | 0.16% | 96.58% | 0.0019 | 8.9093 | 58.05% | 0.0023 | 7.7128 | 67.33% |
| 25 | 60 | 0.0022 | 9.3194 | 0.03% | 89.76% | 0.0016 | 15.5811 | 55.63% | 0.0022 | 8.4473 | 67.03% |
| Average | | 0.0029 | 3.2823 | 36.64% | 98.50% | 0.0024 | 6.4570 | 82.75% | 0.0024 | 3.9051 | 86.53% |
5.3. Time Series Analysis
Figure 3 shows the values of the energy function calculated at each iteration of the algorithms for the case N=12, M=20. Note that the values of invalid solutions are plotted as 0.22, because their real values are too large and would conceal the details of the valid ones.
Energy function at each iteration for N=12, M=20.
As we can see, COA cannot guarantee solution validity; it is completely stochastic, with no heuristic guidance. However, it generates a good start for the next step of XASA, as shown in the results of inSA, where all solutions are valid and convergence is quite fast: inSA begins to reach good enough solutions within 2000 iterations. The other two SA algorithms start from worse conditions and spend many more iterations reaching good enough solutions. Both need thousands of iterations to explore the solution space, generating as many invalid solutions as COA does but at a much higher cost. Thanks to its adaptive schemes, SA2 passes through invalid solutions quickly, as can be seen in Figure 3; hence these schemes are effective.
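The chaotic walk that produces these starting points can be sketched as follows. This is an assumed illustration, not the paper's exact COA: the logistic map is a standard choice of chaotic generator, and the helper `chaotic_walk`, the seed value, and the number K = 5 of retained starts are all hypothetical.

```python
def chaotic_walk(n_tasks, n_nodes, energy, steps=1000, seed=0.123):
    """Illustrative COA sketch (assumed form): a logistic-map sequence
    drives a chaotic walk over task-to-node assignments; the lowest-energy
    assignments seen are kept as starting points for the SA phase."""
    x = seed  # chaotic variable in (0, 1); avoid the map's fixed points
    visited = []
    assign = [0] * n_tasks
    for _ in range(steps):
        for i in range(n_tasks):
            x = 4.0 * x * (1.0 - x)        # logistic map in its chaotic regime
            assign[i] = int(x * n_nodes) % n_nodes  # map chaos to a node index
        visited.append((energy(assign), list(assign)))
    visited.sort(key=lambda pair: pair[0])  # best (lowest-energy) first
    return visited[:5]  # keep K = 5 candidate starts (K is an assumption)
```

Because the walk ignores the constraints, the energy function passed in would have to include the penalty term, which is why, as noted above, COA alone cannot guarantee valid solutions.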
Figure 4 shows the details of the adaptive cooling schedule schemes, computed at each cooling step for the case N=12, M=20. The left three panels show the results of inSA, and the right ones those of SA2. Note that the temperature in Figures 4(c) and 4(f) is normalized. At the beginning of the cooling steps the acceptance rate of new solutions is high in both algorithms; hence the cooling factor is small and the temperature drops rapidly. However, the actual length n of the Markov chains is much smaller in inSA than in SA2, and its cooling factor increases faster as well. Since inSA and SA2 are otherwise identical, these differences are caused by the initial temperature and the initial solution; hence COA genuinely improves the convergence speed.
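The coupling between acceptance rate and cooling factor described above can be sketched as follows. This is an assumed linear form, not the paper's exact adaptive scheme; the helper `adaptive_cooling` and the bounds `alpha_min`/`alpha_max` are illustrative.

```python
def adaptive_cooling(temperature, acceptance_rate,
                     alpha_min=0.5, alpha_max=0.99):
    """Illustrative adaptive cooling sketch (assumed form): when most new
    solutions are accepted (high acceptance rate) the cooling factor stays
    small and the temperature drops rapidly; as the acceptance rate falls,
    the factor approaches alpha_max and cooling slows to refine the search."""
    alpha = alpha_max - (alpha_max - alpha_min) * acceptance_rate
    return alpha * temperature  # next temperature in the schedule
```

Under this sketch, the early cooling steps in Figure 4 (high acceptance, small factor, fast temperature drop) and the later ones (low acceptance, factor near its upper bound) follow directly; an adaptive Markov chain length n would be handled analogously by shortening the inner loop while the acceptance rate is high.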
Cooling schedule characteristics for N=12, M=20: (a) inner-iteration characteristics of inSA; (b) cooling factor of inSA; (c) temperature of inSA; (d) inner-iteration characteristics of SA2; (e) cooling factor of SA2; (f) temperature of SA2.
Figure 5 shows the details of the case N=25, M=60, where the invalid solution values are set to 1.7. The result of COA is quite poor: only three valid solutions are found. Therefore, inSA does not get a good start and has to explore a wide solution space at the beginning, generating many invalid solutions.
Energy function at each iteration for N=25, M=60.
Figure 6 shows the results of the case N=25, M=60 in the same form as Figure 4. The cooling factor and temperature curves differ little between inSA and SA2, and the Markov chain length n no longer brings as much benefit as before. These factors account for the inefficiency in the largest case.
Cooling schedule characteristics for N=25, M=60: (a) inner-iteration characteristics of inSA; (b) cooling factor of inSA; (c) temperature of inSA; (d) inner-iteration characteristics of SA2; (e) cooling factor of SA2; (f) temperature of SA2.
6. Conclusions
In this paper, we considered a heterogeneous DS running a real-time application, with the goal of maximizing system reliability through task allocation. By formulating the reliability and the constraints, we modeled the problem as a combinatorial optimization problem. To solve it with fast convergence, we improved the well-known simulated annealing algorithm based on an analysis of its cooling schedule, which has a significant effect on convergence speed, and proposed the algorithm XASA, which combines SA with a chaotic optimization algorithm and several adaptive schemes. The experimental results show that the proposed algorithm achieves a satisfactory speedup while the solution quality is only slightly worse.
Notations
M:
Number of tasks
N:
Number of nodes
pk:
Node k
ti:
Task i
lkb:
Communication link between pk and pb
xik:
Whether or not ti is on pk
eik:
Execution time of ti on pk
cij:
Communication cost between ti and tj
Ckb:
Communication capacity of lkb
ωkb:
Bandwidth of lkb
λk:
Failure rate of pk
φkb:
Failure rate of lkb
mi:
Memory required by ti
Mk:
Memory capacity of pk
si:
Processing load of ti
Sk:
Processing capacity of pk
di:
Deadline of ti.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors thank the anonymous referees and the editor for their valuable comments and suggestions. This work is supported by the National Natural Science Foundation of China (Grant no. 61374185).