Novel Web Service Selection Model Based on Discrete Group Search

In our earlier work, we presented a formal method for the semiautomatic verification of specifications and for describing web service composition components by using abstract concepts. After verification, the instantiations of components were selected to satisfy the complex service performance constraints. However, selecting an optimal instantiation, which comprises different candidate services for each generic service, from a large number of instantiations is difficult. Therefore, we present a new evolutionary approach based on the discrete group search service (D-GSS) model. In obtaining the optimal multiconstraint instantiation of a complex component, the D-GSS model is competitive with other service selection models in terms of accuracy, efficiency, and ability to solve high-dimensional service composition problems. We propose the cost function and the discrete group search optimizer (D-GSO) algorithm and study the convergence of the D-GSS model through verification and test cases.


Introduction
We have proposed a novel approach for the verification of service composition with contracts [1]. In this approach, properties of the generic specification [2] written in Tecton [3] are verified by the Violet [4] system. After verification, a global optimum must be selected from a large number of instantiations of web service composition components with multiple QoS constraints. Compared with algorithms that evaluate all feasible composition instantiations (e.g., integer programming [5]), evolutionary algorithms (EAs) such as the genetic algorithm [6], which are nature-inspired optimization algorithms, are simple and flexible. Given these characteristics, EAs have been widely used to solve the service selection problem. We propose a novel optimization model named discrete group search service (D-GSS) that mainly employs the group search optimizer (GSO) algorithm [7]. The D-GSS model is competitive with other EAs in terms of accuracy, convergence speed, and ability to solve high-dimensional multimodal problems. Because GSO was designed for continuous optimization problems whereas service selection searches over discrete instantiations, we present an evolutionary algorithm called the discrete group search optimizer (D-GSO) to select the best instantiation, that is, the one with the lowest cost as evaluated by the cost function. The cost function consists of a utility function and a weight for every QoS attribute. We also present verification and simulation results to analyze the convergence of the D-GSS model.
The rest of the paper is organized as follows. Section 2 describes the D-GSS model. Section 3 presents a detailed introduction of the cost function, and Section 4 discusses the D-GSO algorithm and applies the algorithm for the problem on searching for the global optimum from discrete instantiations. Section 5 introduces the convergence analysis of the D-GSS model. Finally, Section 6 concludes the paper.

Discrete Group Search Service Model
In this paper, we present a novel model named D-GSS for the atomic service selection involved in composing complex services with multiple QoS constraints. The population of the D-GSO algorithm is called a group, which searches for unknown optima in the service composition problem, and each individual in the population is called a member.

The Scientific World Journal
In the n-dimensional search space of a composition component, every dimension represents a class of generic service, denoted as S_j (1 ≤ j ≤ n). The i-th member in the space is denoted as X_i = (x_i1, x_i2, ..., x_in), where x_ij ∈ Z is the index of the candidate service chosen from the j-th class. The i-th member at the h-th iteration has a current position X_i^h ∈ Z^n that corresponds to an instantiation of the service composition component.
Each member also has a head angle φ_i^h = (φ_i1^h, ..., φ_i(n-1)^h). The search direction of the i-th member is a unit vector D_i^h(φ_i^h) = (d_i1^h, d_i2^h, ..., d_in^h) ∈ R^n that can be calculated from φ_i^h via a polar-to-Cartesian coordinate transformation [7]. D-GSO is based on GSO [7], which is inspired by animal searching behavior; a group consists of three types of members. Only one producer, assumed to have the lowest cost at each searching bout, exists; the remaining members are scroungers and dispersed members (rangers). At each iteration, the group member that represents the most promising instantiation, that is, the one with the lowest fitness value, is chosen as the producer. It then stops and scans the environment to seek a better instantiation. The scanning field is characterized by the maximum pursuit angle θ_max and the maximum pursuit distance l_max; the apex of the field is the position of the producer. All scroungers join the resource found by the producer according to the area-copying strategy. The rest of the group members disperse from their current positions to search for randomly distributed better instantiations. To handle the bounded search space, the following strategy is employed: when a member moves outside the search space, it is returned into the search space by setting the variables that violated the bounds to their previous values.
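As a concrete illustration, the polar-to-Cartesian transformation borrowed from GSO [7] can be sketched as follows (a Python sketch for illustration only; the function name is ours, and `phi` stands for the head angle described above):

```python
import math

def search_direction(phi):
    """Map a head angle phi (a sequence of n-1 polar angles) to an
    n-dimensional unit search direction via the GSO polar-to-Cartesian
    transformation: the first component is a product of cosines, the
    last is a sine, and each intermediate component mixes one sine
    with the trailing cosines."""
    n = len(phi) + 1
    d = [0.0] * n
    d[0] = math.prod(math.cos(p) for p in phi)
    for j in range(1, n - 1):
        d[j] = math.sin(phi[j - 1]) * math.prod(math.cos(p) for p in phi[j:])
    d[n - 1] = math.sin(phi[-1])
    return d
```

By construction the result has unit length, so multiplying it by a pursuit distance yields a displacement of exactly that length.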
The details of D-GSO (see Figure 1) are introduced as follows.
(i) Suppose that n classes of generic services exist in the n-dimensional composition component; the j-th class (1 ≤ j ≤ n) has m_j candidate services arranged in a fixed sequence.
(ii) Define the concrete cost function of the specific composition component. The cost function is defined by the QoS attributes of the component services as well as their integration relationships, such as sequential, parallel, conditional, or loop. Generate initial members from all instantiations and evaluate the members according to the cost function.
(iii) Choose the member with the lowest cost as the producer. The producer performs producing on the basis of the discrete GSO algorithm.
(iv) Randomly select 80% of the remaining members to perform scrounging.
(v) The remaining members will be dispersed from their current instantiations to perform ranging.
(vi) Evaluate all members according to the cost function. If no optimal instantiation with multiple QoS constraints is found, reallocate the role of every member based on the value of its cost and repeat the process.
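Steps (iii)-(v) above amount to a role allocation performed at every iteration, which might be sketched as follows (Python, illustrative only; `members` and `cost` are hypothetical placeholders for the instantiations and the cost function of Section 3):

```python
import random

def allocate_roles(members, cost):
    """One D-GSS role allocation: the lowest-cost member becomes the
    producer, a random 80% of the remaining members become scroungers,
    and the rest become rangers (dispersed members)."""
    ranked = sorted(members, key=cost)
    producer, rest = ranked[0], ranked[1:]
    random.shuffle(rest)                # random 80/20 split of non-producers
    cut = round(0.8 * len(rest))
    return producer, rest[:cut], rest[cut:]
```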

Cost Function
A "generic service" is a collection of atomic web services with a common functionality, but different nonfunctional properties (e.g., time and quality). Each atomic service may provide a series of QoS parameters, such as service time, cost, reliability, and availability. Users can set the number of QoS values to be considered and can set the weights of the QoS values according to their requirements. In our study, each user has QoS attribute constraints in their QoS requirements: We focus on the QoS service selection problem, in which multiple QoS constraints must be satisfied. We present the cost function to help in the selection of the best services. The following steps are involved in the creation of the cost function.
(i) Each QoS attribute must be quantitative. Service functionalities can be evaluated by several QoS properties, but some QoS attributes, for example, security and reliability, are difficult to measure quantitatively. For these criteria, we employ the linguistic expression set S1 = {VP, P, MP, M, MG, G, VG}, where VP is very poor, P is poor, MP is medium poor, M is medium, MG is medium good, G is good, and VG is very good. When calculating the cost function, set S1 is transformed into a corresponding quantitative set.
(ii) The global QoS attributes must be aggregated over the structure of the composition component. Four basic structures exist: (1) the sequential structure, which represents services that are invoked one by one; (2) the loop structure, which represents one service that is repeated k times; (3) the conditional structure, in which only one branch is selected to be invoked from several branches; (4) the parallel structure, in which several branches are invoked simultaneously. The complete structure of the service composition component consists of the above four basic structures. Every global QoS attribute has its own aggregation method. We sort the QoS aggregation methods into three types: (1) the summation method (e.g., cost), in which the fees the user must pay for invoking the services are accumulated; (2) the continued multiplication method (e.g., availability), in which global availability is computed as the product of the availability ratios of all atomic services; (3) the average method (e.g., reputation), in which global reputation is the average value of the related service reputations. We present all particulars of these three methods with sequential, parallel, conditional, and loop structures in Table 1. In Table 1, δ is a 0-1 variable: if the corresponding condition is satisfied, then δ = 1; otherwise, δ = 0.
(Figure 1: Flowchart of the D-GSS model.)
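Since Table 1 itself is not reproduced here, the following sketch only illustrates the three aggregation types for the sequential and loop structures; the exact rules for the conditional and parallel structures (with the 0-1 variable δ) should be taken from Table 1. The function names are our own:

```python
from functools import reduce

def aggregate_sequential(values, method):
    """Aggregate QoS values of services invoked one by one.
    method: 'sum' (e.g., cost), 'product' (e.g., availability),
    or 'average' (e.g., reputation)."""
    if method == "sum":
        return sum(values)
    if method == "product":
        return reduce(lambda a, b: a * b, values, 1.0)
    if method == "average":
        return sum(values) / len(values)
    raise ValueError(method)

def aggregate_loop(value, k, method):
    """Aggregate one service repeated k times: summation accumulates k
    copies, multiplication raises to the k-th power, and the average of
    k identical values is the value itself."""
    if method == "sum":
        return k * value
    if method == "product":
        return value ** k
    if method == "average":
        return value
    raise ValueError(method)
```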
(iii) After the values of the global QoS attributes Q = [q_1, ..., q_r] and the constraints C = [c_1, ..., c_r] are evaluated, we present a utility function to describe the relationship between q_i and c_i. Two types of QoS criteria are available, namely, cost and benefit. In the cost criterion, variables (e.g., response time) with higher values have lower qualities. In the benefit criterion, variables (e.g., availability) with higher values have higher qualities. The utility function synthesizes the cost and benefit criteria.

Definition 1 (utility function). Suppose that a global QoS attribute q_i (1 ≤ i ≤ r) and its constraint c_i of an instantiation exist; the utility function U(q_i, c_i) is defined so that if the global QoS attribute q_i satisfies the requirement of the QoS constraint c_i, then U(q_i, c_i) ≤ 1; otherwise, U(q_i, c_i) > 1.
(iv) The cost function is based on the values of the utility function and the weights defined by the user. The better the instantiation is, the lower the value of the cost function becomes.
The objective of this paper is to employ D-GSO to obtain the optimal solution of the following model: minimize cost(X) subject to X ∈ Ω, where Ω is the space of all instantiations.
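Assuming the utility function takes the usual ratio form (q_i/c_i for a cost criterion and c_i/q_i for a benefit criterion, which satisfies the ≤ 1 condition of Definition 1 — this exact form is our assumption, since the original formula is not reproduced here), the cost function might be sketched as a weighted sum of utilities:

```python
def utility(q, c, criterion):
    """Hypothetical ratio form of U(q, c): <= 1 exactly when the
    constraint c is met. criterion: 'cost' (lower q is better) or
    'benefit' (higher q is better)."""
    return q / c if criterion == "cost" else c / q

def cost_function(qos, constraints, criteria, weights):
    """Weighted sum of per-attribute utilities; a better instantiation
    yields a lower cost."""
    return sum(w * utility(q, c, k)
               for q, c, k, w in zip(qos, constraints, criteria, weights))
```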

D-GSO Algorithm
The GSO algorithm [7] designs optimum searching strategies to solve continuous optimization problems. However, service selection is a discrete problem. Therefore, we present an evolutionary algorithm named D-GSO that can handle composition components with discrete atomic services. The steps of the D-GSO algorithm are described in Algorithm 1.

Convergence Verification
In this section, we verify the convergence of the D-GSS model. After a number of iterations, the best instantiation with the lowest cost can be determined through the cooperation of the producer with the scroungers and rangers.

Lemma 3. If Ω represents the space of all instantiations and Ψ represents the search space of the producer, then Ω = Ψ.
Proof. (1) l_max denotes the maximum distance between two points in space Ω. By using (3) to (6), we can equate space Ψ to a sphere that has the producer's position sub(X_p^h) as its center and l_max as its radius. Thus, Ω ⊂ Ψ.
(2) The following strategy is employed in the D-GSS model: when a member in space Ψ is outside space Ω, the member is returned into space Ω by setting the variables that violated the bounds to their previous values. Therefore, Ψ ⊂ Ω. Combining (1) and (2), Ω = Ψ.
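The boundary strategy invoked in part (2) — resetting out-of-range variables to their previous values — can be sketched as follows (Python, illustrative; `m` is a hypothetical list holding the number of candidates per generic service class):

```python
def keep_in_bounds(x_new, x_old, m):
    """Reset every coordinate that left the valid index range
    [0, m_j - 1] back to its previous value, as the D-GSS model does."""
    return [old if not 0 <= new < mj else new
            for new, old, mj in zip(x_new, x_old, m)]
```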
Theorem 4. The costs of instantiations in the group will converge to the global optimum that corresponds to the best instantiation with the lowest cost.
Proof. In the D-GSS model at the h-th iteration, (1) the producer behaves according to (ii)-(iv) in Algorithm 1; by applying the D-GSO algorithm, the producer never accepts a worse instantiation, so cost(X_p^{h+1}) ≤ cost(X_p^h); (2) the scroungers X_s^{h+1} approach the producer through (vii) in Algorithm 1; (3) the rangers X_r^{h+1} disperse from the group to perform random walks via (viii) and (ix) in Algorithm 1, thereby avoiding entrapment in local minima; (4) finally, we calculate the costs of all instantiations in the group and reallocate their roles. The cost of the new producer is cost(X_p^{h+1}) = min(cost(X_p^{h+1}), cost(X_s^{h+1}), cost(X_r^{h+1})).
We conclude that cost(X_p^{h+1}) ≤ cost(X_p^h) by using (6) and (7), which means that the cost of the producer is monotonically decreasing. A global optimum, which has the lowest cost among all instantiations, exists. As stated in the proof of Lemma 3, Ω = Ψ. Therefore, the infimum of cost(X_p^h) is the cost of the global optimum; that is, after a number of iterations, the producer's instantiation converges to the global optimum.

Simulation Convergence Results
The parameter setting of the D-GSS model is summarized as follows.
n classes of generic services are present in the complex composition component, in which each class has 50 candidate services and each candidate service has 10 QoS attributes. The service requestor provides 10 QoS attribute constraints as well as the weights for each QoS attribute. Overall, 51 initial instantiations with U(q_i, c_i) ≤ 1 (1 ≤ i ≤ 10) are selected at random from all instantiations. The initial head angle φ^0 of each individual is set to (π/4, ..., π/4). The constant a is given by round(√(n+1)). The maximum pursuit angle θ_max is π/a². The maximum turning angle α_max is set to θ_max/2. For n = 10 and n = 100, the relations between the cost of the producer and the number of iterations within 500 runs are shown in Figure 2. The experimental results show that the cost of the producer always converges to the optimum of the low- or high-dimensional service composition component.

Algorithm 1 (procedure for the D-GSO algorithm):
(i) Find the producer with the lowest cost in the group.
(ii) Perform producing: (a) the producer scans at zero degree and then scans laterally by randomly sampling three instantiations in the scanning field: one instantiation at zero degree, one instantiation in the right-hand side of the hypercube, and one instantiation in the left-hand side of the hypercube, where r1 ∈ R^1 is a normally distributed random number with mean 0 and standard deviation 1 and r2 ∈ R^{n-1} is a uniformly distributed random sequence in the range (0, 1); (b) the producer then selects the best instantiation with U(q_i, c_i) ≤ 1 (1 ≤ i ≤ r) and the lowest cost. If this instantiation has a lower cost than the current one, it is adopted; otherwise, the current instantiation remains and the producer turns its head to a new randomly generated angle: φ^{h+1} = φ^h + r2 α_max, (v); (c) if the producer cannot find a better instantiation after a iterations, it turns its head back to zero degree: φ^{h+a} = φ^h, (vi).
(iii) Perform scrounging: randomly select 80% of the remaining members to perform scrounging. The area-copying behavior of the i-th scrounger is modeled as a random walk toward the producer, where r3 ∈ R^n in (vii) is a uniform random sequence in the range (0, 1): sub(X_i^{h+1}) = sub(X_i^h) + round(r3 ∘ (sub(X_p^h) − sub(X_i^h))), (vii).
(iv) Perform dispersion: the rest of the members are dispersed to perform ranging: (1) generate a random head angle by using (v); (2) choose a random distance l_i from the Gauss distribution by using (viii); (3) move to the new instantiation by using (ix): l_i = a · r1 · l_max, (viii); sub(X_i^{h+1}) = sub(X_i^h) + round(l_i · D_i^h(φ^{h+1})), (ix).
(v) Calculate fitness: calculate the costs of the current instantiations according to the cost function. Set h := h + 1 and repeat from (i) until the optimal instantiation is found.

The experiments were conducted on a PC with a 2.50 GHz Intel processor and 8.0 GB RAM. All programs were written and executed in Java. The operating system was Microsoft Windows 7.
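The discrete scrounging and ranging moves of Algorithm 1 might be sketched as follows (Python, illustrative only; rounding keeps the updated positions on integer candidate-service indices, and `direction` stands for the unit search direction obtained from the head-angle transformation of Section 2):

```python
import random

def scrounge(x, producer):
    """Area-copying step: a rounded random walk toward the producer,
    with one uniform random factor per dimension."""
    return [xi + round(random.random() * (pi - xi))
            for xi, pi in zip(x, producer)]

def range_step(x, direction, l_max, a):
    """Ranging step: draw a random distance from a Gaussian, move along
    the unit search direction, and round back to integer indices."""
    l = a * random.gauss(0.0, 1.0) * l_max
    return [xi + round(l * di) for xi, di in zip(x, direction)]
```

Because each coordinate of the scrounger update is rounded independently, a scrounger always lands between its old position and the producer's position in every dimension.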

Conclusion
In this paper, we describe a new evolutionary approach for multiconstraint service selection on the basis of the D-GSS model. We propose the cost function and the D-GSO algorithm for searching for the global optimum among the discrete instantiations of a service composition component. The convergence of the D-GSS model is verified via formal proofs and simulations. This model has an outstanding advantage in solving high-dimensional service composition problems. In the future, we hope to search for the global optimum in a dynamic heterogeneous environment by using the D-GSS model.