Particle swarm optimisation (PSO) is an innovative and competitive optimisation technique for numerical optimisation with real-parameter representation. In this paper, we examine the working mechanism of PSO in a principled manner with forma analysis and investigate the applicability of PSO to the permutation problem domain. In particular, the derived PSO schemes are studied empirically on quadratic assignment problem (QAP) benchmarks to demonstrate their comparable performance, which in turn indicates the benefits of our approach in applying PSO to the discrete problem domain.

PSO was
originally designed as a numerical optimisation technique based on swarm
intelligence. In the literature, there are a few attempts to exploit its usage
in the discrete problem domain [

In this paper, we aim to design PSO operators for
permutation problems without losing the underlying principles of the original
PSO. A PSO operator template will be formally defined with

PSO was
initially introduced by Kennedy and Eberhart [

For each generation, the particle compares its current position with the goal (global best/personal best) position, adjusting its velocity accordingly towards the goal with the help of the explicit memory of the best position ever found both globally and individually.

The most popular formulation of how a particle adjusts
its velocity and position [
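As an illustration only, the widely used inertia-weight form of the update can be sketched as follows. The parameter names `w`, `c1`, and `c2`, their default values, and the function name `pso_step` are assumptions made for this sketch and are not taken from the text.

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One inertia-weight PSO update for a single particle.

    x, v, pbest, gbest are equal-length lists of floats.
    w, c1, c2 are illustrative defaults, not values from the paper.
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # inertia + pull towards personal best + pull towards global best
        vi_next = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_next)
        new_x.append(xi + vi_next)
    return new_x, new_v
```

When the particle already sits on both best positions, the two attraction terms vanish and only the inertia term moves it.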

Adapting standard PSO to permutation problems has been an interesting task, as researchers are curious about its performance in the discrete domain. In this paper, we suggest that forma analysis offers a possible way to achieve such a task in a principled manner.

The initial aim of forma analysis [

Figure

Illustration of the methodology based on forma analysis.

Some of the characteristics and operator templates
related to forma analysis [

The key concept
is that of a basis: a set of equivalence relations that allows us to describe
the search space

A
subset

An encoding can thus be derived by taking the image of the basis equivalence classes corresponding to a particular solution in the search space.

Forma analysis can derive operators that explicitly manipulate the given equivalence relations. This is achieved by combining the basis with domain-independent operators that specify operator behavior in terms of the basis.

One such (domain independent) operator template, which
is related to the work presented in this paper, corresponds to the (

The other operator template, random transmitting
recombination (RTR) [

Although the
above concepts of forma analysis were developed for genetic algorithms, it has
been shown that the forma analysis methodology itself is generalisable to other
evolutionary optimisers based on (theoretically) any problem domain from the
knowledge-based system (KBS) design standpoint [

As previously
studied in [

For
position-based description, each position in the permutation

In addition, an induced feasibility constraint for
this description

Given any two equivalence relations

A
direct implication of this constraint is that the

The distance metric for this formal description is
simply the number of positions in the permutation that have different elements
(i.e., the

For
precedence-based description, a set of basis precedence equivalence relations

Given any two equivalence relations,

The distance metric can be specified as the number of
different precedence relations between two solutions. For example, the distance
between permutation

In practice,

For adjacency-based
description, a set of basis adjacency equivalence relations

In addition, the feasibility constraint

Given an equivalence relation

The distance between any two solutions in the search
space under adjacency basis is thus calculated as the number of different edges
that they possess. For instance, the distance between permutation

However, on the “phenotypical” level this

The
generalisation of PSO is not as straightforward as some other optimisation
techniques, such as the generalisation of Differential Evolution [

By observing the update equation (as shown in (

By revealing
the fact that velocity is the distance between the previous and current
positions of the particle, we can define the operator template (under the basis

Given a current position

In the context
of real vectors, the accumulation of distances is straightforward to
understand. For permutation problems, however, handling
“directionality” becomes rather complex from the practical standpoint.
Because forma distance includes a domain-specific magnitude and direction,
both of which are difficult to handle in the
context of permutation problems, a reasonable interpretation of the PSO
operator template is required to facilitate the derivation of suitable PSO
operators for permutation problems. From this perspective, the original PSO
operator template (before interpretation), which abstracts how solutions are
manipulated, can be regarded as an operator design guideline embedded with the
PSO working mechanism. In fact, various interpretations and
approximations have also been made in previous work on forma analysis for
the purpose of facilitating operator derivations [

By understanding the fact that the perturbation of the current individual is jointly decided by three components (with their degrees of influence distributed proportionally), we can give a stochastic interpretation of the PSO operator template as follows.

The perturbation

if

else if

else,

The decomposition of the flying dynamics of a particle
is illustrated in Figure

Decomposition of the flying dynamics of PSO.

However, the mixing effect of several distances with different directions is hard to represent in the context of permutation. Modelling the accumulation from a stochastic perspective helps us avoid this unnecessary complication.
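The stochastic reading described above can be sketched as a roulette-wheel choice among the three components, where each weight gives the probability of that component guiding the next move. The names `w`, `c1`, `c2` and the returned labels are assumptions for this sketch; the paper's own symbols are not reproduced here.

```python
import random

def choose_component(w, c1, c2, rng=random):
    """Pick one of the three components with probability proportional
    to its weight (hypothetical names: w = inertia, c1 = personal best,
    c2 = global best)."""
    total = w + c1 + c2
    r = rng.random() * total  # uniform in [0, total)
    if r < w:
        return "inertia"
    elif r < w + c1:
        return "personal_best"
    else:
        return "global_best"
```

With weights 1, 1, 2, for example, the global-best component would be chosen about half of the time, so no mixing of directions has to be represented on the permutation itself.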

Given that we already have the mechanism to separate each distance component, the next question is how to incorporate direction to guide our PSO operator so that the particles can converge towards superior records.

As aforementioned, if a greedy component with superior
record considered is selected to perturb the current particle, we directly
have

By taking a closer look at the guided PSO operator, we
can actually find that the effect of perturbing one individual towards another
is the same as making a crossover (e.g., RTR) between these two individuals

From the above
discussion, we can evolve a new interpretation of the PSO operator template.
The new position

if

else if

else,

In this
section, our forma analysis framework is compared with the geometric framework
[

On the
framework level, both frameworks aim to generalise PSO based on underlying
optimisation components (e.g., solution representation or distance notion) so
that the abstraction of optimisers, either in the form of standard operator
template RTR/RRR/RAR or in the form of line segment/ball [

The main difference on the design concept level lies
in the choice of such abstractions and the “carrier” of domain knowledge. For
our forma analysis approach, the solution representation is generalised with
equivalence relations/classes so that a formal representation can be defined in
a unified manner, while operators that manipulate the solutions are abstracted
as operator templates that process equivalence relations. In contrast, the
geometric framework is more about generalising optimisers based on a
On the practical PSO generalisation level, the two approaches also differ in two respects.

First of all, the concept of velocity has been removed from the geometric framework (thus, including the simplification of the concept of inertia as a component in geometric crossover), while a random mutation is added to the geometric PSO as a potential replacement for perturbation purposes. In our forma analysis approach, velocity has been interpreted and formulated as distance (more precisely forma distance) in the previous time step. However, velocity itself is a rather complicated concept to formulate as it involves the interpretation of both magnitude and direction which are hard to represent in the context of permutation problems. Certain simplifications and compromises have been made to maintain this concept for future research.

Secondly, the accumulation of greediness toward
personal best and global best, balanced by previous velocity (or position), is
interpreted differently as well. In geometric PSO, multiparental geometric
crossover is used to linearly recombine these positions to produce the next
position with different weights taken into account through the concept of
product geometric crossover. In contrast, in our forma analysis approach,
different convergence components are treated stochastically according to their
different weights where higher weight represents higher probability of being
treated as the convergence direction for the next time step (and vice versa).
By looking at the full picture, different components (personal best, global
best, or previous velocity) all share the probability of being selected to
guide the next move. Standard RTR operator template is used to converge towards
superior solutions with the direction of distance naturally maintained, while
standard

As mentioned
earlier, the formal descriptions of permutation problems implicitly introduce
some feasibility constraints that must hold for a solution to be valid. When we design PSO
operators for permutation problems, these constraints must be satisfied (or handled properly), which is effectively
a subproblem to solve. (Of
course, these constraints only apply if we are interested solely in searching
feasible regions; search techniques that make use of infeasible regions are
outside the scope of this discussion.) In our previous work [

According to the aforementioned stochastic
interpretation of PSO operator template, the outcome is effectively a blended
operator with three different “phases”:

For the
position-based description,

The most straightforward thought would be to randomly
select

The CSP we consider here is defined as the

Given

Then, what we need to do is simply to reinstantiate
these

In fact, the working mechanism of

For the
precedence-based description of permutation, the distance of two permutations
is the number of different precedence relations

Assuming

Illustration of reinstantiation of a permutation based on
precedence-based description. Symbol “

Given that

As shown in Figure

By observing the effect of changing a single
precedence equivalence class, it is not difficult to find that a

Since each potential edge is defined as an equivalence relation for the adjacency-based description of permutation, the distance of two permutations corresponds to the “edge-difference” between them.
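The three distance notions described for the position-based, precedence-based, and adjacency-based descriptions can be sketched side by side as follows. Treating edges as undirected, and offering an optional `cyclic` flag for a closing edge, are assumptions of this sketch; the function names are hypothetical.

```python
from itertools import combinations

def position_distance(a, b):
    """Number of positions holding different elements."""
    return sum(1 for x, y in zip(a, b) if x != y)

def precedence_distance(a, b):
    """Number of element pairs whose relative order differs."""
    pos_a = {v: i for i, v in enumerate(a)}
    pos_b = {v: i for i, v in enumerate(b)}
    return sum(1 for u, v in combinations(a, 2)
               if (pos_a[u] < pos_a[v]) != (pos_b[u] < pos_b[v]))

def edges(p, cyclic=False):
    """Set of undirected edges between neighbouring elements (assumption)."""
    pairs = list(zip(p, p[1:]))
    if cyclic and len(p) > 1:
        pairs.append((p[-1], p[0]))
    return {frozenset(e) for e in pairs}

def adjacency_distance(a, b, cyclic=False):
    """Number of edges of a that are absent from b (the edge difference)."""
    return len(edges(a, cyclic) - edges(b, cyclic))
```

Under these assumptions a permutation and its reversal are maximally distant in position and precedence terms yet have adjacency distance zero, which illustrates how strongly the chosen description shapes the notion of neighbourhood.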

For the simplicity of illustration, we assume that

By solving the CSP to generate

The minimal mutation implied by the adjacency-based
description is an edge

For the
position-based description

For example, given

The reinstantiation of

To transmit position features from parents to children
and interpret the feasibility constraint

In PTX, both

Illustration of the constraint graph of PTX.

The construction of the constraint graph is
straightforward to understand—a value that has been taken for one position
must be forbidden (constrained) for another position. For example, for position

The only possibility that the value taken by one
position does not constrain the value taken by another is that the parents both
take the same value for that position (e.g.,

Thus, as long as the constrained positions are
transmitted all-together to the child, PTX always satisfies both

In fact, PTX works in an equivalent manner as
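One way to realise the constraint-graph idea above is sketched below. It assumes that positions constrain each other exactly when one parent's value at one position is the other parent's value at another position, so that the groups of mutually constrained positions coincide with the cycles used by the classical cycle crossover. The function name `ptx` and the 50/50 donor choice per group are assumptions of this sketch, not a definitive statement of the operator.

```python
import random

def ptx(p1, p2, rng=random):
    """Position-transmitting crossover sketch: each group of mutually
    constrained positions is copied as a whole from one parent."""
    n = len(p1)
    pos_in_p1 = {v: i for i, v in enumerate(p1)}
    child = [None] * n
    visited = [False] * n
    for start in range(n):
        if visited[start]:
            continue
        # Follow the constraint chain until it closes on itself.
        group, i = [], start
        while not visited[i]:
            visited[i] = True
            group.append(i)
            i = pos_in_p1[p2[i]]
        donor = p1 if rng.random() < 0.5 else p2
        for i in group:
            child[i] = donor[i]
    return child
```

Positions where both parents agree form groups of size one, so their shared value is always transmitted to the child.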

To achieve
transmitting in precedence-based crossover, both transmitting

Furthermore, precedence relation is special in that
its equivalence class is either

For example, given permutations

In fact, strictly transmitting crossover is also
possible for precedence-based description. Due to the fact that the reinstantiation
process of precedence relations is equivalent to the

In the literature,

Many readers may find that this is rather similar to the most popular algorithm used for topological sorting where the order can be completed by starting from the node(s) with no incoming edges. Switching between two parents simply aims at recombining the precedence equivalence classes of the two parents.

It is also easy to see that the set of all possible solutions produced by PPX is in fact a subset of the set of solutions produced by the above CSP approach. In other words, for each solution produced by PPX, there is always a corresponding reinstantiation of the partial child permutation.
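A minimal sketch of PPX as it is commonly described in the literature: the child is built left to right by repeatedly taking the leftmost remaining element of a randomly selected parent and deleting it from both parents. The function name and the uniform 50/50 parent choice are assumptions of this sketch.

```python
import random

def ppx(p1, p2, rng=random):
    """Precedence preservative crossover sketch."""
    a, b = list(p1), list(p2)
    child = []
    while a:
        donor = a if rng.random() < 0.5 else b
        v = donor[0]          # leftmost element not yet placed
        child.append(v)
        a.remove(v)           # delete the element from both parents
        b.remove(v)
    return child
```

Because an element can only be taken once everything preceding it in the chosen parent has been placed, any precedence relation shared by both parents is kept in the child, which mirrors the topological-sorting view mentioned above.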

Regarding the
adjacency-based description of permutation which has been proved to be non

Furthermore, for those edges which are absent in both
parents (“negative edges”

Thus, approximation through relaxation of

In summary, the
derived blended PSO schemes with different formal descriptions of permutation
can be described in the following Algorithms


To illustrate
the search dynamics of the derived blended PSO schemes, we evaluated their
performance on the quadratic assignment problem (QAP)
benchmarks. However, due to the fact that for QAP the absolute positioning of
elements is more closely related to the quality of a solution [

After a brief description of the problem formulation, we present the experiment configurations and the experimental results, followed by a discussion that clarifies the benefits of our approach.

The quadratic
assignment problem (QAP) is an important problem in both practice and theory.
Many practical problems can be formulated as QAPs [
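For orientation, the objective of a QAP in its usual Koopmans-Beckmann form can be sketched as below, where a permutation assigns facility `i` to location `perm[i]`. The names `flow`, `dist`, and `qap_cost` are assumptions of this sketch.

```python
def qap_cost(perm, flow, dist):
    """Sum of flow between every pair of facilities multiplied by the
    distance between the locations they are assigned to."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))
```

The cost depends on which location each facility occupies, which is why the absolute position of an element in the permutation matters for solution quality.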

The benchmarks
for this experimental study are acquired from QAP-LIB [

For each of the instances, fine-tuning is carried out for each of the algorithms so that it reaches its best performance among different combinations of parameter settings with an equal number of generations. The parameter settings with the best performance over 20 independent runs for each algorithm are used to obtain the execution results. (The performance is evaluated by considering both the average best solution found and the average number of generations needed to reach that best solution. The number of generations needed to reach the best solution is only considered when two parameter settings have the same average best solution.)
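The selection rule described above amounts to a lexicographic comparison, which can be sketched as follows. The data layout (a dictionary mapping a setting label to a list of `(best_cost, generations_to_best)` pairs) and the function name are assumptions of this sketch, and lower cost is taken to be better, as for QAP.

```python
from statistics import mean

def select_setting(results):
    """Return the label with the lowest average best cost; ties are
    broken by the lowest average number of generations to reach it."""
    def key(label):
        runs = results[label]
        return (mean(c for c, _ in runs), mean(g for _, g in runs))
    return min(results, key=key)
```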

The original free parameters for each instance are

The population size is fixed to be an appropriate
number (

According to
the tuned parameter settings,

To examine the
search behavior of the proposed PSO schemes, we track three components of each
scheme that we consider essential to the search dynamics of PSO. These three
components are: the

Experimental results over 50 independent runs for different PSO schemes.

| Instance | SIZE | GEN # | POP_size | Algorithm | Mean best | Std Dev |
|---|---|---|---|---|---|---|
| | 20 | 5000 | 100 | PSO_POS | | |
| | 20 | 5000 | 100 | PSO_PREC | 756756.8125 | 6750.588867 |
| | 20 | 5000 | 100 | PSO_ADJ | 778401.8125 | 10900.51661 |
| Tai40a | 40 | 10000 | 100 | PSO_POS | | |
| Tai40a | 40 | 10000 | 100 | PSO_PREC | 3475673 | 23905.56437 |
| Tai40a | 40 | 10000 | 100 | PSO_ADJ | 3547892 | 17521.19283 |
| Nug20 | 20 | 5000 | 100 | PSO_POS | | |
| Nug20 | 20 | 5000 | 100 | PSO_PREC | 2717.847864 | 27.325028 |
| Nug20 | 20 | 5000 | 100 | PSO_ADJ | 2746.896391 | 25.121398 |
| Sko56 | 56 | 10000 | 100 | PSO_POS | | |
| Sko56 | 56 | 10000 | 100 | PSO_PREC | 36078.96543 | 257.68971 |
| Sko56 | 56 | 10000 | 100 | PSO_ADJ | 38423.78542 | 375.67864 |
| Bur26a | 26 | 5000 | 100 | PSO_POS | | |
| Bur26a | 26 | 5000 | 100 | PSO_PREC | 5462411.9 | 8622.741697 |
| Bur26a | 26 | 5000 | 100 | PSO_ADJ | 5489783.6 | 7989.741787 |
| Ste36a | 36 | 10000 | 100 | PSO_POS | | |
| Ste36a | 36 | 10000 | 100 | PSO_PREC | 11356.24567 | 553.488941 |
| Ste36a | 36 | 10000 | 100 | PSO_ADJ | 11667.04213 | 604.555718 |
| Tai20b | 20 | 5000 | 100 | PSO_POS | | |
| Tai20b | 20 | 5000 | 100 | PSO_PREC | 125760656 | 1622274.93 |
| Tai20b | 20 | 5000 | 100 | PSO_ADJ | 128727848 | 1379135.25 |
| Tai40b | 40 | 10000 | 100 | PSO_POS | | |
| Tai40b | 40 | 10000 | 100 | PSO_PREC | 706244099 | 17986929 |
| Tai40b | 40 | 10000 | 100 | PSO_ADJ | 722001585 | 18978233 |

The mean cost, average best cost, and average velocity
of the population over generation for

Through observation shown in
Figure

The situation for

This can be mainly explained by the different
information transfer efficiencies with different descriptions for QAP. The
position-based PSO scheme (

However, the information processing of absolute positioning is disrupted by both precedence description and adjacency description to different degrees. This is also quite obvious from the implementation standpoint, since the “recombination” of precedence/adjacency information certainly will not produce the convergence of solution quality efficiently in terms of absolute positioning. The degree of such “disruption/deviation” is mainly decided by its correlations to the positional description. This can be further illustrated by the fact that precedence-based PSO performs better than adjacency-based PSO. As a matter of fact, precedence relations are more correlated to positional relations, which can be easily understood by inspecting the shift operator—as the number of precedences changed by the shift operator increases, so does the number of positions in a smooth progression. In contrast, adjacency relations are found to be poorly correlated with positional relations, since the number of adjacency relations changed by edge-reverse mutation is poorly correlated with the changes in the absolute positioning of permutations.

The above results also reflect the main argument we
are making for the methodology in this paper: the search behavior and
performance of the derived operator depend on the description for the specific
problem. Further estimations can be that

In addition,
another PSO scheme for QAP is also implemented with

Comparison of average best cost and velocity of
the population over generation for

The mean-best and standard deviation produced by our
PSO schemes for each instance are presented
in Table

Experimental results over 50 independent runs for PSO schemes with different topologies against a standard GA.

| Instance | SIZE | GEN # | POP_size | Algorithm | Mean best | Std Dev |
|---|---|---|---|---|---|---|
| | 20 | 5000 | 100 | PSO_POS_G | 749292.1871 | 12172.79199 |
| | 20 | 5000 | 100 | PSO_POS_R | 737858.2491 | 9983.63476 |
| | 20 | 5000 | 100 | GA | | |
| Tai40a | 40 | 10000 | 100 | PSO_POS_G | 3332304 | 27611.47461 |
| Tai40a | 40 | 10000 | 100 | PSO_POS_R | 3307909 | 28703.17578 |
| Tai40a | 40 | 10000 | 100 | GA | | |
| Nug20 | 20 | 5000 | 100 | PSO_POS_G | 2680.120117 | 34.834686 |
| Nug20 | 20 | 5000 | 100 | PSO_POS_R | 2629.800049 | 24.587896 |
| Nug20 | 20 | 5000 | 100 | GA | | |
| Sko56 | 56 | 10000 | 100 | PSO_POS_G | 35955.64063 | 309.11557 |
| Sko56 | 56 | 10000 | 100 | PSO_POS_R | 35452.19922 | 179.47287 |
| Sko56 | 56 | 10000 | 100 | GA | | |
| Bur26a | 26 | 5000 | 100 | PSO_POS_G | 5445906.1 | 8688.298828 |
| Bur26a | 26 | 5000 | 100 | PSO_POS_R | | |
| Bur26a | 26 | 5000 | 100 | GA | 5436641.5 | 5282.220703 |
| Ste36a | 36 | 10000 | 100 | PSO_POS_G | 10808.51953 | 436.841125 |
| Ste36a | 36 | 10000 | 100 | PSO_POS_R | 10252.40039 | 235.487991 |
| Ste36a | 36 | 10000 | 100 | GA | | |
| Tai20b | 20 | 5000 | 100 | PSO_POS_G | 124803488 | 1190690.75 |
| Tai20b | 20 | 5000 | 100 | PSO_POS_R | | |
| Tai20b | 20 | 5000 | 100 | GA | 125778728 | 5234746.11 |
| Tai40b | 40 | 10000 | 100 | PSO_POS_G | 696782272 | 28186792 |
| Tai40b | 40 | 10000 | 100 | PSO_POS_R | | |
| Tai40b | 40 | 10000 | 100 | GA | 679525504 | 18660756 |

From the results, we can see that

In this paper, we have presented how the original PSO operator can be generalised in a formal manner to the permutation problem domain using forma analysis, with both the formal descriptions of permutation and a stochastic PSO operator template defined. By considering the application of operators as a process of constraint satisfaction, we derived several concrete PSO schemes for permutation problems, each of which involves a different assumption about the description of the search space. Observations of the search patterns of the derived PSO schemes, together with the ring-structured extension of position-based PSO on the QAP benchmarks, show that the choice of description is a critical issue in operator design. They also show that the position-based PSO scheme for QAP achieves a certain degree of convergence towards the optimum in a manner similar to the traditional PSO for real-vector space, with results comparable to a standard GA.

More importantly, we have presented in this paper a principled approach to formally derive algorithms with regard to the actual problem domain, in which case the behaviors and the performance of the derived algorithms are directly related to the assumption we make to describe the search space.

In the future, efforts on the improvement of these
discrete PSO schemes are possible by considering additional issues (e.g.,
topological search control, local search, and even parameter selections).
Application of our methodology to a wider range of problems and optimisation
techniques can also be explored. In addition, the interpretation of applying a
directed