DETECTION OF THE PERMUTATION SYMMETRY IN PATTERN SETS

Symmetry is a powerful tool for reducing the degrees of freedom of a system. The applicability of this tool, however, depends strongly on the ability to calculate the symmetries of the system, and searching for the symmetry of a high-dimensional system is an interesting algorithmic problem. In this paper, a genetic algorithm-based approach to permutation symmetry detection is proposed for pattern sets. Firstly, the permutation symmetry distance (PSD) is defined to measure the similarity of a pattern set before and after being transformed by a permutation operator. Secondly, the permutation symmetry detection problem is converted into an optimization problem by taking the PSD as the fitness function. Lastly, a genetic algorithm-based approach is designed for the symmetry detection problem. Computer simulation results are given for five pattern sets of different dimensionality; they show the efficiency and speed of the proposed detection approach, especially in high-dimensional cases.


Introduction
Symmetry is a powerful tool that may reveal intrinsic relations among objects or phenomena that seem to be uncorrelated, and it is used widely in almost all areas of natural science [2, 6, 9]. The study of symmetry began in the early stage of neural network research [21], and recently more and more researchers have devoted themselves to this field. For example, Baldi demonstrated [1] that the global properties of an individual network can be found from symmetry considerations of the invariance group of the specific pattern set stored by some learning rules. Reimann showed [17] that the symmetry of the network structure is already determined by the symmetry of the set of test sequences, indicating that learning a set of elements is concerned with finding invariant relations inherent in this set. He also showed how to design artificial autoassociative neural networks using group theoretical methods [18]. All this research reveals the importance and usefulness of the symmetry method in the study of neural networks.
The applicability of the symmetry method, however, depends strongly on the ability to calculate the symmetry group of the system efficiently. The symmetry group is usually calculated from the generators of the symmetry group of the given system. When the generators are unknown, one has to calculate the symmetry group by exhaustive search, which is impractical in a high-dimensional case because of the tremendous search space. For example, the order of the symmetric group $S_n$ is $n!$, that is, there are $n!$ potential solutions in an $n$-dimensional system [18]. There is, to our knowledge, no report of an efficient algorithm for detecting the permutation symmetry of a high-dimensional system. Almost all symmetry studies in neural networks to date have been done in low-dimensional cases because of the symmetry calculation problem. An interesting algorithmic problem therefore remains when the system is sufficiently large.
In this paper we define a permutation symmetry distance (PSD) to measure the similarity of a pattern set before and after being transformed by a permutation operator, and then convert the permutation symmetry detection problem into an optimization problem. There are many stochastic approaches to optimization problems [11-13, 15, 19, 20]; the genetic algorithm is a simple and effective one. We do not study the algorithm itself here. Our intent is to test the feasibility of such a global random search algorithm for the permutation symmetry detection problem. Instead of developing a new search algorithm, we use standard genetic algorithms (GAs) to search for the symmetric permutation operators of a given pattern set. A permutation operator is encoded by a set of consecutive integers (chromosome). The PSD is taken as the fitness function. The three chosen genetic operators (crossover, mutation, selection) are related to those used in some GAs for the traveling salesman problem, which use the same encoding method as our work. Results are presented for binary pattern sets of five different dimensions. When compared with exhaustive search, genetic algorithms show greater efficiency, especially in high-dimensional cases.
The organization of this paper is as follows. Section 2 gives some definitions used in this paper and presents the detailed definition of the permutation symmetry measure. In Section 3, we outline the search algorithm and the corresponding operators. In Section 4, we present the computer simulation results. Conclusions and discussions are given in Section 5.

Measure of the permutation symmetry
2.1. Permutation symmetry. We begin our discussion with some important definitions [3].

Definition 2.1. Let a set $V = \{v_1, v_2, \ldots, v_n\}$ consist of $n$ elements. A permutation operator (permutation for short) of $V$ is a bijection $\sigma : V \to V$, which reindexes the set. When the symbol $v$ is omitted, the set $V$ can be understood as a set of consecutive integers (an ordered set), that is, $(1, 2, \ldots, n)$, and the permutation can be written as

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}, \quad i_k \in \{1, 2, \ldots, n\},$$

where it is understood that $\sigma$ maps 1 to $i_1$, 2 to $i_2$, and so forth.

D. Ji-Yang and Z. Jun-Ying

Definition 2.2. The set of all permutation operators on the set $\{1, 2, \ldots, n\}$ is denoted by $S_n$, which is called the symmetric group of degree $n$.
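The order of $S_n$ is $n!$, as is easily seen using the "two-row" method of writing a permutation operator. For example, let $V = \{1, 2, 3\}$; there are six permutation operators for this set, namely,

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$$

The permutation operator $\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}$ is called the identity element and is denoted by $e$.

Definition 2.3. Let $x = (x_i)$, $i = 1, 2, \ldots, n$, be an $n$-dimensional pattern vector. The action of a permutation operator $s \in S_n$ on $x$ is defined as $(s \circ x)_i = x_{s(i)}$, which reindexes the components of the pattern vector.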
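For example, let $x = (a, b, c, d)$, that is, $x_1 = a$, $x_2 = b$, $x_3 = c$, $x_4 = d$; then $s \circ x = (x_{s(1)}, x_{s(2)}, x_{s(3)}, x_{s(4)})$.

Definition 2.4. Let $X = \{x^1, x^2, \ldots, x^m\}$ be an $m$-pattern set. The action of a permutation operator $s$ on the set $X$ is defined as $s \circ X = \{s \circ x^1, s \circ x^2, \ldots, s \circ x^m\}$. If $s \circ X = X$, the permutation operator $s$ is a symmetric permutation operator of $X$. The set of all symmetric permutation operators of the set $X$ is denoted by $S_X$. $S_X$ is a group and is called the permutation symmetry group of $X$, or the permutation symmetry of $X$ in brief (readers desiring a more exact definition can refer to the book [3]).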
For example, let $X = \{(2,1,2), (2,2,1), (2,1,1)\}$. The permutation symmetry of $X$ is

$$S_X = \left\{ e, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \right\},$$

which consists of 2 symmetric permutation operators.

Definition 2.5. The action of a permutation operator $s$ on a matrix $w = (w_{ij})$ is defined as $(s \circ w)_{ij} = w_{s(i)s(j)}$, for all $s \in S_n$, which reindexes the rows and columns of the matrix simultaneously. If $s \circ w = w$, the permutation operator $s$ is a symmetric permutation operator of $w$. The set of all symmetric permutation operators of $w$ is called the permutation symmetry group of $w$, denoted by $S_w$.
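Definitions 2.3 and 2.4 can be checked on the small example above. The following is a minimal Python sketch, not the authors' program: the function names `act` and `symmetry_group` are illustrative, and the search is a plain exhaustive enumeration that is only practical for small $n$.

```python
from itertools import permutations

def act(s, x):
    """Apply permutation s (tuple of 1-based images) to pattern x:
    (s . x)_i = x_{s(i)}."""
    return tuple(x[si - 1] for si in s)

def symmetry_group(patterns):
    """Return every permutation s with s . X = X (exhaustive search)."""
    X = set(patterns)
    n = len(patterns[0])
    return [s for s in permutations(range(1, n + 1))
            if {act(s, x) for x in X} == X]

X = [(2, 1, 2), (2, 2, 1), (2, 1, 1)]
print(symmetry_group(X))  # [(1, 2, 3), (1, 3, 2)]
```

The two permutations printed are the identity $e$ and the operator that exchanges the second and third components, in agreement with the example.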
Property 2.7. Let $G = \{g_i\}$ be a group, and

2.2. Measure for permutation symmetry.
Symmetry is treated as a binary feature in the exact mathematical definition (an object is either symmetric or nonsymmetric). However, the exact definition of symmetry is inadequate to describe and quantify either the symmetries found in the natural world or those found in the visual world. For example, we say that the equilateral triangle has "more" symmetry than the isosceles triangle. Thus, although symmetry is usually considered a binary feature, we view symmetry as a continuous feature where intermediate values denote some intermediate "amount" of symmetry. This concept of continuous symmetry is in accord with our perception of symmetry, as can be seen in [22]. Zabrodsky et al. [22] present the concept of "symmetry distance" to measure the symmetry in an image. But they treat the shape of an image as a point sequence, that is, an order-dependent point set, so the symmetry distance is limited to measuring the symmetries of a sequence. There are no reports of a continuous measure of the permutation symmetry of a pattern set that is independent of the order of its patterns.
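In this section, we aim to define a measure for permutation symmetry in a pattern set. Let a set $V = \{v^k \mid k = 1, 2, \ldots, m\}$ contain $m$ patterns, and let each pattern be an $n$-dimensional vector $v^k = (v^k_1, v^k_2, \ldots, v^k_n)$. The out-product matrix $W = (W_{ij})$ of $V$ is defined as

$$W_{ij} = \sum_{k=1}^{m} v^k_i v^k_j, \quad i, j = 1, 2, \ldots, n. \tag{2.3}$$

For example, the pattern set $V = \{(2,3,2), (2,2,3)\}$ has 2 patterns. According to formula (2.3), the out-product matrix of $V$ would be

$$W = \begin{pmatrix} 8 & 10 & 10 \\ 10 & 13 & 12 \\ 10 & 12 & 13 \end{pmatrix}. \tag{2.4}$$

We have the following theorem.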
Theorem 2.8. Let $S_V$ be the permutation symmetry group of the pattern set $V = \{v^k_i\}$, and let $W$ be the out-product matrix of $V$. Then all permutation operators in $S_V$ are also symmetric permutation operators of $W$, that is,

$$s \circ W = W \quad \text{for all } s \in S_V. \tag{2.5}$$

Proof. We prove this theorem by contradiction. Assume there is a permutation operator $s \in S_V$ that is not a symmetric permutation operator of $W$, that is,

$$s \circ W \neq W. \tag{2.6}$$

Since $s \in S_V$, we have $s \circ V = V$, so $s$ only rearranges the patterns of $V$ among themselves. We can rewrite (2.3) as

$$W = \sum_{k=1}^{m} v^k (v^k)^t,$$

where $(v^k)^t$ is the transposed vector of $v^k$. So

$$s \circ W = \sum_{k=1}^{m} (s \circ v^k)(s \circ v^k)^t = \sum_{k=1}^{m} v^k (v^k)^t = W,$$

that is, the permutation operator $s$ is a symmetric permutation operator of $W$. This contradicts (2.6). So all permutation operators in $S_V$ are also symmetric permutation operators of $W$.
Theorem 2.8 implies $S_V \subseteq S_W$. In fact, $S_V = S_W$ can easily be satisfied by modifying the out-product matrix with a real parameter $\alpha$. In general, $S_V = S_W$ when $v^k_i + \alpha \geq 0$ for any $i$ and $k$. So we can search for the permutation symmetry of a given pattern set through its out-product matrix.
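Theorem 2.8 can be checked numerically on the two-pattern example. The sketch below is illustrative only: it assumes the out-product matrix has the form $W_{ij} = \sum_k v^k_i v^k_j$ (no normalization and no parameter $\alpha$), and the helper names are hypothetical.

```python
from itertools import permutations

def out_product(patterns):
    """Assumed form of the out-product matrix: W[i][j] = sum_k v^k_i * v^k_j."""
    n = len(patterns[0])
    return [[sum(v[i] * v[j] for v in patterns) for j in range(n)]
            for i in range(n)]

def permute_matrix(s, W):
    """(s . W)_ij = W_{s(i) s(j)}, with s given as 1-based images."""
    n = len(W)
    return [[W[s[i] - 1][s[j] - 1] for j in range(n)] for i in range(n)]

def pattern_symmetries(patterns):
    """All s with s . V = V, found by exhaustive search."""
    V, n = set(patterns), len(patterns[0])
    return {s for s in permutations(range(1, n + 1))
            if {tuple(x[k - 1] for k in s) for x in V} == V}

def matrix_symmetries(W):
    """All s with s . W = W, found by exhaustive search."""
    n = len(W)
    return {s for s in permutations(range(1, n + 1))
            if permute_matrix(s, W) == W}

V = [(2, 3, 2), (2, 2, 3)]
W = out_product(V)
S_V, S_W = pattern_symmetries(V), matrix_symmetries(W)
print(W)           # [[8, 10, 10], [10, 13, 12], [10, 12, 13]]
print(S_V <= S_W)  # True: every symmetry of V is a symmetry of W
```

For this particular set the two groups coincide, so the inclusion holds with equality.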
Let $W$ be the out-product matrix of a given pattern set $V$, and let $W^s$ be the out-product matrix permuted by the permutation operator $s$. We define the permutation symmetry distance (PSD) of the permutation operator $s$ on the given pattern set $V$ as a quantifier of the difference between $W$ and $W^s$, given by formula (2.10). We can normalize by scaling the matrix elements $W_{ij}$ so that the maximum PSD of a permutation operator is a given constant, for example, 1 or $n$. Thus the PSD value is limited in range, with $\mathrm{PSD}(s) = 0$ for a perfectly symmetric permutation operator $s$. The task in this work is to find all permutation operators of a given pattern set whose PSD is equal to 0.
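As an illustration of how such a distance can serve as a fitness function, the sketch below uses one possible choice, the sum of absolute element-wise differences between $W$ and $W^s$. This specific form, the unnormalized scale, and the names `out_product` and `psd` are assumptions made for the example; the exact expression of formula (2.10) may differ.

```python
def out_product(patterns):
    """Assumed form of the out-product matrix: W[i][j] = sum_k v^k_i * v^k_j."""
    n = len(patterns[0])
    return [[sum(v[i] * v[j] for v in patterns) for j in range(n)]
            for i in range(n)]

def psd(s, W):
    """Assumed distance: sum over i, j of |W_ij - W_{s(i) s(j)}|.
    It is 0 exactly when s . W = W."""
    n = len(W)
    return sum(abs(W[i][j] - W[s[i] - 1][s[j] - 1])
               for i in range(n) for j in range(n))

V = [(2, 3, 2), (2, 2, 3)]
W = out_product(V)
print(psd((1, 3, 2), W))  # 0: a symmetric permutation operator
print(psd((2, 1, 3), W))  # positive: not a symmetry of W
```

Any nonnegative measure that vanishes only when $W^s = W$ would play the same role in the search.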
In the next section, we describe the genetic algorithm for searching for the symmetric permutation operators of a pattern set.

Genetic algorithm-based approach
Genetic algorithms are search algorithms based on the mechanics of natural selection and natural genetics. They are used to search large, nonlinear spaces where expert knowledge is lacking or difficult to encode and where traditional optimization techniques fall short [8]. The basic principles of GAs were first laid down by Holland [10]. GAs work with a population of individual strings (chromosomes), each representing a possible solution to a given problem. Each chromosome is assigned a fitness value according to the result of the fitness function. Highly fit chromosomes are given more opportunities to reproduce, and the offspring share features taken from their parents.
The permutation symmetry detection problem asks for all the symmetric permutation operators of a given pattern set, that is, all operators whose PSD is equal to 0. A pattern set often has more than one symmetric permutation operator, so permutation symmetry detection is an optimization problem with multiple optima. We aim to use the genetic algorithm to find as many different symmetric permutation operators as possible in a single run, and therefore modify the standard genetic algorithm [23] as follows.
(1) Initialization: a starting population with N individuals is (randomly) generated.
(2) Evaluation: every individual of the initial population is evaluated.
(3) Recombination: relatively "fit" individuals are selected for recombination. Then a new generation with N parents and N children is created using crossover and mutation.
(4) Evaluation: these new individuals are evaluated.
(5) Save the symmetric permutation operators: save the individuals whose fitness values are 0, then replace them with randomly generated individuals.
(6) Selection: choose the best N individuals to propagate to the next generation using the CHC selection [7].
(7) Termination check: if a given amount of time (a number of generations) has elapsed, the algorithm stops. Otherwise, it goes back to step (3) and continues.
We present the algorithm operators in detail as follows.
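Steps (1)-(7) can be sketched as a short Python loop. This is a simplified illustration under stated assumptions, not the authors' Visual C++ program: the crossover is a simplified order-preserving crossover, the mutation swaps two genes, parents are picked by binary tournament, and the fitness is an assumed PSD (sum of absolute differences on an assumed out-product matrix). All function names are hypothetical.

```python
import random

def out_product(patterns):
    """Assumed out-product matrix: W[i][j] = sum_k v^k_i * v^k_j."""
    n = len(patterns[0])
    return [[sum(v[i] * v[j] for v in patterns) for j in range(n)]
            for i in range(n)]

def psd(chrom, W):
    """Assumed PSD: sum of |W_ij - W_{s(i) s(j)}|; 0 means symmetric."""
    n = len(W)
    return sum(abs(W[i][j] - W[chrom[i] - 1][chrom[j] - 1])
               for i in range(n) for j in range(n))

def order_crossover(p1, p2, rng):
    """Copy a random slice from p1, fill the rest in the order of p2."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    rest = iter(g for g in p2 if g not in set(p1[a:b + 1]))
    return [g if g is not None else next(rest) for g in child]

def swap_mutation(chrom, rate, rng):
    """With probability `rate` per gene, swap it with a random gene."""
    chrom = chrom[:]
    for i in range(len(chrom)):
        if rng.random() < rate:
            j = rng.randrange(len(chrom))
            chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom

def detect_symmetries(fitness, n, pop_size=20, maxgen=50,
                      pc=0.7, pm=0.01, seed=0):
    rng = random.Random(seed)

    def new_individual():
        return rng.sample(range(1, n + 1), n)

    pop = [new_individual() for _ in range(pop_size)]       # step (1)
    found = set()
    for _ in range(maxgen):                                 # step (7)
        children = []
        while len(children) < pop_size:                     # step (3)
            p1 = min(rng.sample(pop, 2), key=fitness)
            p2 = min(rng.sample(pop, 2), key=fitness)
            c = order_crossover(p1, p2, rng) if rng.random() < pc else p1[:]
            children.append(swap_mutation(c, pm, rng))
        merged = pop + children                             # N parents + N children
        for k, ind in enumerate(merged):                    # steps (2), (4), (5)
            if fitness(ind) == 0:
                found.add(tuple(ind))
                merged[k] = new_individual()
        pop = sorted(merged, key=fitness)[:pop_size]        # step (6)
    return found

V = [(2, 3, 2), (2, 2, 3)]
W = out_product(V)
found = detect_symmetries(lambda c: psd(c, W), n=3)
print(sorted(found))
```

The fitness is recomputed on demand rather than cached, which keeps the sketch short; a practical implementation would store each individual's fitness after evaluation.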

Representation of the chromosomes.
Coding all possible solutions into different chromosomes is a key problem of genetic algorithms. In traditional coding the chromosome is often a binary string, but the binary coding scheme is not suitable for our problem.
A permutation operator can be written in the "two-row" form, for example,

$$\begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}.$$

When the top row is omitted, the permutation operator is written as $[i_1\ i_2\ \cdots\ i_n]$, where it is understood that the permutation operator maps 1 to $i_1$, 2 to $i_2$, and so forth. For example, the chromosome [3 4 1 2] stands for the permutation operator $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}$. This coding method is used widely in GAs for the TSP [7, 8, 14]. For an $n$-dimensional pattern set, we initialize the population by randomly placing the numbers 1 to $n$ into chromosomes of length $n$, guaranteeing that each number appears exactly once. Thus every chromosome stands for a legal permutation operator.
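The encoding can be illustrated in a few lines of Python. This is a minimal sketch; the helper names `random_chromosome`, `is_legal`, and `two_row` are hypothetical and chosen only for the example.

```python
import random

def random_chromosome(n, rng=random):
    """Place 1..n into a chromosome of length n, each number exactly once."""
    return rng.sample(range(1, n + 1), n)

def is_legal(chrom):
    """A chromosome is legal if it is a rearrangement of 1..n."""
    return sorted(chrom) == list(range(1, len(chrom) + 1))

def two_row(chrom):
    """Render the chromosome in two-row form (top row 1..n, bottom row images)."""
    top = " ".join(str(i) for i in range(1, len(chrom) + 1))
    bottom = " ".join(str(g) for g in chrom)
    return top + "\n" + bottom

print(two_row([3, 4, 1, 2]))
print(is_legal(random_chromosome(8)))  # True
```

Because every chromosome is a full rearrangement of 1 to $n$, crossover and mutation operators must preserve this property, which is why operators from the TSP literature are a natural fit.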
CHC selection is used to guarantee that the best individual always survives into the next generation [7]. In CHC selection, if the population size is N, we generate N children by using roulette wheel selection, combine the N parents with the N children, sort these 2N individuals according to their fitness values, and choose the best N individuals to propagate to the next generation. To prevent convergence to a local optimum, we keep the top 10% of the individuals and reinitialize the rest of the population randomly if the population has converged [14].
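The selection scheme described above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: since a smaller PSD is better, the roulette wheel weight is taken as 1/(1 + fitness); the population is considered converged when all individuals are identical; and the function names are hypothetical.

```python
import random

def roulette_pick(pop, fitness, rng):
    """Pick one parent; lower fitness (PSD) gets a larger wheel share.
    The weight 1 / (1 + fitness) is an assumed choice for minimization."""
    weights = [1.0 / (1.0 + fitness(ind)) for ind in pop]
    return rng.choices(pop, weights=weights, k=1)[0]

def chc_select(parents, children, fitness, n):
    """Merge parents and children, sort by fitness, keep the best n."""
    return sorted(parents + children, key=fitness)[:n]

def restart_if_converged(pop, fitness, new_individual, keep=0.1):
    """If all individuals are identical (assumed convergence test),
    keep the top `keep` fraction and reinitialize the rest randomly."""
    if len({tuple(p) for p in pop}) > 1:
        return pop
    k = max(1, int(keep * len(pop)))
    best = sorted(pop, key=fitness)[:k]
    return best + [new_individual() for _ in range(len(pop) - k)]

# Toy fitness for illustration only: distance of a chromosome from the identity.
def toy_fitness(chrom):
    return sum(abs(g - (i + 1)) for i, g in enumerate(chrom))

rng = random.Random(0)
parents = [[1, 2, 3], [2, 1, 3]]
children = [[3, 2, 1], [1, 3, 2]]
survivors = chc_select(parents, children, toy_fitness, n=2)
print(survivors[0])  # [1, 2, 3], the best individual always survives
```

Because survivors are drawn from the merged parent and child populations, the best individual found so far can never be lost between generations.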

Experimental results
In order to test the effectiveness of the genetic algorithm described above, we performed experiments on data with different dimensions and with different parameter values on a DELL computer (Optiplex G1, C400/128M SDRAM/Windows 2000 Server). The program is coded in Visual C++ 6.0. Here we present some experimental results (Tables 4.1-4.6) for pattern sets in which the number of symmetric permutation operators is known in advance: the test sets have $D_n$-symmetry, which has $2n$ symmetric permutation operators, where $n$ is the dimension of the pattern [4, 5]. To restrict length, we just list three test pattern sets (n = 8, 9, 10) as follows.
In Tables 4.1-4.6, "m" is the number of symmetric permutation operators found by the program, "time" is the corresponding search time in seconds, "u" is the population size, and "maxgen" is the maximum number of generations. Table 4.1 shows the results of exhaustive search for 6 different pattern sets. Tables 4.2-4.6 show the results of our approach with different parameter values for 5 different pattern sets, respectively. The crossover and mutation rates are 0.7 and 0.01, respectively, in all experiments.

Conclusions and discussions
From the simulation results shown in Tables 4.1-4.6, we can see the following.
(1) All of the symmetric permutation operators of a pattern set can be found by exhaustive search in theory, but the search is so time consuming that it becomes impossible in practice for a high-dimensional case. This is seen from Table 4.1. The program requires more than 12 times the search time when the dimension of the pattern set increases by one. The search time is about 1.2 days for n = 13. At this rate, the program would need at least 5.5 years to find all the symmetric permutation operators of a 16-dimensional pattern set, since 1.2 days × 12³ is roughly 2070 days. It is obvious that exhaustive search can only be used in low-dimensional cases in practice.
(2) Our approach is not superior to exhaustive search in low-dimensional cases, that is, n < 10. However, when the dimension of the test pattern set is bigger than 10, our approach is faster than exhaustive search. For example, for n = 12 in Table 4.4, the genetic algorithm finds 21 symmetric permutation operators in 30 seconds, whereas the exhaustive search needs 8379 seconds to find all 24 symmetric permutation operators (see Table 4.1). The most important advantage of our approach is that its search time increases more slowly than that of exhaustive search as the dimensionality of the pattern set increases. For example, when the dimension is n = 16 (see Table 4.6), the program finds the majority of the symmetric permutation operators in 169 seconds, which is less than 3 times the time needed for n = 14. Of course, our approach has a shortcoming: there is no criterion for knowing whether all of the symmetric permutation operators have been found, so the search results are often incomplete. However, the permutation symmetry of a pattern set is a group, so we can apply Properties 2.6 and 2.7 to obtain more or even all of the symmetric permutation operators.
(3) The total computational time needed is the product of the time of a single evaluation of the measure and the number of evaluations needed by the global search algorithm. There are many other global optimization algorithms that may be better than the traditional genetic algorithms described in this paper, for example, immune algorithms [12] and quantum algorithms [20]. This paper illustrates with the traditional genetic algorithm how global optimization algorithms can be applied to search for the permutation symmetry of a pattern set.
(4) Moreover, as to effectiveness, our approach is similar to the well-documented genetic algorithm for the TSP, because both problems use the same genetic operations and representation. The dimension of the pattern set corresponds to the number of cities in the TSP, which can be several thousand [16].
(5) There are many algorithms for solving optimization problems, for example, simulated annealing [19] and evolutionary strategies [15]. In recent years, more and more efficient algorithms have been presented, such as immune algorithms [12] and quantum algorithms [20]. Many of those algorithms have been reported to be more effective than standard genetic algorithms in many kinds of optimization problems. It is believed that such algorithms would be more effective than the standard genetic algorithm used in this paper for the permutation symmetry detection problem.
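The remark in item (2) about exploiting the group structure can be illustrated with a short sketch. It relies only on the fact that the composition of two symmetric permutation operators is again symmetric; it is not a reproduction of Properties 2.6 and 2.7, and the names `compose` and `group_closure` are hypothetical.

```python
def compose(s, t):
    """Composition of two permutations given as tuples of 1-based images:
    the result maps i to s(t(i))."""
    return tuple(s[k - 1] for k in t)

def group_closure(found):
    """Smallest set containing `found` and the identity that is closed
    under composition, i.e. the subgroup generated by `found`."""
    found = {tuple(p) for p in found}
    n = len(next(iter(found)))
    group = found | {tuple(range(1, n + 1))}
    frontier = list(group)
    while frontier:
        g = frontier.pop()
        for h in list(group):
            for p in (compose(g, h), compose(h, g)):
                if p not in group:
                    group.add(p)
                    frontier.append(p)
    return group

# Two operators of a 4-dimensional set with D_4 symmetry: a rotation
# and a reflection. Their closure has 2n = 8 elements.
partial = {(2, 3, 4, 1), (1, 4, 3, 2)}
print(len(group_closure(partial)))  # 8
```

In this way an incomplete search result can be extended without further fitness evaluations, and if the closure reaches the known group order $2n$ the result is complete.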
In this paper, we define a measure of permutation symmetry that transforms the symmetry detection problem into an optimization problem, which is the main contribution of this paper. We also show how genetic algorithms can be applied to detect the permutation symmetry of a given pattern set. This overcomes the computational complexity of searching for permutation operators and makes it possible to study high-dimensional systems with the symmetry tool, for example, in the design of artificial neural networks.

Table 4.1. Search results of 6 pattern sets of different dimensions by exhaustive search.

Table 4.2. Search results of 5 different parameter values for an 8-dimensional set by our approach.

Table 4.3. Search results of 5 different parameter values for a 10-dimensional set by our approach.

Table 4.4. Search results of 5 different parameter values for a 12-dimensional set by our approach.

Table 4.5. Search results of 5 different parameter values for a 14-dimensional set by our approach.

Table 4.6. Search results of 5 different parameter values for a 16-dimensional set by our approach.