In 2012, Moraglio and coauthors introduced new genetic operators for Genetic Programming, called geometric semantic genetic operators. They have the very interesting advantage of inducing a unimodal error surface for any supervised learning problem. At the same time, they have the important drawback of generating very large data models that are usually very hard to understand and interpret. The objective of this work is to alleviate this drawback while preserving the advantage. More specifically, we propose an elitist version of geometric semantic operators, in which offspring are accepted into the new population only if they have better fitness than their parents. We present experimental evidence, on five complex real-life test problems, that this simple idea yields results of comparable quality (in terms of fitness) to those of the standard geometric semantic operators, but with much smaller data models. In the final part of the paper, we also explain why we consider this a significant improvement, showing that the proposed elitist operators generate manageable models, while the models generated by the standard operators are so large that they can be considered unmanageable.
In the original definition of Genetic Programming (GP) [
The paper is organized as follows: Section
Even though the term semantics can have several different interpretations, it is a common trend in the GP community (and this is what we do also here) to identify the semantics of a solution with the vector of its output values on the training data [
As Moraglio et al. point out, these operators create much larger offspring than their parents and the fast growth of the individuals in the population rapidly makes fitness evaluation unbearably slow, making the system unusable. In [
Geometric semantic operators have a known limitation [
Even though in [
If we iterate this reasoning, performing the crossover between two individuals belonging to the population at generation 2, the offspring has the following shape:
Moraglio and coauthors have an interesting discussion about code growth in [
To partially counteract this problem we propose the following replacement method: Considering two parents Considering an individual
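The replacement rule can be illustrated with a minimal sketch that works directly on semantics vectors (the outputs of an individual on the training cases). Several points here are assumptions rather than statements from the text: the operators follow the standard definitions of Moraglio and coauthors (crossover as a case-wise convex combination of the parents, mutation as a perturbation scaled by the mutation step), fitness is the root mean square error, an offspring of crossover must beat the better of its two parents, and a rejected offspring is replaced by a copy of the (better) parent. The function names `rmse`, `elitist_crossover`, and `elitist_mutation` are hypothetical.

```python
import math
from typing import List, Sequence

def rmse(outputs: Sequence[float], targets: Sequence[float]) -> float:
    """Root mean square error between outputs and targets (lower is better)."""
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets))

def elitist_crossover(p1: List[float], p2: List[float], r: Sequence[float],
                      targets: Sequence[float]) -> List[float]:
    """Geometric semantic crossover on semantics, with elitist acceptance.

    r holds the outputs of a random tree with values in [0, 1].
    Assumption: a rejected offspring is replaced by the better parent.
    """
    child = [ri * a + (1.0 - ri) * b for ri, a, b in zip(r, p1, p2)]
    best_parent = min((p1, p2), key=lambda s: rmse(s, targets))
    if rmse(child, targets) < rmse(best_parent, targets):
        return child
    return best_parent

def elitist_mutation(p: List[float], r1: Sequence[float], r2: Sequence[float],
                     ms: float, targets: Sequence[float]) -> List[float]:
    """Geometric semantic mutation on semantics, with elitist acceptance.

    r1 and r2 hold the outputs of two random trees; ms is the mutation step.
    Assumption: a rejected offspring is replaced by its parent.
    """
    child = [a + ms * (x - y) for a, x, y in zip(p, r1, r2)]
    return child if rmse(child, targets) < rmse(p, targets) else p
```

Under these assumptions, the size of an individual grows only when an operator actually improves fitness; every rejected event leaves a parent-sized individual in the new population.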
Although the idea is simple, the proposed method can only be useful if the geometric semantic genetic operators frequently produce offspring whose fitness is not better than that of their parents. Hence, before testing the proposed method, it makes sense to measure experimentally how many crossover and mutation events produce an offspring with better fitness than the parents. To do so, we considered five real-life applications: three applications in the field of drug discovery that are becoming widely used benchmarks for GP [
For this experimental study, we use the same experimental settings considered in [
Results of this analysis are reported in Figure
Success rate of the crossover and mutation events. For each one of the five considered problems (from top to the bottom: %F, PPB, LD50, concrete, and Parkinson), three different mutation steps have been considered. From left to right,
It is worth pointing out that this last situation is ideal for testing the effectiveness of the proposed growth control method. In fact, the proposed technique will be applied in
In this section, we analyze the training and test performance of GSGP with the proposed elitist replacement technique, comparing it to standard GSGP. The results of this study are reported in Section
The plots shown in this section report as fitness the root mean square error between target and obtained values on training and test data. All the results have been obtained by considering
Figure
Training (plots a, b, and c) and test (plots d, e, and f) fitness for the %F problem, considering different mutation step values.
The same observations can be drawn considering the PPB dataset (results reported in Figure
Training (plots a, b, and c) and test (plots d, e, and f) fitness for the PPB problem considering different mutation step values.
For the third considered problem, that is, the LD50 dataset (results reported in Figure
Training (plots a, b, and c) and test (plots d, e, and f) fitness for the LD50 problem considering different mutation step values.
For the concrete dataset (results shown in Figure
Training (plots a, b, and c) and test (plots d, e, and f) fitness for the concrete problem considering different mutation step values.
Figure
Training (plots a, b, and c) and test (plots d, e, and f) fitness for the Parkinson problem considering different mutation step values.
To summarize all of these results, we point out that, for all the studied applications, the two GP systems perform differently (in some cases GSGP outperforms the elitist method and in other cases vice versa) when small values (i.e.,
The differences between GSGP and the proposed elitist method, observed when mutation steps equal to
Example of crossover that generates an offspring whose fitness is not better than the fitness of both the parents.
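A small numeric illustration of such a crossover event is sketched below. It assumes the standard geometric semantic crossover with a constant random coefficient (here 0.5) and root mean square error as fitness; the two-case target vector and the parent semantics are invented purely for illustration. With a constant coefficient the offspring lies on the segment joining the two parents in semantic space, so it cannot be worse than the worse parent, but nothing prevents it from being worse than the better one.

```python
import math

def rmse(outputs, targets):
    """Root mean square error between outputs and targets (lower is better)."""
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets))

# Hypothetical two-case training set and parent semantics.
targets = [0.0, 0.0]
parent1 = [1.0, 0.0]   # close to the target
parent2 = [0.0, 3.0]   # far from the target

# Geometric semantic crossover with a constant coefficient r = 0.5:
# offspring = r * parent1 + (1 - r) * parent2, case by case.
r = 0.5
offspring = [r * a + (1.0 - r) * b for a, b in zip(parent1, parent2)]

print(offspring)                          # [0.5, 1.5]
print(round(rmse(parent1, targets), 3))   # 0.707
print(round(rmse(offspring, targets), 3)) # 1.118
print(round(rmse(parent2, targets), 3))   # 2.121
```

The offspring improves on the second parent but not on the first, so an elitist scheme that requires an improvement over both parents would discard it.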
While it is computationally expensive to calculate the exact size of each tree in the population, it is possible to perform a theoretical study that, with a good approximation, allows us to gain some information about the average size of the individuals at a certain generation. Let us consider again the definition of geometric semantic crossover and mutation. In particular let us consider the structure of the individuals that are created by the semantic operators. Starting from
Individuals generated by the geometric semantic crossover (a) and by the geometric semantic mutation (b).
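The growth implied by these two structures can be approximated with a short calculation. The sketch below is not necessarily the derivation used in this paper. It assumes the standard tree shapes (T1 · TR) + ((1 − TR) · T2) for crossover and T + ms · (TR1 − TR2) for mutation, counts nodes, and treats all individuals in a generation as having the same average size. The acceptance probability `p_accept` and the function names are hypothetical; `p_accept = 1.0` corresponds to standard GSGP, while smaller values mimic an elitist scheme in which a rejected offspring is replaced by a parent-sized individual.

```python
def crossover_size(s1: float, s2: float, r: float) -> float:
    """Nodes in (T1 * TR) + ((1 - TR) * T2): both parents, TR twice,
    plus five extra nodes (+, *, *, -, and the constant 1)."""
    return s1 + s2 + 2 * r + 5

def mutation_size(s: float, r1: float, r2: float) -> float:
    """Nodes in T + ms * (TR1 - TR2): the parent, two random trees,
    plus four extra nodes (+, *, -, and the constant ms)."""
    return s + r1 + r2 + 4

def expected_size(generations: int, s0: float, r: float,
                  p_accept: float, operator: str) -> float:
    """Mean-field estimate of the average individual size.

    s0: initial tree size; r: size of each random tree;
    p_accept: assumed probability that an offspring is accepted.
    """
    s = float(s0)
    for _ in range(generations):
        if operator == "crossover":
            grown = crossover_size(s, s, r)
        elif operator == "mutation":
            grown = mutation_size(s, r, r)
        else:
            raise ValueError(f"unknown operator: {operator}")
        # A rejected offspring leaves an individual of the current size.
        s = p_accept * grown + (1.0 - p_accept) * s
    return s

if __name__ == "__main__":
    for p in (1.0, 0.1):
        print(p, expected_size(10, 3, 3, p, "crossover"),
              expected_size(10, 3, 3, p, "mutation"))
```

In this approximation, crossover multiplies the average size by a factor of 1 + `p_accept` per generation and mutation adds a constant amount scaled by `p_accept`, which is why a low acceptance rate slows growth considerably.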
In order to give experimental corroboration to this finding, in Figure
Size of the best model after 1000 generations. Median calculated over 30 runs.
| Problem | Size (GSGP) | Size (Elitist GSGP) |
|---|---|---|
| %F | | |
| PPB | | |
| LD50 | | |
| Concrete | | |
| Parkinson | | |
Evolution of the median size of the best individuals in the population calculated over
Having established, both theoretically and experimentally, that the proposed elitist method maintains populations of smaller individuals than GSGP, it is interesting to discuss whether the size of those individuals is “usable” or whether, as typically happens for GSGP, the individuals are still too big to be managed. To answer this question, we use the following argument: in his first book on GP, Koza established a fixed tree depth limit of the individuals in the population equal to
In conclusion, the proposed elitist method outperforms standard GP and obtains results that are qualitatively comparable to the ones of GSGP, but, contrary to what happens for GSGP, it is able to maintain populations composed of individuals of manageable size.
Recent work in Genetic Programming (GP) has been dedicated to the definition of methods based on the semantics of the solutions. Among the existing semantic-based methods, one of the most recent relies on particular genetic operators, called geometric semantic genetic operators, that have precise consequences on the semantics of the individuals. This GP variant, known as Geometric Semantic GP (GSGP), has shown very interesting results for a vast set of complex real-life applications in several domains, consistently outperforming standard GP on all of them. Nevertheless, an important problem affects GSGP: the geometric semantic operators, by construction, generate individuals that are larger than their parents, rapidly leading to unmanageable populations unless a very specific implementation is used. Moreover, the very large size of the individuals makes it difficult to read and understand the final solution, practically turning GP into a black box system. To limit this important drawback of GSGP, in this paper we proposed a method (called the elitist system) in which a newly created individual is accepted as a member of the new population only if its fitness is better than the fitness of its parents. A preliminary experimental analysis, presented in the first part of the paper, has shown that in GSGP many applications of the genetic operators, crossover in particular, do not produce offspring that are better than their parents. This fact encouraged us to pursue the research and implement the proposed elitist system. The experimental results presented in the central part of the paper have shown that the proposed elitist system produces individuals of comparable quality to the ones obtained with standard GSGP. The final part of the paper was dedicated to a comparison between the size of the individuals maintained in the population by the proposed elitist method and the size of those maintained by GSGP.
This study, besides confirming, as it was expected, that the proposed elitist method creates smaller individuals, has also indicated that the individuals evolved by the elitist method maintain a manageable size at least until generation
Considerable future work is planned on this research track, with the final objective of defining a GSGP system that maintains the geometric properties of the current one, but in which individuals do not steadily grow during the evolution. On the one hand, it is important to further test the elitist method proposed in this paper on several other applications, possibly in conjunction with other improvements; on the other hand, simplification methods aimed at maintaining optimized expressions in the population deserve investigation. Extending the achievements obtained so far by GSGP on symbolic regression to other kinds of application is also a priority. Geometric semantic genetic operators for different application domains, like Boolean problems and classification, were already defined in the original work of Moraglio and coauthors and have later been further refined by the same authors. We are currently working toward the definition of geometric semantic operators for applications in the field of pattern reconstruction, like the artificial ant on the Santa Fe trail. Preliminary experimental results seem to indicate that the proposed elitist method may be particularly useful for these new operators. Last but not least, the most ambitious task of this research track remains the definition of new geometric semantic operators that, while maintaining the geometric properties of the current ones, do not create individuals that are larger than their parents. An important first step has already been taken by Moraglio and collaborators, with the definition of such operators for the particular domain of basis functions. An extension of this result to functions of any possible shape is one of the main objectives of our current research.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was supported by national funds through FCT under Contract MassGP (PTDC/EEI-CTP/2975/2012), Portugal.