The sub-field of genetic programming is no exception to this pattern and this development can be seen in the "Advances in Genetic Programming" series. Thus, as the series has gone on, one can see an increasing concern with the implementational details necessary to make GP-based algorithms work on real problems as well as more effort in the analysis, understanding and formalisation of the processes that result from the algorithms in action.
I now consider each volume in turn, summarising the content and highlighting chapters that I found particularly important or interesting. Each of these volumes includes an introduction and some indication of other sources on GP. This is helpful to have, but lists of sources inevitably date quickly and the reader would be better advised to consult one of the websites on GP (e.g. http://www.genetic-programming.org/#anchor234889) or William Langdon's bibliography (http://liinwww.ira.uka.de/bibliography/Ai/genetic.programming.html).
To complement this strand there are also a fair number of articles which discuss and extend the basic paradigm of GP. These include two (Chapter 11 and Chapter 20) which discuss the co-evolution of populations in different ways.
The suggestion here is that the population is divided into two: the "civil servants", the fittest programs at solving the target problem, and the "pygmies", the remainder, which are allocated fitness using a function that also includes an element of parsimony (hence the name). The new population is generated solely by crossing civil servants with pygmies. This seems to maintain almost maximal variety while at the same time being very efficient with regard to storage space. The author states that this algorithm was born out of "...a reaction to the over control of genetic algorithms" (p. 261).
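As a concrete illustration, here is a much-simplified Python sketch of this breeding scheme. The representation (programs as token lists), the toy fitness function and all parameter choices are my own inventions for illustration, not the chapter's:

```python
import random

random.seed(0)
TARGET = 42

def raw_fitness(prog):
    # toy stand-in for "solving the target problem":
    # a program's output is just the sum of its tokens
    return -abs(sum(prog) - TARGET)

def pygmy_fitness(prog):
    # pygmies are scored with an added parsimony term (hence the name)
    return raw_fitness(prog) - len(prog)

def tournament(group, fit, k=3):
    return max(random.sample(group, k), key=fit)

def crossover(cs, pyg):
    # simple one-point crossover between a civil servant and a pygmy
    i = random.randrange(1, len(cs)) if len(cs) > 1 else 0
    j = random.randrange(1, len(pyg)) if len(pyg) > 1 else 0
    return cs[:i] + pyg[j:]

def step(population):
    ranked = sorted(population, key=raw_fitness, reverse=True)
    half = len(ranked) // 2
    civil_servants, pygmies = ranked[:half], ranked[half:]
    # every child has one parent from each caste
    return [crossover(tournament(civil_servants, raw_fitness),
                      tournament(pygmies, pygmy_fitness))
            for _ in population]

population = [[random.randint(0, 9) for _ in range(random.randint(3, 12))]
              for _ in range(40)]
for _ in range(30):
    population = step(population)
best = max(population, key=raw_fitness)
```

The essentials of the scheme survive the simplification: pygmies are rewarded for parsimony, and the two castes only ever breed with each other.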
In the other of these chapters, two populations co-evolve competitively with each other. One population tries to generate sequences that the other population cannot predict. The result is an algorithm for continually improving the generation of pseudo-random sequences: not only are the sequences evolved to be difficult to predict, but so are the algorithms that generate them. This is important not only as a technique but also because it echoes wider investigations into such competitive co-evolutionary dynamics.

Two of the chapters look at the possibility of splitting the population into localities. Chapter 7 discusses the use of demes for a particular problem; the other is Chapter 8.
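The competitive set-up of the pseudo-random sequence chapter can be sketched as follows. Representing generators as tiny linear congruential generators and predictors as one-step lookup tables is my own drastic simplification, not the chapter's representation:

```python
import random

random.seed(1)

def lcg_bits(a, c, n=64, seed=1, m=256):
    # a generator individual: parameters of a tiny linear congruential
    # generator whose low bits form the output sequence
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x & 1)
    return out

def accuracy(table, bits):
    # a predictor individual: a lookup table from the previous bit to a
    # guess at the next one; score is its hit rate on the sequence
    hits = sum(table[bits[i - 1]] == bits[i] for i in range(1, len(bits)))
    return hits / (len(bits) - 1)

generators = [(random.randrange(256), random.randrange(256)) for _ in range(10)]
predictors = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(10)]

def gen_fitness(g):
    # a generator is fit when even its best opponent predicts it poorly
    return 1.0 - max(accuracy(p, lcg_bits(*g)) for p in predictors)

def pred_fitness(p):
    # a predictor is fit when it predicts the generators well on average
    return sum(accuracy(p, lcg_bits(*g)) for g in generators) / len(generators)

best_gen = max(generators, key=gen_fitness)
```

Each population's fitness is defined by the other's failure or success, which is what drives the competitive arms race.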
Chapter 8 introduces an element of locality into the population. Instead of there being a fixed number of "demes", the demes are allowed to emerge. Different locations tend to specialise in different kinds of solution, and the more successful of these then diffuse out among the rest of the population. The whole chapter marks a shift towards viewing problem solving in terms of the population as a whole rather than just its best individual. This is one of the few chapters that explicitly imports lessons learnt from evolutionary biology.

Two papers address structures other than trees: one based on finite automata (Chapter 9) and one on neural networks (Chapter 24).
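Returning to the locality theme of Chapters 7 and 8: a spatially local breeding scheme, in which good solutions diffuse outwards through neighbourhoods, can be sketched as follows. The grid size, blend "crossover" and toy fitness are my own illustrative assumptions:

```python
import random

random.seed(4)
SIZE = 8  # toy 8x8 grid; one individual per cell (here just a real number)

def fitness(v):
    return -abs(v - 10.0)  # toy target value

def neighbours(x, y):
    # von Neumann neighbourhood with wraparound
    return [((x + dx) % SIZE, (y + dy) % SIZE)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]

grid = {(x, y): random.uniform(0, 20) for x in range(SIZE) for y in range(SIZE)}
initial = dict(grid)

def step(grid):
    new = {}
    for (x, y), resident in grid.items():
        mate = grid[random.choice(neighbours(x, y))]
        # local "crossover" (a blend) plus a little mutation
        child = (resident + mate) / 2 + random.gauss(0, 0.1)
        # the child replaces the resident only if it is at least as fit,
        # so improvements spread gradually from cell to cell
        new[(x, y)] = child if fitness(child) >= fitness(resident) else resident
    return new

for _ in range(20):
    grid = step(grid)
```

Because breeding is restricted to neighbours, different regions can specialise before the better variants diffuse across the grid, which is the effect the chapter exploits.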
One of the delights of these three volumes is following Teller's research as it progresses. It goes in a different direction from most other research in GP, being based on a network of nodes representing finite automata which have access to an indexed memory. This allows the population to evolve its own kinds of data structure.
Gruau proposes a method called "Cellular Encoding" which encodes a graph as a formal grammar. This is used to enable the evolution of general graphs (including those of neural networks). It opens up the use of a developmental stage between the genotype, upon which the operators of variation act, and the phenotype, the network which is applied to the problem. Gruau's cellular encoding is now used by Koza's group in their evolution of electronic circuits (Volume 3, Chapter 6).

Other topics covered in this volume include: the application of GP to lower-level languages (Chapter 14), the use of noise during evaluation to promote the evolution of robust solutions (Chapter 10) and some applications (Chapters 16, 17, 19, 21, 22 and 23). Being the first in this series it also comes with a reasonable introduction to the field. Altogether this volume is the most exciting to read, with a heady mix of new ideas and paradigms. On the minus side it is short on practical advice.
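To give a flavour of Gruau's developmental idea, of growing a phenotype graph from a genotype tree, here is a tiny sketch using just two made-up division instructions, serial and parallel. This series-parallel fragment is my own devising and is far simpler than Gruau's actual instruction set:

```python
import itertools

_fresh = itertools.count()

def develop(instr):
    # Develop a genotype (a tree of division instructions) into a phenotype
    # graph, returned as (edges, input_ports, output_ports).
    #   "E"         -> stop dividing: a single node
    #   ("S", l, r) -> serial division: l's outputs feed r's inputs
    #   ("P", l, r) -> parallel division: l and r share inputs and outputs
    if instr == "E":
        node = "n%d" % next(_fresh)
        return [], [node], [node]
    op, left, right = instr
    e1, in1, out1 = develop(left)
    e2, in2, out2 = develop(right)
    if op == "S":
        bridge = [(a, b) for a in out1 for b in in2]
        return e1 + e2 + bridge, in1, out2
    if op == "P":
        return e1 + e2, in1 + in2, out1 + out2
    raise ValueError("unknown instruction: %r" % (op,))

# two units in parallel, both feeding a third
genotype = ("S", ("P", "E", "E"), "E")
edges, inputs, outputs = develop(genotype)
```

The point of the arrangement is that the variation operators act on the instruction tree (the genotype), while fitness is measured on the graph it develops into (the phenotype).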
This chapter is one of the few in any of the three volumes that displays an interest in when GP can be applied successfully. It compares a variety of techniques taking elements from GP, hill climbing and simulated annealing across five different problems. The authors say "A general observation... is that none of them is always superior to the rest" (p. 40). However they quickly return to trying to improve the algorithms generically, suggesting that simulated annealing based techniques can be improved with a GP-style crossover operator and GP techniques improved by adding hill climbing.
In the second volume there are several chapters (3, 4, 5, 18 and 19) about making genetic operators (particularly crossover) more intelligent. In a sense this is a strange thing to do; half the point of evolutionary techniques is their robustness across a wide range of problems, which seems to come from the fact that a blind but efficient search using an adequate evolutionary process will find some acceptable solutions. Any move to make the technique better at a particular set of problems is likely to increase its brittleness. GP excels in particular because it maintains the variation in its population to a considerable extent whilst it evolves its solutions, since the traditional GP crossover operation preserves many of the subtrees. Yet in none of these chapters is there any mention of the effects on population variation, so one has little indication of whether their suggestions would be helpful in general. The main helpful outcome of these chapters is thus what their experiments reveal about the underlying evolutionary dynamics. Of these, by far the most interesting is Chapter 3.
Teller here has a clear rationale for introducing the co-evolution of his operators alongside the base population. His algorithms are not based upon trees but rather networks representing automata acting upon an indexed memory. For such networks there is no obvious analogue of crossover, and in particular there is always the danger that the action of an operator will produce a non-halting program. One interesting result of this paper is that it did not evolve a set of generic operators that would work across different problems, but rather that different sets of operators were best for different problems. This, combined with the fact that they performed better than human-designed operators, goes quite some way to justify their inclusion in his architecture.

A different line of enquiry follows that of Montana (1995). GP succeeded where earlier attempts to evolve programs failed because the search space is restricted to programs that can run; whatever happens, you won't get a syntax error. Koza had made sure that his functions would always give some result by restricting the grammar of the programs in his search space to those with only one type. Montana introduced strongly-typed genetic programming and thus further restricted the space of programs to be searched. In general, the more specific the syntax, the more restricted the search space. The downside of this technique is the increased computation required for operations like crossover: the complexity introduced by the grammar means you cannot just swap any pair of subtrees.
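A minimal sketch of the strongly-typed idea: crossover is only permitted between subtrees with the same return type, so every child is well-typed by construction. The tree representation and type names here are my own illustrative choices, not Montana's implementation:

```python
import random

random.seed(2)

# a program tree node is (return_type, label, [children])
def subtrees(tree, path=()):
    yield path, tree
    for i, child in enumerate(tree[2]):
        yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    if not path:
        return new
    rtype, label, kids = tree
    i = path[0]
    kids = kids[:i] + [replace(kids[i], path[1:], new)] + kids[i + 1:]
    return (rtype, label, kids)

def typed_crossover(a, b):
    # only subtrees with the same return type may be exchanged
    points_a = list(subtrees(a))
    random.shuffle(points_a)
    for path_a, sub_a in points_a:
        matches = [(p, s) for p, s in subtrees(b) if s[0] == sub_a[0]]
        if matches:
            path_b, sub_b = random.choice(matches)
            return replace(a, path_a, sub_b), replace(b, path_b, sub_a)
    return a, b  # no type-compatible pair of subtrees found

# (x + 1) > y          and          flag and (y < 2)
p1 = ("bool", ">", [("num", "+", [("num", "x", []), ("num", "1", [])]),
                    ("num", "y", [])])
p2 = ("bool", "and", [("bool", "flag", []),
                      ("bool", "<", [("num", "y", []), ("num", "2", [])])])
c1, c2 = typed_crossover(p1, p2)
```

The extra cost the text mentions is visible here: each crossover must enumerate and filter candidate subtrees by type rather than simply picking any pair.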
Two of the chapters in this volume look at such syntactic restrictions on the grammar of programs. One of these chapters (Chapter 18) merely extends Montana's study to more than two types. The other is Chapter 19.
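To make the idea of syntactic restriction concrete, here is a sketch of generating programs from a context-free grammar, so that every individual in the population is syntactically valid by construction. The toy grammar and the depth-limiting rule are my own illustrative assumptions:

```python
import random

random.seed(3)

# a context-free grammar restricting the search space: every program the
# system can ever build is a sentence of this grammar
GRAMMAR = {
    "<expr>":  [["<expr>", "<op>", "<expr>"], ["<var>"], ["<const>"]],
    "<op>":    [["+"], ["-"], ["*"]],
    "<var>":   [["x"], ["y"]],
    "<const>": [["1"], ["2"], ["3"]],
}

def generate(symbol="<expr>", depth=0, max_depth=4):
    # grow a random sentence; past the depth limit avoid self-recursive
    # productions so that generation terminates
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit as-is
    choices = GRAMMAR[symbol]
    if depth >= max_depth:
        non_recursive = [c for c in choices if symbol not in c]
        choices = non_recursive or choices
    out = []
    for s in random.choice(choices):
        out.extend(generate(s, depth + 1, max_depth))
    return out

program = generate()
expression = " ".join(program)
```

Restricting the grammar this way is also where domain knowledge can enter: the tighter the grammar, the smaller the space the evolutionary search has to cover.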
Chapter 19 considers the more general case of an arbitrary syntactic constraint, specified by a context-free grammar. The author persuasively argues that such specification is far more than a hack and should normally be used as part of the GP set-up. He supports this by exhibiting an efficient and general way to achieve it. This points up an important and under-investigated area of GP: the relation between the formal language which specifies the search space and the characteristics of the problem. This seems to me a natural way to introduce domain-specific knowledge into the search process.

There are three chapters (7, 8 and 9) that follow the trend in the first volume by suggesting ways to effectively develop subroutines in GP. The most interesting of these is Chapter 9.
In Chapter 9 a mechanism is suggested to dynamically learn the appropriate subroutines for a problem. It discovers new subroutines through generalisation over the population of programs (as well as deleting others). One of the important aspects of this technique is that it is specifically triggered by a loss of variety in the population, as represented by an entropy measure. This can act to prevent 'premature' (i.e. wrong) convergence.

Other areas of development in this volume include techniques to evolve recursive programs, efficiently representing the population using directed acyclic graphs, exploiting the implicit parallelism in GP, evolving the efficient use of data structures, and dynamically adapting the fitness function using minimum description length statistics. Also in this volume are four applications, including extracting features from images of the Arctic ice and discovering appropriate abstractions of historical data to optimise decision tree classification.
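The entropy trigger used for subroutine discovery is simple to sketch. The threshold value and the representation of individuals by their behaviour are my own illustrative assumptions:

```python
import math
from collections import Counter

def population_entropy(population):
    # Shannon entropy over distinct behaviours in the population:
    # high entropy means plenty of variety, low entropy means convergence
    n = len(population)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(population).values())

ENTROPY_TRIGGER = 1.0  # hypothetical threshold

def should_restructure(population):
    # fire the subroutine discovery/deletion step only when variety
    # has collapsed, guarding against premature convergence
    return population_entropy(population) < ENTROPY_TRIGGER

diverse = ["prog%d" % i for i in range(8)]   # eight distinct behaviours
converged = ["prog0"] * 7 + ["prog1"]        # nearly uniform population
```

An entropy measure like this is cheap to maintain and, unlike best-of-run fitness, it is a property of the population as a whole.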
This chapter discusses the details of using GP to design simple electrical circuits with given characteristics. What is most interesting, however, is the last subsection (6.8 on page 131), which looks at the patentability of inventions discovered using GP, and concludes that they are patentable!
Ryan and Ivan show how a set of transformations to change serial into parallel code can be induced. Since each transformation can be designed so that it does not change the effect of the code, any complex transformation composed of them will not change it either. The GP algorithm is then free to discover an efficient parallel equivalent to the original code, without the subsequent problem of having to show it has the same effect as the original. If this scales up, it could be employed in compilers to produce code suitable for parallel machines without much extra effort from the programmer.

Two chapters discuss using GP to develop programs to control multiple agents/robots for co-operative action. This is a particularly apt application for GP due to the difficulty of programming anything but a small number of agents to co-operate to achieve a goal. Chapter 18 discusses the effect of splitting the fitness evaluation to correspond to sub-goals. The other is Chapter 19.
Here GP techniques of heterogeneous breeding are used to evolve programs for small groups of agents. Iba compares this technique with a form of reinforcement learning called Q-learning and shows it is superior to Q-learning on his problems. The chapter is significant because he demonstrates that rudimentary communication and negotiation can be effectively developed in this way. This is in sharp contrast to the effort required in designing the communication protocols, reasoning and planning algorithms that are usually applied to achieve these abilities.

A growing sub-theme, reflected in some of these chapters, is the slow progress towards understanding the processes of GP through analysis and theory.
This chapter proposes several hypotheses about the causes of bloat in GP populations and then tests them experimentally. The authors propose a model of code growth as a random diffusive process towards the part of the search space where most of the programs are to be found. The results of their experiments back this model up and lead them to suggest mechanisms to help limit unnecessary growth. This chapter helps the reader gain an insight into the dynamics of the whole population, not merely those of the best individual. I am a little worried about the scope of their analysis, because it seems to be based on the assumption that the proportion of programs of a given fitness is more-or-less independent of program size. While there is evidence that this is true over most depths, it ignores the criticality of the minimum depth at which a candidate solution appears.
This proposes a model of top-down schema development, in contrast to the established bottom-up "building block hypothesis". The authors analyse the development of solutions in terms of rooted-tree schemata (that is, tree structures in which certain nodes can have any subtree grafted on). This gives some insights into the phenomenon of bloat and how to control it.
Before all else, it must be said that the field has been spectacularly successful. GP is the first technique that allows the practical induction of structure by computer, and because of this will come to dominate the field of evolutionary computation in utility and scope. It marks the transition from a technology where structure (except for maybe some parameters) has to be specified by hand to one where the design process is partially automated. Indeed, the discussion of the patentability of GP-discovered designs at the end of Chapter 6 in volume 3 presages a time in the near future when invention will routinely be done by machine/person teams. This transition will be associated first with Koza's books and then these three, which is why some of the papers in these volumes will set trends and issues in GP that will be followed for many years to come.
On the minus side, there is a tendency in many of the papers in these volumes to look for, or claim, techniques that are more efficient across the board. There is little acknowledgement that any technique will have its own conditions of application; that is, there will inevitably be types of problem where one technique excels and others do less well, including the class of GP-based techniques itself (Wolpert and Macready, 1997). Of all the papers in these three volumes, only a few offer explicit guidance as to the sorts of problem domain for which their technique would be unsuitable. This makes it difficult for an outsider to use the knowledge in these volumes efficiently, for it will not be clear which techniques they should try.
This caveat aside, these volumes form the kernel of any good reference section on GP. The papers are presented in a clear way with surveys and indications of other sources providing an adequate context. Once a researcher has understood and used the basic GP techniques, consulting this series is the next sensible step.
Kuhn, T. S. (1962). The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL, USA.
Montana, D. J. (1995). Strongly Typed Genetic Programming. Evolutionary Computation, 3:199-230.
Wolpert, D. H. and Macready, W. G. (1997). No Free Lunch Theorems for Optimization, IEEE Transactions on Evolutionary Computation, 1:67-82.