Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - MultiPage
---

FAQ: comp.ai.genetic part 6/6 (A Guide to Frequently Asked Questions)

---
From: David.Beasley@cs.cf.ac.uk (David Beasley)
Newsgroups: comp.ai.genetic,comp.answers,news.answers
Subject: FAQ: comp.ai.genetic part 6/6 (A Guide to Frequently Asked Questions)
Supersedes: <part6_866739275@cs.cf.ac.uk>
Followup-To: comp.ai.genetic
Date: 8 Oct 1997 11:39:08 GMT
Organization: Posted through the Joint Cardiff Computing Service, Wales, UK
Expires: 10 Jan 1998 11:39:01 GMT
Message-ID: <part6_876310741@cs.cf.ac.uk>
References: <part5_876310741@cs.cf.ac.uk>
Summary: This is part 6 of a <trilogy> entitled "The Hitch-Hiker's Guide
     to Evolutionary Computation". A periodically published list of Frequently
     Asked Questions (and their answers) about Evolutionary Algorithms,
     Life and Everything. It should be read by anyone who whishes to post
     to the comp.ai.genetic newsgroup, preferably *before* posting.

Archive-name:   ai-faq/genetic/part6
Last-Modified:  10/8/97
Issue:          5.3

TABLE OF CONTENTS OF PART 6
     Q21: What are Gray codes, and why are they used?

     Q22: What test data is available?

     Q42: What is Life all about?
     Q42b: Is there a FAQ to this group?

     Q98: Are there any patents on EAs?

     Q99: A Glossary on EAs?

----------------------------------------------------------------------

Subject: Q21: What are Gray codes, and why are they used?

     The correct spelling is "Gray"---not  "gray",  "Grey",  or  "grey"---
     since Gray codes are named after the Frank Gray  who  patented  their
     use for shaft encoders in 1953  [1].   Gray  codes  actually  have  a
     longer history, and the inquisitive reader may want to  look  up  the
     August, 1972,  issue  of  Scientific  American,  which  contains  two
     articles of interest: one on the origin  of  binary  codes  [2],  and
     another by Martin  Gardner  on  some  entertaining  aspects  of  Gray
     codes [3].  Other references containing descriptions  of  Gray  codes
     and more modern, non-GA, applications include the second  edition  of
     Numerical  Recipes  [4],  Horowitz  and  Hill  [5],  Kozen  [6],  and
     Reingold [7].

     A Gray code represents  each  number  in  the  sequence  of  integers
     {0...2^N-1} as a binary string of length N  in  an  order  such  that
     adjacent integers have Gray code representations that differ in  only
     one bit position.  Marching through the  integer  sequence  therefore
     requires flipping just one bit at a time.  Some  call  this  defining
     property of Gray codes the "adjacency property" [8].

     Example (N=3):  The binary coding of {0...7} is {000, 001, 010,  011,
     100, 101, 110, 111}, while one Gray coding is {000,  001,  011,  010,
     110, 111, 101, 100}.  In essence, a Gray code takes a binary sequence
     and shuffles  it  to  form  some  new  sequence  with  the  adjacency
     property.  There exist,  therefore,   multiple   Gray   codings   for
     any  given  N.   The example shown here belongs to a  class  of  Gray
     codes that goes by the  fancy  name  "binary-reflected  Gray  codes".
     These  are  the  most  commonly  seen  Gray  codes,  and  one  simple
     scheme  for generationg such a Gray code sequence says,  "start  with
     all  bits zero and successively flip the right-most bit that produces
     a new string."

     Hollstien [9] investigated the use of GAs for optimizing functions of
     two variables and claimed that  a  Gray  code  representation  worked
     slightly better than the binary representation.  He attributed   this
     difference to the adjacency property of Gray codes.   Notice  in  the
     above example that the step from three to four requires the  flipping
     of all the bits in the binary representation.  In  general,  adjacent
     integers in the binary representaion often lie many bit flips  apart.
     This  fact makes it less likely that a MUTATION operator  can  effect
     small changes for a binary-coded INDIVIDUAL.

     A Gray code representation seems to  improve  a  mutation  operator's
     chances of making incremental improvements, and a  close  examination
     suggests why.  In  a  binary-coded  string  of  length  N,  a  single
     mutation in the most significant  bit  (MSB)  alters  the  number  by
     2^(N-1).  In a Gray-coded string, fewer mutations lead  to  a  change
     this large.  The user of Gray codes does, however, pay  a  price  for
     this feature: those "fewer mutations" lead to  much  larger  changes.
     In the Gray code illustrated above, for example, a single mutation of
     the left-most bit changes a zero to a seven and vice-versa, while the
     largest  change a single mutation can make to a corresponding binary-
     coded individual is always four.  One might still view this aspect of
     Gray codes with some favor:  most  mutations  will  make  only  small
     changes, while the occasional  mutation  that  effects  a  truly  big
     change may initiate EXPLORATION of an  entirely  new  region  in  the
     space of CHROMOSOMEs.

     The algorithm for converting between the binary-reflected  Gray  code
     described above  and  the  standard  binary  code  turns  out  to  be
     surprisingly simple to state.  First label the bits of a binary-coded
     string B[i], where larger i's represent more  significant  bits,  and
     similarly label the corresponding Gray-coded string G[i].  We convert
     one to the other as follows:  Copy the most  significant  bit.   Then
     for each smaller i  do  either  G[i] = XOR(B[i+1], B[i])---to convert
     binary to  Gray---or B[i] = XOR(B[i+1], G[i])---to  convert  Gray  to
     binary.

     One may easily implement the above algorithm in C.   Imagine  you  do
     something like

	  typedef unsigned short ALLELE;

     and then use type "allele" for each bit in your chromosome, then  the
     following two functions will convert between  binary  and  Gray  code
     representations.  You must pass them the address  of  the  high-order
     bits for each of the two strings  as  well  as  the  length  of  each
     string.  (See  the  comment  statements  for  examples.)   NB:  These
     functions assume a chromosome arranged  as  shown  in  the  following
     illustration.

     index:    C[9]                                                    C[0]
	       *-----------------------------------------------------------*
     Char C:   |  1  |  1  |  0  |  0  |  1  |  0  |  1  |  0  |  0  |  0  |
	       *-----------------------------------------------------------*
	       ^^^^^                                                 ^^^^^
	       high-order bit                                  low-order bit

 C CODE
     /* Gray <==> binary conversion routines */
     /* written by Dan T. Abell, 7 October 1993 */
     /* please send any comments or suggestions */
     /* to dabell@quark.umd.edu */

     void gray_to_binary (Cg, Cb, n)
     /* convert chromosome of length n from */
     /* Gray code to binary representation */
     allele    *Cg,*Cb;
     int  n;
     {
	  int j;

	  *Cb = *Cg;               /* copy the high-order bit */
	  for (j = 0; j < n; j++) {
	       Cb--; Cg--;         /* for the remaining bits */
	       *Cb= *(Cb+1)^*Cg;   /* do the appropriate XOR */
	  }
     }

     void binary_to_gray(Cb, Cg, n)
     /* convert chromosome of length n from */
     /* binary to Gray code representation */
     allele    *Cb, *Cg;
     int  n;
     {
	  int j;

	  *Cg = *Cb;               /* copy the high-order bit */
	  for (j = 0; j < n; j++) {
	       Cg--; Cb--;         /* for the remaining bits */
	       *Cg= *(Cb+1)^*Cb;   /* do the appropriate XOR */
	  }
     }

     References

     [1]  F.  Gray,  "Pulse  Code  Communication", U. S. Patent 2 632 058,
     March 17, 1953.

     [2] F. G. Heath, "Origins of the Binary  Code",  Scientific  American
     v.227,n.2 (August, 1972) p.76.

     [3]   Martin   Gardner,  "Mathematical  Games",  Scientific  American
     v.227,n.2 (August, 1972) p.106.

     [4] William H. Press, et al., Numerical Recipes in C, Second  Edition
     (Cambridge University Press, 1992).

     [5]  Paul  Horowitz and Winfield Hill, The Art of Electronics, Second
     Edition (Cambridge University Press, 1989).

     [6] Dexter Kozen, The Design and Analysis  of  Algorithms  (Springer-
     Verlag, New York, NY, 1992).

     [7]  Edward  M.  Reingold, et al., Combinatorial Algorithms (Prentice
     Hall, Englewood Cliffs, NJ, 1977).

     [8] David E. Goldberg, Genetic Algorithms  in  Search,  Optimization,
     and Machine Learning (Addison-Wesley, Reading, MA, 1989).

     [9]  R.  B.  Hollstien,  Artificial  Genetic  Adaptation  in Computer
     Control Systems (PhD thesis, University of Michigan, 1971).

------------------------------

Subject: Q22: What test data is available?

 TSP DATA
     There is a TSP library (TSPLIB) available which has many  solved  and
     semi-solved TSPs and different variants. The library is maintained by
     Gerhard Reinelt <reinelt@ares.iwr.Uni-Heidelberg.de>. It is available
     from          various          FTP          sites,         including:
     softlib.cs.rice.edu:/pub/tsplib/tsblib.tar

 OPERATIONAL RESERACH DATA
     Information about  Operational  Research  test  problems  in  a  wide
     variety  of  areas  can be obtained by emailing <o.rlibrary@ic.ac.uk>
     with the body of the email message being just the word  "info".   The
     files  in  OR-Library  are  also  available  via  anonymous  FTP from
     mscmga.ms.ic.ac.uk:/pub/  A  WWW  page  is  also  available  at  URL:
     http://mscmga.ms.ic.ac.uk/  Instructions on how to use OR-Library can
     be found in the file "paper.txt", or  in  the  article:  J.E.Beasley,
     "OR-Library:  distributing test problems by electronic mail", Journal
     of the Operational Research Society 41(11) (1990) pp1069-1072.

     The following is a list of some of the topics covered.
     File                    Problem area

     assigninfo.txt          Assignment problem
     deainfo.txt             Data envelopment analysis
     gapinfo.txt             Generalised assignment problem
     mipinfo.txt             Integer programming
     lpinfo.txt              Linear programming
     scpinfo.txt             Set covering
     sppinfo.txt             Set partitioning
     tspinfo.txt             Travelling salesman problem
     periodtspinfo.txt   Period travelling salesman problem
     netflowinfo.txt         Network flow problem

			     Location:
     capmstinfo.txt           capacitated minimal spanning tree
     capinfo.txt                     capacitated warehouse location
     pmedinfo.txt                    p-median
     uncapinfo.txt                   uncapacitated warehouse location
     mknapinfo.txt                   Multiple knapsack problem
     qapinfo.txt                     Quadratic assignment problem
     rcspinfo.txt                    Resource constrained shortest path
     phubinfo.txt                    p-hub location problem

			     Scheduling:
     airlandinfo.txt                 Aircraft Landing Problem
     cspinfo.txt                     Crew scheduling
     flowshopinfo.txt                flow shop
     jobshopinfo.txt                 job shop
     openshopinfo.txt                open shop
     tableinfo.txt                   timetabling problem

			     Steiner:
     esteininfo.txt                  Euclidean Steiner problem
     rsteininfo.txt                  Rectilinear Steiner problem
     steininfo.txt                   Steiner problem in graphs

			     Two-dimensional cutting:
     assortinfo.txt                  assortment problem
     cgcutinfo.txt                   constrained guillotine
     ngcutinfo.txt                   constrained non-guillotine
     gcutinfo.txt                    unconstrained guillotine

			     Vehicle routing:
     areainfo.txt                    fixed areas
     fixedinfo.txt                   fixed routes
     periodinfo.txt                  period routing
     vrpinfo.txt                     single period
     multivrpinfo.txt                multiple depot vehicle routing problem

 OTHER DATA
     William Spears <spears@aic.nrl.navy.mil> maintains a WWW page titled:
     Test  Functions  for  Evolutionary Algorithms which contians links to
     various         sources          of          test          functions.
     http://www.aic.nrl.navy.mil:80/~spears/functs.html

     ENCORE  (see  Q15.3)  also  contains  some test data. See directories
     under /etc/data/

------------------------------

Subject: Q42: What is Life all about?

     42

     References
     Adams, D. (1979) "The Hitch Hiker's Guide to the Galaxy", London: Pan
     Books.

     Adams, D. (1980) "The Restaurant at the End of the Universe", London:
     Pan Books.

     Adams, D. (1982) "Life, the Universe  and  Everything",  London:  Pan
     Books.

     Adams,  D. (1984) "So long, and thanks for all the Fish", London: Pan
     Books.

     Adams, D. (1992) "Mostly Harmless", London: Heinemann.

------------------------------

Subject: Q42b: Is there a FAQ to this group?

     Yes.

------------------------------

Subject: Q98: Are there any patents on EAs?

     Process  patents  have  been  issued  both  for  the  Bucket  Brigade
     Algorithm in CLASSIFIER SYSTEMs: U.S. patent #4,697,242: J.H. Holland
     and A. Burks, "Adaptive computing  system  capable  of  learning  and
     discovery",  1985,  issued  Sept  29  1987;  and  for GP: U.S. patent
     #4,935,877 (to John Koza).

     This FAQ does not attempt to provide legal advice.  However,  use  of
     the  Lisp  code  in the book [KOZA92] is freely licensed for academic
     use. Although those wishing to make commercial  use  of  any  process
     should obviously consult any patent holders in question, it is pretty
     clear that it's not  in  anyone's  best  interests  to  stifle  GA/GP
     research and/or development. Commercial licenses much like those used
     for CAD software can presumably be obtained  for  the  use  of  these
     processes where necessary.

     Jarmo  Alander's  massive  bibliography of GAs (see Q10.8) includes a
     (probably) complete list of all currently  know  patents.   There  is
     also  a  periodic posting on comp.ai.neural-nets by Gregory Aharonian
     <srctran@world.std.com> about patents on Artificial  Neural  Networks
     (ANNs).

------------------------------

Subject: Q99: A Glossary on EAs?

 1
     1/5 SUCCESS RULE:
	  Derived  by  I.  Rechenberg,  the  suggestion that when Gaussian
	  MUTATIONs are applied to real-valued vectors  in  searching  for
	  the  minimum of a function, a rule-of-thumb to attain good rates
	  of error convergence is  to  adapt  the  STANDARD  DEVIATION  of
	  mutations  to  generate  one superior solution out of every five
	  attempts.

 A
     ADAPTIVE BEHAVIOUR:
	  "...underlying mechanisms that allow animals,  and  potentially,
	  ROBOTs to adapt and survive in uncertain environments" --- Meyer
	  & Wilson (1991), [SAB90]

     AI:  See ARTIFICIAL INTELLIGENCE.

     ALIFE:
	  See ARTIFICIAL LIFE.

     ALLELE :
	  (biol) Each GENE is able to occupy only a particular region of a
	  CHROMOSOME,  it's  locus. At any given locus there may exist, in
	  the POPULATION, alternative forms of the gene. These alternative
	  are called alleles of one another.

	  (EC)  The  value of a gene.  Hence, for a binary representation,
	  each gene may have an ALLELE of 0 or 1.

     ARTIFICIAL INTELLIGENCE:
	  "...the study of how to make computers do things  at  which,  at
	  the moment, people are better" --- Elaine  Rich (1988)

     ARTIFICIAL LIFE:
	  Term  coined  by  Christopher  G.  Langton for his 1987 [ALIFEI]
	  conference. In the preface of the proceedings he  defines  ALIFE
	  as  "...the study of simple computer generated hypothetical life
	  forms, i.e.  life-as-it-could-be."

 B
     BUILDING BLOCK:
	  (EC) A small, tightly clustered group of GENEs  which  have  co-
	  evolved   in  such  a  way  that  their  introduction  into  any
	  CHROMOSOME will be likely to  give  increased  FITNESS  to  that
	  chromosome.

	  The  "building  block  hypothesis" [GOLD89] states that GAs find
	  solutions by first finding as many BUILDING BLOCKs as  possible,
	  and then combining them together to give the highest fitness.

 C
     CENTRAL DOGMA:
	  (biol)  The  dogma  that  nucleic acids act as templates for the
	  synthesis of proteins, but never the  reverse.  More  generally,
	  the dogma that GENEs exert an influence over the form of a body,
	  but the form of a body is never  translated  back  into  genetic
	  code: acquired characteristics are not inherited. cf LAMARCKISM.

	  (GA) The dogma that the  behaviour  of  the  algorithm  must  be
	  analysed using the SCHEMA THEOREM.

	  (life in general) The dogma that this all is useful in a way.

	  "You  guys  have  a dogma. A certain irrational set of believes.
	  Well, here's  my  irrational  set  of  beliefs.  Something  that
	  works."

	  --- Rodney A. Brooks, [LEVY92]

     CFS: See CLASSIFIER SYSTEM.

     CHROMOSOME:
	  (biol)  One  of  the  chains of DNA found in cells.  CHROMOSOMEs
	  contain GENEs, each encoded as a subsection of  the  DNA  chain.
	  Chromosomes  are  usually  present  in all cells in an organism,
	  even though only a minority of them will be active  in  any  one
	  cell.
	  (EC)  A datastructure which holds a `string' of task parameters,
	  or genes.  This may be stored, for example,  as  a  binary  bit-
	  string, or an array of integers.

     CLASSIFIER SYSTEM:
	  A  system which takes a (set of) inputs, and produces a (set of)
	  outputs which indicate some classification of  the  inputs.   An
	  example  might take inputs from sensors in a chemical plant, and
	  classify them in terms of: 'running  ok',  'needs  more  water',
	  'needs  less water', 'emergency'. See Q1.4 for more information.

     COMBINATORIAL OPTIMIZATION:
	  Some tasks involve combining a set of entities in a specific way
	  (e.g.   the  task  of building a house). A general combinatorial
	  task involves deciding (a) the specifications of those  entities
	  (e.g.  what  size, shape, material to make the bricks from), and
	  (b) the way in which those entities are brought  together  (e.g.
	  the  number  of  bricks,  and  their relative positions). If the
	  resulting combination of entities can in some  way  be  given  a
	  FITNESS  score,  then  COMBINATORIAL OPTIMIZATION is the task of
	  designing a set of entities,  and  deciding  how  they  must  be
	  configured,  so  as  to  give  maximum  fitness.  cf ORDER-BASED
	  PROBLEM.

     COMMA STRATEGY:
	  Notation originally proposed in  EVOLUTION  STRATEGIEs,  when  a
	  POPULATION  of "mu" PARENTs generates "lambda" OFFSPRING and the
	  mu parents are discarded, leving only the lambda INDIVIDUALs  to
	  compete  directly.   Such  a process is written as a (mu,lambda)
	  search.  The process of  only  competing  offspring  then  is  a
	  "comma strategy." cf.  PLUS STRATEGY.

     CONVERGED:
	  A  GENE is said to have CONVERGED when 95% of the CHROMOSOMEs in
	  the POPULATION all contain the same ALLELE for  that  gene.   In
	  some  circumstances,  a population can be said to have converged
	  when all genes have converged. (However, this  is  not  true  of
	  populations containing multiple SPECIES, for example.)

	  Most  people  use  "convergence" fairly loosely, to mean "the GA
	  has stopped finding new, better solutions".  Of course,  if  you
	  wait  long  enough,  the  GA  will  *eventually*  find  a better
	  solution (unless you have already  found  the  global  optimum).
	  What  people  really mean is "I'm not willing to wait for the GA
	  to find a new, better  solution,  because  I've  already  waited
	  longer than I wanted to and it hasn't improved in ages."

	  An  interesting discussion on convergence by Michael Vose can be
	  found     in      GA-Digest      v8n22,      available      from
	  ftp.aic.nrl.navy.mil:/pub/galist/digests/v8n22

     CONVERGENCE VELOCITY:
	  The rate of error reduction.

     COOPERATION:
	  The  behavior  of two or more INDIVIDUALs acting to increase the
	  gains of all participating individuals.

     CROSSOVER:
	  (EC) A REPRODUCTION OPERATOR which forms  a  new  CHROMOSOME  by
	  combining  parts  of  each  of  two  `parent'  chromosomes.  The
	  simplest form is single-point CROSSOVER, in which  an  arbitrary
	  point  in  the  chromosome  is  picked. All the information from
	  PARENT A is copied from the start up  to  the  crossover  point,
	  then  all  the  information  from  parent  B  is copied from the
	  crossover point to the end of the chromosome. The new chromosome
	  thus  gets the head of one parent's chromosome combined with the
	  tail of the other.  Variations exist which  use  more  than  one
	  crossover  point,  or  combine information from parents in other
	  ways.

	  (biol) A  complicated  process  which typically takes  place  as
	  follows:   chromosomes,  while  engaged  in  the  production  of
	  GAMETEs, exchange portions of genetic material.  The  result  is
	  that  an  almost  infinite  variety  of gametes may be produced.
	  Subsequently,  during  sexual  REPRODUCTION,  male  and   female
	  gametes  (i.e. sperm and ova) fuse to produce a new DIPLOID cell
	  with a pair of chromosomes.

	  In  [HOLLAND92]  the sentence "When sperm and ova fuse, matching
	  chromosomes line up with one another their length, thus swapping
	  genetic material" is thus  wrong,  since  these  two  activities
	  occur  in  different  parts  of  the life cycle.  [eds note:  If
	  sexual reproduction (the  Real  Thing) worked like in GAs,  then
	  Holland  would  be  right,  but  as we all know,  it's  not  the
	  case.  We just encountered a Freudian  slip  of  a  Grandmaster.
	  BTW:   even  the German translation of  this  article  has  this
	  "bug", although it's well-hidden by the translator.]

     CS:  See CLASSIFIER SYSTEM.

 D
     DARWINISM:
	  (biol) Theory of EVOLUTION, proposed by Darwin,  that  evolution
	  comes    about    through    random   variation   of   heritable
	  characteristics, coupled with natural SELECTION (survival of the
	  fittest).  A  physical mechanism for this, in terms of GENEs and
	  CHROMOSOMEs, was discovered many  years  later.   DARWINISM  was
	  combined  with  the selectionism of Weismann and the genetics of
	  Mendel  to  form  the   Neo-Darwinian   Synthesis   during   the
	  1930s-1950s by T. Dobzhansky, E. Mayr, G. Simpson, R. Fisher, S.
	  Wright, and others. cf LAMARCKISM.

	  The talk.origins FAQ contains more details (See  Q10.7).   Also,
	  the  "Dictionary  of Darwinism and of Evolution" (Ed. by Patrick
	  Tort) was published in early 1996. It contains a vast amount  of
	  information   about   what   Darwinism   is  and  (perhaps  more
	  importantly)    is    not.      Further     information     from
	  http://www.planete.net/~ptort/darwin/evolengl.html  (in  various
	  languages).

	  (EC) Theory which inspired all branches of EC.

     DECEPTION:
	  The condition where the  combination  of  good  BUILDING  BLOCKs
	  leads   to  reduced  FITNESS,  rather  than  increased  fitness.
	  Proposed by [GOLD89] as a reason for the failure of GAs on  many
	  tasks.

     DIPLOID:
	  (biol)  This  refers to a cell which contains two copies of each
	  CHROMOSOME.  The copies are homologous  i.e.  they  contain  the
	  same  GENEs  in  the same sequence. In many sexually reproducing
	  SPECIES, the genes in one of the sets of chromosomes  will  have
	  been inherited from the father's GAMETE (sperm), while the genes
	  in the other set of chromosomes are  from  the  mother's  gamete
	  (ovum).

     DNA: (biol) Deoxyribonucleic Acid, a double stranded macromolecule of
	  helical structure  (comparable  to  a  spiral  staircase).  Both
	  single  strands  are  linear,  unbranched nucleic acid molecules
	  build up from  alternating  deoxyribose  (sugar)  and  phosphate
	  molecules.  Each  deoxyribose  part  is  coupled to a nucleotide
	  base, which is responsible for establishing  the  connection  to
	  the  other  strand  of  the DNA.  The 4 nucleotide bases Adenine
	  (A), Thymine (T), Cytosine (C) and Guanine (G) are the  alphabet
	  of  the genetic information. The sequences of these bases in the
	  DNA molecule determines the building plan of any organism.  [eds
	  note:  suggested  reading:  James  D.  Watson (1968) "The Double
	  Helix", London: Weidenfeld and Nicholson]

	  (literature) Douglas Noel Adams,  contemporary  Science  Fiction
	  comedy writer. Published "The Hitch-Hiker's Guide to the Galaxy"
	  when he was 25 years old, which made him one  of  the  currently
	  most  successful  British  authors.   [eds  note:  interestingly
	  Watson was also 25 years old, when he discovered the  DNA;  both
	  events  are  probably not interconnected; you might also want to
	  look at: Neil Gaiman's  (1987)  "DON'T  PANIC  --  The  Official
	  Hitch-Hiker's  Guide to the Galaxy companion", and of course get
	  your hands on the wholly remarkable FAQ in alt.fan.douglas-adams
	  ]

     DNS: (biol) Desoxyribonukleinsaeure, German for DNA.

	  (comp) The Domain Name System, a distributed database system for
	  translating   computer   names    (e.g.    lumpi.informatik.uni-
	  dortmund.de)    into   numeric   Internet,   i.e.   IP-addresses
	  (129.217.36.140) and vice-versa.  DNS allows you  to  hook  into
	  the  net  without  remembering long lists of numeric references,
	  unless your system administrator  has  incorrectly  set-up  your
	  site's system.

 E
     EC:  See EVOLUTIONARY COMPUTATION.

     ELITISM:
	  ELITISM  (or  an  elitist  strategy)  is  a  mechanism  which is
	  employed in some EAs which ensures that the CHROMOSOMEs  of  the
	  most highly fit member(s) of the POPULATION are passed on to the
	  next GENERATION without  being  altered  by  GENETIC  OPERATORs.
	  Using elitism ensures that the mamimum FITNESS of the population
	  can never reduce  from  one  generation  to  the  next.  Elitism
	  usually brings about a more rapid convergence of the population.
	  In some applications elitism improves the chances of locating an
	  optimal INDIVIDUAL, while in others it reduces it.

     ENCORE:
	  The  EvolutioNary Computation REpository Network.  An collection
	  of FTP servers/World  Wide  Web  sites  holding  all  manner  of
	  interesting   things   related   to  EC.   See  Q15.3  for  more
	  information.

     ENVIRONMENT:
	  (biol) That which  surrounds  an  organism.  Can  be  'physical'
	  (abiotic),  or  biotic.   In both, the organism occupies a NICHE
	  which influences its FITNESS within the  total  ENVIRONMENT.   A
	  biotic  environment  may  present   frequency-dependent  fitness
	  functions within a  POPULATION,  that  is,  the  fitness  of  an
	  organism's  behaviour  may  depend upon how many others are also
	  doing it.  Over several  GENERATIONs,  biotic  environments  may
	  foster   co-evolution,  in  which  fitness  is  determined  with
	  SELECTION partly by other SPECIES.

     EP:  See EVOLUTIONARY PROGRAMMING.

     EPISTASIS:
	  (biol) A "masking" or "switching" effect among GENEs.  A biology
	  textbook says: "A gene is said to be epistatic when its presence
	  suppresses the effect of a gene  at  another  locus.   Epistatic
	  genes  are  sometimes  called  inhibiting genes because of their
	  effect on other genes which are described as hypostatic."

	  (EC) When EC  researchers  use  the  term  EPISTASIS,  they  are
	  generally  referring  to  any  kind  of strong interaction among
	  genes, not just masking effects. A possible definition is:

	  Epistasis is  the  interaction  between  different  genes  in  a
	  CHROMOSOME.   It  is  the  extent  to  which the contribution to
	  FITNESS of one gene depends on the values of other genes.

	  Problems with little  or  no  epistasis  are  trivial  to  solve
	  (hillclimbing  is sufficient). But highly epistatic problems are
	  difficult to solve, even for GAs.   High  epistasis  means  that
	  BUILDING BLOCKs cannot form, and there will be DECEPTION.

     ES:  See EVOLUTION STRATEGY.

     EVOLUTION:
	  That  process  of  change  which is assured given a reproductive
	  POPULATION in which there are (1) varieties of INDIVIDUALs, with
	  some  varieties being (2) heritable, of which some varieties (3)
	  differ in FITNESS (reproductive success). (See the  talk.origins
	  FAQ for discussion on this (See Q10.7).)

	  "Don't assume that all people who accept EVOLUTION are atheists"

	  --- Talk.origins FAQ

     EVOLUTION STRATEGIE:

     EVOLUTION STRATEGY:
	  A type of evolutionary algorithm developed in the early 1960s in
	  Germany.   It employs real-coded parameters, and in its original
	  form, it relied on  MUTATION  as  the  search  operator,  and  a
	  POPULATION  size of one. Since then it has evolved to share many
	  features  with  GENETIC   ALGORITHMs.    See   Q1.3   for   more
	  information.

     EVOLUTIONARILY STABLE STRATEGY:
	  A  strategy that does well in a POPULATION dominated by the same
	  strategy.  (cf Maynard Smith, 1974)  Or,  in  other  words,  "An
	  'ESS'  ...  is  a  strategy  such  that, if all the members of a
	  population adopt it, no mutant strategy can  invade."   (Maynard
	  Smith "Evolution and the Theory of Games", 1982).

     EVOLUTIONARY COMPUTATION:
	  Encompasses  methods of simulating EVOLUTION on a computer.  The
	  term is relatively new and represents an effort  bring  together
	  researchers  who have been working in closely related fields but
	  following  different  paradigms.   The  field  is  now  seen  as
	  including  research in GENETIC ALGORITHMs, EVOLUTION STRATEGIEs,
	  EVOLUTIONARY PROGRAMMING, ARTIFICIAL LIFE, and so forth.  For  a
	  good overview see the editorial introduction to Vol. 1, No. 1 of
	  "Evolutionary Computation" (MIT Press, 1993).  That, along  with
	  the  papers  in  the  issue,  should  give  you  a  good idea of
	  representative research.

     EVOLUTIONARY PROGRAMMING:
	  An evolutionay algorithm developed in the mid  1960s.  It  is  a
	  stochastic  OPTIMIZATION  strategy,  which is similar to GENETIC
	  ALGORITHMs, but dispenses with  both  "genomic"  representations
	  and  with  CROSSOVER  as  a REPRODUCTION OPERATOR.  See Q1.2 for
	  more information.

     EVOLUTIONARY SYSTEMS:
	  A process or system which employs the evolutionary  dynamics  of
	  REPRODUCTION, MUTATION, competition and SELECTION.  The specific
	  forms of these  processes  are  irrelevant  to  a  system  being
	  described as "evolutionary."

     EXPECTANCY:
	  Or  expected  value.   Pertaining  to a random variable X, for a
	  continuous random variable, the expected value is:
	  E(X) = INTEGRAL(-inf, inf) [X f(X) dX].
	  The discrete expectation takes a similar form using a  summation
	  instead of an integral.

     EXPLOITATION:
	  When  traversing  a SEARCH SPACE, EXPLOITATION is the process of
	  using information gathered from previously visited points in the
	  search  space  to  determine which places might be profitable to
	  visit next.  An  example  is  hillclimbing,  which  investigates
	  adjacent  points in the search space, and moves in the direction
	  giving  the  greatest   increase   in   FITNESS.    Exploitation
	  techniques are good at finding local maxima.

     EXPLORATION:
	  The  process of visiting entirely new regions of a SEARCH SPACE,
	  to  see  if  anything  promising  may  be  found  there.  Unlike
	  EXPLOITATION,  EXPLORATION  involves  leaps  into  the  unknown.
	  Problems which have many local  maxima  can  sometimes  only  be
	  solved by this sort of random search.

 F
     FAQ: Frequently Asked Questions. See definition given before the main
	  table of contents.

     FITNESS:
	  (biol) Loosely: adaptedness.  Often measured as,  and  sometimes
	  equated to, relative reproductive success.  Also proportional to
	  expected time to extinction.  "The fit are those who  fit  their
	  existing  ENVIRONMENTs  and  whose  descendants  will fit future
	  environments."  (J.  Thoday,  "A  Century  of  Darwin",   1959).
	  Accidents of history are relevant.

	  (EC)  A  value assigned to an INDIVIDUAL which reflects how well
	  the individual solves the task in hand. A "fitness function"  is
	  used  to  map  a  CHROMOSOME  to  a  FITNESS  value.  A "fitness
	  landscape" is the hypersurface obtained by applying the  fitness
	  function to every point in the SEARCH SPACE.

     FUNCTION OPTIMIZATION:
	  For  a  function  which  takes  a set of N input parameters, and
	  returns a single output value, F, FUNCTION OPTIMIZATION  is  the
	  task  of  finding  the  set(s)  of  parameters which produce the
	  maximum (or minimum) value of F. Function OPTIMIZATION is a type
	  of VALUE-BASED PROBLEM.

     FTP: File  Transfer  Protocol. A system which allows the retrieval of
	  files stored on a remote computer. Basic FTP requires a password
	  before  access  can  be gained to the remote computer. Anonymous
	  FTP  does  not  require  a  special   password:   after   giving
	  "anonymous"  as  the user name, any password will do (typically,
	  you give your email address at this point). Files  available  by
	  FTP are specified as <ftp-site-name>:<the-complete-filename> See
	  Q15.5.

     FUNCTION SET:
	  (GP) The set of operators used in GP. These functions label  the
	  internal (non-leaf) points of the parse trees that represent the
	  programs in the POPULATION.  An example FUNCTION  SET  might  be
	  {+, -, *}.
 G
     GA:  See GENETIC ALGORITHM.

     GAME THEORY:
	  A  mathematical theory originally developed for human games, and
	  generalized to human economics and  military  strategy,  and  to
	  EVOLUTION in the theory of EVOLUTIONARILY STABLE STRATEGY.  GAME
	  THEORY comes into it's own wherever the optimum  policy  is  not
	  fixed,  but  depends upon the policy which is statistically most
	  likely to be adopted by opponents.

     GAMETE:
	  (biol) Cells which carry genetic information from their  PARENTs
	  for  the  purposes  of  sexual  REPRODUCTION.   In animals, male
	  GAMETEs are called sperm, female gametes are called ova. Gametes
	  have a HAPLOID number of CHROMOSOMEs.

     GAUSSIAN DISTRIBUTION:
	  See NORMALLY DISTRIBUTED.

     GENE:
	  (EC)  A  subsection  of a CHROMOSOME which (usually) encodes the
	  value of a single parameter.

	  (biol) The fundamental unit of inheritance, comprising a segment
	  of  DNA  that  codes  for  one  or several related functions and
	  occupies a fixed position (locus) on the  chromosome.   However,
	  the  term  may  be  defined  in  different  ways  for  different
	  purposes.  For a fuller story, consult a book on  genetics  (See
	  Q10.7).

     GENE-POOL:
	  The  whole  set of GENEs in a breeding POPULATION.  The metaphor
	  on which the term is based  de-emphasizes  the  undeniable  fact
	  that  genes actually go about in discrete bodies, and emphasizes
	  the idea of genes flowing about the world like a liquid.

	  Everybody out of the gene-pool, now!

	  --- Author prefers to be anonymous

     GENERATION:
	  (EC) An iteration of the measurement of FITNESS and the creation
	  of a new POPULATION by means of REPRODUCTION OPERATORs.

     GENETIC ALGORITHM:
	  A  type  of  EVOLUTIONARY  COMPUTATION  devised  by John Holland
	  [HOLLAND92].   A  model  of  machine  learning   that   uses   a
	  genetic/evolutionary  metaphor.  Implementations  typically  use
	  fixed-length  character  strings  to  represent  their   genetic
	  information,  together  with  a  POPULATION of INDIVIDUALs which
	  undergo CROSSOVER and MUTATION  in  order  to  find  interesting
	  regions of the SEARCH SPACE.  See Q1.1 for more information.

     GENETIC DRIFT:
	  Changes  in  gene/allele  frequencies  in a POPULATION over many
	  GENERATIONs,  resulting  from  chance  rather  than   SELECTION.
	  Occurs  most  rapidly  in  small  populations.  Can lead to some
	  ALLELEs  becoming   `extinct',   thus   reducing   the   genetic
	  variability in the population.

     GENETIC PROGRAMMING:
	  GENETIC  ALGORITHMs applied to programs.  GENETIC PROGRAMMING is
	  more expressive than fixed-length character string  GAs,  though
	  GAs  are  likely  to  be  more  efficient  for  some  classes of
	  problems.  See Q1.5 for more information.
     GENETIC OPERATOR:
	  A search operator acting on a coding structure that is analogous
	  to a GENOTYPE of an organism (e.g. a CHROMOSOME).

     GENOTYPE:
	  The   genetic   composition  of  an  organism:  the  information
	  contained in the GENOME.

     GENOME:
	  The entire collection of GENEs (and hence CHROMOSOMEs) possessed
	  by an organism.

     GLOBAL OPTIMIZATION:
	  The  process  by  which  a  search  is made for the extremum (or
	  extrema) of a functional  which,  in  EVOLUTIONARY  COMPUTATION,
	  corresponds  to  the  FITNESS  or error function that is used to
	  assess the PERFORMANCE of any INDIVIDUAL.

     GP:  See GENETIC PROGRAMMING.

 H
     HAPLOID:
	  (biol) This refers to cell which contains a single CHROMOSOME or
	  set  of  chromosomes,  each  consisting  of a single sequence of
	  GENEs.  An example is a GAMETE.  cf DIPLOID.

	  In EC, it is usual for INDIVIDUALs to be HAPLOID.

     HARD SELECTION:
	  SELECTION acts on competing INDIVIDUALs.   When  only  the  best
	  available   individuals   are  retained  for  generating  future
	  progeny, this is termed "hard selection."   In  contrast,  "soft
	  selection"  offers  a  probabilistic  mechanism  for  maitaining
	  individuals to be PARENTs of future progeny  despite  possessing
	  relatively poorer objective values.

 I
     INDIVIDUAL:
	  A  single  member  of  a  POPULATION.   In  EC,  each INDIVIDUAL
	  contains a CHROMOSOME  (or,  more  generally,  a  GENOME)  which
	  represents a possible solution to the task being tackled, i.e. a
	  single point in the SEARCH SPACE.  Other information is  usually
	  also stored in each individual, e.g. its FITNESS.

     INVERSION:
	  (EC)  A  REORDERING  operator  which  works by selecting two cut
	  points in a CHROMOSOME, and reversing the order of all the GENEs
	  between those two points.

 L
     LAMARCKISM:
	  Theory  of  EVOLUTION  which preceded Darwin's. Lamarck believed
	  that evolution came about through the  inheritance  of  acquired
	  characteristics.  That is, the skills or physical features which
	  an INDIVIDUAL acquires during its lifetime can be passed  on  to
	  its  OFFSPRING.   Although  Lamarckian inheritance does not take
	  place in nature, the idea has been usefully applied by  some  in
	  EC.  cf DARWINISM.

     LCS: See LEARNING CLASSIFIER SYSTEM.

     LEARNING CLASSIFIER SYSTEM:
	  A  CLASSIFIER  SYSTEM which "learns" how to classify its inputs.
	  This often involves "showing" the system many examples of  input
	  patterns,  and their corresponding correct outputs. See Q1.4 for
	  more information.

 M
     MIGRATION:
	  The transfer of (the GENEs  of)  an  INDIVIDUAL  from  one  SUB-
	  POPULATION to another.

     MOBOT:
	  MOBile ROBOT.  cf ROBOT.

     MUTATION:
	  (EC)  a  REPRODUCTION  OPERATOR  which forms a new CHROMOSOME by
	  making (usually small) alterations to the values of GENEs  in  a
	  copy of a single, PARENT chromosome.

 N
     NFL: See NO FREE LUNCH.

     NICHE:
	  (biol)  In  natural ecosystems, there are many different ways in
	  which animals may survive (grazing, hunting, on the  ground,  in
	  trees,   etc.),   and   each  survival  strategy  is  called  an
	  "ecological niche."  SPECIES which occupy different NICHEs (e.g.
	  one eating plants, the other eating insects) may coexist side by
	  side without competition, in a stable way. But  if  two  species
	  occupying  the  same niche are brought into the same area, there
	  will be competition,  and  eventually  the  weaker  of  the  two
	  species  will  be  made  (locally)  extinct.  Hence diversity of
	  species depends on them occupying a diversity of niches  (or  on
	  geographical separation).

	  (EC)  In  EC,  we  often  want  to  maintain  diversity  in  the
	  POPULATION.  Sometimes a FITNESS function may  be  known  to  be
	  multimodal, and we want to locate all the peaks. We may consider
	  each peak in the fitness function as analogous to  a  niche.  By
	  applying   techniques   such  as  fitness  sharing  (Goldberg  &
	  Richardson, [ICGA87]), the  population  can  be  prevented  from
	  converging  on a single peak, and instead stable SUB-POPULATIONs
	  form at each  peak.  This  is  analogous  to  different  species
	  occupying different niches. See also SPECIES, SPECIATION.

     NO FREE LUNCH:
	  Cocktail party definition:

	  For  any pair of search algorithms, there are "as many" problems
	  for which the first algorithm  outperforms  the  second  as  for
	  which the reverse is true. One consequence of this is that if we
	  don't put any domain knowledge into  our  algorithm,  it  is  as
	  likely  to  perform  worse than random search as it is likely to
	  perform better.  This  is  true  for  all  algorimths  including
	  GENETIC ALGORITHMs.

	  More detailed description:

	  The  NFL  work  of  Wolpert  and  Macready  is  a framework that
	  addresses the core aspects of search, focusing on the connection
	  between  FITNESS  functions and effective search algorithms. The
	  central importance of this connection is demonstrated by the  No
	  Free Lunch theorem which states that averaged over all problems,
	  all search algorithms perform equally. This result implies  that
	  if  we are comparing a genetic algorithm to some other algorithm
	  (e.g., simulated annealing,  or  even  random  search)  and  the
	  genetic  algorithm  performs  better  on some class of problems,
	  then the other algorithm necessarily performs better on problems
	  outside the class. Thus it is essential to incorporate knowledge
	  of the problem into the search algorithm.

	  The NFL  framework  also  does  the  following:  it  provides  a
	  geometric interpretation of what it means for an algorithm to be
	  well matched to a problem;  it  provides  information  theoretic
	  insight  into the search procedure; it investigates time-varying
	  fitness functions; it proves that  independent  of  the  fitness
	  function,   one   cannot   (without   prior   domain  knowledge)
	  successfully  choose  between  two  algorithms  based  on  their
	  previous  behavior;  it  provides a number of formal measures of
	  how well an algorithm performs; and it addresses the  difficulty
	  of OPTIMIZATION problems from a viewpoint outside of traditional
	  computational complexity.

     NORMALLY DISTRIBUTED:
	  A  random  variable  is  NORMALLY  DISTRIBUTED  if  its  density
	  function is described as
	  f(x)    =    1/sqrt(2*pi*sqr(sigma))    *    exp(-0.5*(x-mu)*(x-
	  mu)/sqr(sigma))
	  where mu is the mean of the random variable x and sigma  is  the
	  STANDARD DEVIATION.

 O
     OBJECT VARIABLES:
	  Parameters  that are directly involved in assessing the relative
	  worth of an INDIVIDUAL.

     OFFSPRING:
	  An INDIVIDUAL generated by any process of REPRODUCTION.

     OPTIMIZATION:
	  The process of iteratively improving the solution to  a  problem
	  with respect to a specified objective function.

     ORDER-BASED PROBLEM:
	  A  problem  where  the solution must be specified in terms of an
	  arrangement (e.g. a linear ordering)  of  specific  items,  e.g.
	  TRAVELLING   SALESMAN   PROBLEM,  computer  process  scheduling.
	  ORDER-BASED PROBLEMs are a class of  COMBINATORIAL  OPTIMIZATION
	  problems  in  which  the  entities  to  be  combined are already
	  determined. cf VALUE-BASED PROBLEM.

     ONTOGENESIS:
	  Refers to a single organism, and  means  the  life  span  of  an
	  organism from it's birth to death.  cf PHYLOGENESIS.

 P
     PANMICTIC POPULATION:
	  (EC,  biol)  A  mixed  POPULATION.   A  population  in which any
	  INDIVIDUAL may  be  mated  with  any  other  individual  with  a
	  probability  which  depends  only on FITNESS.  Most conventional
	  evolutionary algorithms have PANMICTIC POPULATIONs.

	  The opposite is a population divided into groups known  as  SUB-
	  POPULATIONs,  where individuals may only mate with others in the
	  same sub-population. cf SPECIATION.

     PARENT:
	  An INDIVIDUAL which takes part in REPRODUCTION to  generate  one
	  or more other individuals, known as OFFSPRING, or children.

     PERFORMANCE:
	  cf FITNESS.

     PHENOTYPE:
	  The expressed traits of an INDIVIDUAL.

     PHYLOGENESIS:
	  Refers  to  a  POPULATION  of  organisms.  The  life  span  of a
	  population of organisms from pre-historic times until today.  cf
	  ONTOGENESIS.

     PLUS STRATEGY:
	  Notation  originally  proposed  in  EVOLUTION STRATEGIEs, when a
	  POPULATION of "mu" PARENTs generates "lambda" OFFSPRING and  all
	  mu  and  lambda  INDIVIDUALs  compete  directly,  the process is
	  written as a (mu+lambda) search.  The process of  competing  all
	  parents  and  offspring  then  is  a "plus strategy." cf.  COMMA
	  STRATEGY.

     POPULATION:
	  A group of INDIVIDUALs which may interact together, for  example
	  by mating, producing OFFSPRING, etc. Typical POPULATION sizes in
	  EC range from 1 (for certain EVOLUTION STRATEGIEs)
	   to  many  thousands  (for  GENETIC   PROGRAMMING).    cf   SUB-
	  POPULATION.

 R
     RECOMBINATION:
	  cf CROSSOVER.

     REORDERING:
	  (EC)  A  REORDERING  operator  is  a REPRODUCTION OPERATOR which
	  changes the order of GENEs in a CHROMOSOME,  with  the  hope  of
	  bringing related genes closer together, thereby facilitating the
	  production of BUILDING BLOCKs.  cf INVERSION.

     REPRODUCTION:
	  (biol, EC) The creation of a new  INDIVIDUAL  from  two  PARENTs
	  (sexual  REPRODUCTION).  Asexual reproduction is the creation of
	  a new individual from a single parent.

     REPRODUCTION OPERATOR:
	  (EC) A mechanism which  influences  the  way  in  which  genetic
	  information  is  passed  on  from  PARENT(s) to OFFSPRING during
	  REPRODUCTION.  REPRODUCTION  OPERATORs  fall  into  three  broad
	  categories: CROSSOVER, MUTATION and REORDERING operators.

     REQUISITE VARIETY:
	  In  GENETIC  ALGORITHMs,  when  the  POPULATION  fails to have a
	  "requisite variety" CROSSOVER will no longer be a useful  search
	  operator  because it will have a propensity to simply regenerate
	  the PARENTs.

     ROBOT:
	  "The Encyclopedia Galactica defines  a  ROBOT  as  a  mechanical
	  apparatus designed to do the work of man. The marketing division
	  of the Sirius Cybernetics Corporation defines a robot  as  `Your
	  Plastic Pal Who's Fun To Be With'."
	  --- Douglas Adams (1979)

 S
     SAFIER:
	  An   EVOLUTIONARY   COMPUTATION  FTP  Repository,  now  defunct.
	  Superceeded by ENCORE.
     SCHEMA:
	  A pattern of GENE values in  a  CHROMOSOME,  which  may  include
	  `dont  care'  states.  Thus  in a binary chromosome, each SCHEMA
	  (plural schemata) can be specified  by  a  string  of  the  same
	  length  as the chromosome, with each character one of {0, 1, #}.
	  A particular chromosome is said to `contain' a particular schema
	  if  it  matches the schema (e.g. chromosome 01101 matches schema
	  #1#0#).

	  The `order' of a schema is the number of non-dont-care positions
	  specified,  while  the `defining length' is the distance between
	  the furthest two non-dont-care  positions.  Thus  #1##0#  is  of
	  order 2 and defining length 3.

     SCHEMA THEOREM:
	  Theorem  devised by Holland [HOLLAND92] to explain the behaviour
	  of GAs.  In essence, it  says  that  a  GA  gives  exponentially
	  increasing   reproductive  trials  to  above  average  schemata.
	  Because each CHROMOSOME contains a great many schemata, the rate
	  of  SCHEMA processing in the POPULATION is very high, leading to
	  a phenomenon known as implicit parallelism. This gives a GA with
	  a  population  of  size  N  a  speedup  by  a factor of N cubed,
	  compared to a random search.

     SEARCH SPACE:
	  If the solution to a task can be represented by a set of N real-
	  valued  parameters, then the job of finding this solution can be
	  thought of as a  search  in  an  N-dimensional  space.  This  is
	  referred  to simply as the SEARCH SPACE.  More generally, if the
	  solution to a task can be  represented  using  a  representation
	  scheme,  R,  then  the  search  space is the set of all possible
	  configurations which may be represented in R.

     SEARCH OPERATORS:
	  Processes used to generate  new  INDIVIDUALs  to  be  evaluated.
	  SEARCH  OPERATORS  in  GENETIC ALGORITHMs are typically based on
	  CROSSOVER and point MUTATION.   Search  operators  in  EVOLUTION
	  STRATEGIEs  and  EVOLUTIONARY  PROGRAMMING typically follow from
	  the representation of a solution and often involve  Gaussian  or
	  lognormal perturbations when applied to real-valued vectors.

     SELECTION:
	  The process by which some INDIVIDUALs in a POPULATION are chosen
	  for REPRODUCTION, typically on the basis of favoring individuals
	  with higher FITNESS.

     SELF-ADAPTATION:
	  The  inclusion  of  a  mechanism  not  only to evolve the OBJECT
	  VARIABLES  of  a  solution,   but   simultaneously   to   evolve
	  information on how each solution will generate new OFFSPRING.

     SIMULATION:
	  The act of modeling a natural process.

     SOFT SELECTION:
	  The  mechanism which allows inferior INDIVIDUALs in a POPULATION
	  a non-zero probability of  surviving  into  future  GENERATIONs.
	  See HARD SELECTION.

     SPECIATION:
	  (biol)  The process whereby a new SPECIES comes about.  The most
	  common cause of SPECIATION is that of geographical isolation. If
	  a SUB-POPULATION of a single species is separated geographically
	  from the main POPULATION for a  sufficiently  long  time,  their
	  GENEs  will  diverge  (either  due  to  differences in SELECTION
	  pressures in different  locations,  or  simply  due  to  GENETIC
	  DRIFT).   Eventually,  genetic differences will be so great that
	  members of the sub-population must be considered as belonging to
	  a different (and new) species.

	  Speciation is very important in evolutionary biology. Small sub-
	  populations can evolve much more rapidly than a large population
	  (because  genetic changes don't take long to become fixed in the
	  population). Sometimes, this  EVOLUTION  will  produce  superior
	  INDIVIDUALs  which  can  outcompete,  and eventually replace the
	  species of the original, main population.

	  (EC) Techniques analogous to geographical isolation are used  in
	  a number of GAs.  Typically, the population is divided into sub-
	  populations, and individuals  are  only  allowed  to  mate  with
	  others  in the same sub-population. (A small amount of MIGRATION
	  is performed.)

	  This  produces  many  sub-populations  which  differ  in   their
	  characteristics,  and may be referred to as different "species".
	  This technique can be useful for finding multiple solutions to a
	  problem, or simply maintaining diversity in the SEARCH SPACE.

	  Most   biology/genetics   textbooks   contain   information   on
	  speciation.  A more detailed account can be found in  "Genetics,
	  Speciation  and  the  Founder  Principle",  L.V.  Giddings, K.Y.
	  Kaneshiro and W.W.  Anderson  (Eds.),  Oxford  University  Press
	  1989.

     SPECIES:
	  (biol)  There  is  no  universally-agreed  firm  definition of a
	  SPECIES.  A species may be roughly defined as  a  collection  of
	  living  creatures,  having  similar  characteristics,  which can
	  breed together to produce  viable  OFFSPRING  similar  to  their
	  PARENTs.   Members  of  one  species  occupy the same ecological
	  NICHE.  (Members of different species may occupy  the  same,  or
	  different niches.)

	  (EC)  In  EC  the  definition  of "species" is less clear, since
	  generally it is always possible for a pair INDIVIDUALs to  breed
	  together.   It  is  probably safest to use this term only in the
	  context  of  algorithms   which   employ   explicit   SPECIATION
	  mechanisms.

	  (biol)  The  existence  of  different  species  allows different
	  ecological niches to be exploited. Furthermore, the existence of
	  a  variety  of different species itself creates new niches, thus
	  allowing room for further species. Thus nature bootstraps itself
	  into almost limitless complexity and diversity.

	  Conversely,  the domination of one, or a small number of species
	  reduces the number of viable  niches,  leads  to  a  decline  in
	  diversity,  and  a  reduction  in  the  ability to cope with new
	  situations.

	  "Give any one species too much rope, and they'll fuck it up"

	  --- Roger Waters, "Amused to Death", 1992

     STANDARD DEVIATION:
	  A measurement for the spread of a set of data; a measurement for
	  the variation of a random variable.

     STATISTICS:
	  Descriptive  measures  of data; a field of mathematics that uses
	  probability theory to gain insight into systems' behavior.

     STEPSIZE:
	  Typically, the average distance in the appropriate space between
	  a PARENT and its OFFSPRING.

     STRATEGY VARIABLE:
	  Evolvable  parameters  that relate the distribution of OFFSPRING
	  from a PARENT.

     SUB-POPULATION:
	  A POPULATION may be  sub-divided  into  groups,  known  as  SUB-
	  POPULATIONs,  where INDIVIDUALs may only mate with others in the
	  same  group.  (This  technique  might  be  chosen  for  parallel
	  processors).  Such  sub-divisions  may  markedly  influence  the
	  evolutionary dynamics of a population (e.g.  Wright's  'shifting
	  balance'  model).  Sub-populations  may  be  defined  by various
	  MIGRATION constraints: islands with limited arbitrary migration;
	  stepping-stones   with   migration   to   neighboring   islands;
	  isolation-by-distance in which each individual mates  only  with
	  near neighbors. cf PANMICTIC POPULATION, SPECIATION.

     SUMMERSCHOOL:
	  (USA)  One  of the most interesting things in the US educational
	  system: class work during the summer break.

 T
     TERMINAL SET:
	  (GP) The set  of  terminal  (leaf)  nodes  in  the  parse  trees
	  representing  the  programs in the POPULATION.  A terminal might
	  be a variable, such as X, a constant value, such as 42, (cf Q42)
	  or a function taking no arguments, such as (move-north).

     TRAVELLING SALESMAN PROBLEM:
	  The  travelling salesperson has the task of visiting a number of
	  clients, located in different cities. The problem to  solve  is:
	  in  what order should the cities be visited in order to minimise
	  the total distance travelled (including returning home)? This is
	  a classical example of an ORDER-BASED PROBLEM.

     TSP: See TRAVELLING SALESMAN PROBLEM.

 U
     USENET:
	  "Usenet  is like a herd of performing elephants with diarrhea --
	  massive, difficult to redirect, awe-inspiring, entertaining, and
	  a  source  of  mind-boggling amounts of excrement when you least
	  expect it."

	  --- Gene Spafford (1992)

 V
     VALUE-BASED PROBLEM:
	  A problem where the solution must be specified in terms of a set
	  of  real-valued  parameters.  FUNCTION OPTIMIZATION problems are
	  of this type.  cf SEARCH SPACE, ORDER-BASED PROBLEM.

     VECTOR OPTIMIZATION:
	  Typically, an OPTIMIZATION problem wherein  multiple  objectives
	  must be satisfied.

 Z
     ZEN NAVIGATION:
	  A  methodology  with  tremendous propensity to get lost during a
	  hike from A to B.  ZEN NAVIGATION  simply  consists  in  finding
	  something  that  looks  as  if  it knew where it is going to and
	  follow  it.   The  results  are  more  often   surprising   than
	  successful, but it's usually being worth for the sake of the few
	  occasions when it is both.

	  Sometimes zen navigation is referred  to  as  "doing  scientific
	  research,"  where  A is a state of mind being considered as pre-
	  PhD, and B (usually a different) state of mind, known  as  post-
	  PhD. While your time spent in state C, somewhere inbetween A and
	  B, is usually referred to as "being a nobody."

ACKNOWLEDGMENTS
     Finally, credit where credit is due. I'd like to thank all the people
     who  helped  in  assembling  this  guide,  and their patience with my
     "variations on English  grammar".  In  the  order  I  received  their
     contributions, thanks to:

 Contributors,
     Lutz  Prechelt  (University  of  Karlsruhe)  the  comp.ai.neural-nets
     FAQmeister, for letting  me  strip  several  ideas  from  "his"  FAQ.
     Ritesh  "peace"  Bansal  (CMU)  for  lots of comments and references.
     David  Beasley  (University  of  Wales)  for  a  valuable   list   of
     publications  (Q12),  and many further additions.  David Corne, Peter
     Ross,  and  Hsiao-Lan  Fang  (University  of  Edinburgh)  for   their
     TIMETABLING  and  JSSP  entries.   Mark  Kantrowitz (CMU) for mocking
     about this-and-that, and being a "mostly valuable" source  concerning
     FAQ  maintenance;  parts  of  Q11  have  been stripped from "his" ai-
     faq/part4 FAQ; Mark also contributed the less verbose archive  server
     infos.   The  texts  of  Q1.1,  Q1.5, Q98 and some entries of Q99 are
     courtesy by James  Rice  (Stanford  University),  stripped  from  his
     genetic-programming  FAQ  (Q15).   Jonathan  I. Kamens (MIT) provided
     infos on how-to-hook-into the USENET FAQ  system.   Una  Smith  (Yale
     University)  contributed  the  better parts of the Internet resources
     guide  (Q15.5).   Daniel   Polani   (Gutenberg   University,   Mainz)
     "contributed"  the  ALIFE  II  Video  proceedings  info.   Jim  McCoy
     (University of Texas) reminded me of  the  GP  archive  he  maintains
     (Q20).   Ron Goldthwaite (UC Davis) added definitions of Environment,
     EVOLUTION, Fitness, and Population to the glossary, and some thoughts
     why   Biologists  should  take  note  of  EC  (Q3).   Joachim  Geidel
     (University of Karlsruhe) sent a diff of the  current  "navy  server"
     contents  and the software survey, pointing to "missing links" (Q20).
     Richard Dawkins "Glossary" section of "The extended phenotype" served
     for many new entries, too numerous to mention here (Q99).  Mark Davis
     (New  Mexico  State  University)  wrote  the  part  on   EVOLUTIONARY
     PROGRAMMING  (Q1.2).   Dan Abell (University of Maryland) contributed
     the section on efficient greycoding (Q21).  Walter Harms  (University
     of  Oldenburg)  commented  on introductory EC literature.  Lieutenant
     Colonel J.S. Robertson (USMA, West Point), for providing a  home  for
     this      subversive      posting     on     their     FTP     server
     euler.math.usma.edu/pub/misc/GA Rosie O'Neill for suggesting the  PhD
     thesis entry (Q10.11).  Charlie Pearce (University of Nottingham) for
     critical remarks  concerning  "tables";  well,  here  they  are!   J.
     Ribeiro Filho (University College London) for the pointer to the IEEE
     Computer GA  Software  Survey;  the  PeGAsuS  description  (Q20)  was
     stripped  from  it.  Paul Harrald for the entry on game playing (Q2).
     Laurence  Moran  (Uni  Toronto)  for  corrections  to  some  of   the
     biological  information  in  Q1  and  Q99.   Marco  Dorigo (Uni Libre
     Bruxelles) gets the award for reading the guide more thoroughly  than
     (including  the  editors). He suggested additions to Q1.4, Q2, Q4 and
     Q22, and pointed out various typos.  Bill Macready (SFI) for the  two
     defintions  of  the  NFL  theorem in Q99.  Cedric Notredame (EBI) for
     providing information about  applications  of  EC  in  biology  (Q2).
     Fabio Pichierri (The Institute of Physical and Chemical Research) for
     information on the relevance of EC to chemists  (Q3).   Moshe  Sipper
     (Swiss  Federal  Institute of Technology) for details of applications
     in Cellular Automata and Evolvable Hardware (Q2).

 Reviewers,
     Robert Elliott Smith (The University of Alabama)  reviewed  the  TCGA
     infos  (Q14),  and Nici Schraudolph (UCSD) first unconsciously, later
     consciously, provided about 97% of Q20* answers.  Nicheal Lynn Cramer
     (BBN)  adjusted my historic view of GP genesis.  David Fogel (Natural
     SELECTION, Inc.) commented and helped on this-and-that  (where  this-
     and-that is closely related to EP), and provided many missing entries
     for the glossary (Q99).  Kazuhiro M. Saito (MIT) and Mark D.  Smucker
     (Iowa  State)  caught my favorite typo(s).  Craig W. Reynolds was the
     first who solved one of the well-hidden puzzles in the FAQ, and  also
     added  some  valuable  stuff.   Joachim  Born (TU Berlin) updated the
     EVOLUTION Machine (EM) entry and provided the pointer to the  Bionics
     technical  report  FTP  site  (Q14).   Pattie  Maes  (MIT  Media Lab)
     reviewed the ALIFE IV additions to the  list  of  conferences  (Q12).
     Scott  D.  Yelich  (Santa Fe Institute) reviewed the SFI connectivity
     entry (Q15).  Rick Riolo (MERIT)  reviewed  the  CFS-C  entry  (Q20).
     Davika  Seunarine  (Acadia  Univ.)   for smoothing out this and that.
     Paul Field (Queen Mary and Westfield College) for  correcting  typos,
     and providing insights into the blindfold pogo-sticking nomads of the
     Himalayas.

 and Everybody...
     Last not least I'd like to thank Hans-Paul  Schwefel,  Thomas  Baeck,
     Frank  Kursawe, Guenter Rudolph for their contributions, and the rest
     of the Systems Analysis Research Group for wholly remarkable patience
     and almost incredible unflappability during my various extravangances
     and ego-trips during my time (1990-1993) with this group.

		      It was a tremendously worthwhile experience. Thanks!
     --- The Editor, Joerg Heitkoetter (1993)

EPILOGUE
			  "Natural selection is a mechanism for generating
			     an exceedingly high degree of improbability."

				  --- Sir Ronald Aylmer Fisher (1890-1962)

     This is a GREAT quotation, it sounds like something directly out of a
	turn of the century Douglas Adams: Natural selection: the original
					    "Infinite Improbability Drive"

		  --- Craig Reynolds (1993), on reading the previous quote

     `The Babel fish,' said The Hitch Hiker's Guide to the Galaxy quietly,
     `is  small,  yellow  and leech-like, and probably the oddest thing in
     the Universe.  It feeds on brainwave energy received not from his own
     carrier  but  from those around it. It absorbs all unconscious mental
     frequencies from this brainwave energy to nourish  itself  with.   It
     then excretes into the mind of its carrier a telepathic matrix formed
     by combining the conscious thought  frequencies  with  nerve  signals
     picked  up  from  the  speech centers of the brain which has supplied
     them.  The practical upshot of all this is that if you stick a  Babel
     fish in your ear you can instantly understand anything said to you in
     any form of language. The speech patterns you  actually  hear  decode
     the  brainwave matrix which has been fed into your mind by your Babel
     fish.  `Now it  is  such  a  bizarrely  improbable  coincidence  than
     anything so mindbogglingly useful could have evolved purely by chance
     that some thinkers have chosen to see it as  a  final  and  clinching
     proof of the non-existence of God.  `The argument goes something like
     this: ``I refuse to prove that  I  exist,''  says  God,  ``for  proof
     denies  faith,  and without faith I am nothing.''  ``But,'' says Man,
     ``The Babel fish is a dead giveaway isn't  it?   It  could  not  have
     evolved by chance. It proves you exist, and so therefore, by your own
     arguments, you don't. QED.''   ``Oh  dear,''  says  God,  ``I  hadn't
     thought  of  that,'' and promptly vanishes in a puff of logic.  ``Oh,
     that was easy,'' says Man, and for an encore goes on  to  prove  that
     black is white and gets himself killed on the next zebra crossing.

						  --- Douglas Adams (1979)

     "Well, people; I really wish this thingie to turn into a paper babel-
     fish for all those young ape-descended organic  life  forms  on  this
     crazy  planet,  who don't have any clue about what's going on in this
     exciting  "new"  research  field,  called  EVOLUTIONARY  COMPUTATION.
     However,  this  is  just  a  start,  I need your help to increase the
     usefulness of this guide, especially  its  readability  for  natively
     English  speaking  folks;  whatever  it  is:  I'd  like  to hear from
     you...!"

				  --- The Editor, Joerg Heitkoetter (1993)

	       "Parents of young organic life forms should be warned, that
       paper babel-fishes can be harmful, if stuck too deep into the ear."

						--- Encyclopedia Galactica

     "The meeting of these guys was definitely the best bang since the big
     one."

     --- Encyclopedia Galactica

ABOUT THE EDITORS
 Joerg Heitkoetter,
     was born in 1965 in Recklinghausen, a small but  beautiful  750  year
     old town at the northern rim of the Ruhrgebiet, Germany's coal mining
     and   steel   belt.    He   was   educated   at    Hittorf-Gymnasium,
     Recklinghausen,   Ruhruniversitaet   Bochum  (RUB)  and  Universitaet
     Dortmund (UNIDO), where he  read  theoretical  medicine,  psychology,
     biology, philosophy and (for whatever reason) computer sciences.

     He volunteered as a RA in the Biomathematics Research Group from 1987
     to  1989,  at  the  former   ``Max-Planck-Institute   for   Nutrition
     Physiology,''  in  Dortmund (since March 1, 1993 renamed to ``MPI for
     Molecular Physiology''), and spent 3 years at the ``Systems  Analysis
     Research  Group,''  at  the  Department of Computer Science of UniDO,
     where  he  wrote  a  particularly  unsuccesful  thesis  on   LEARNING
     CLASSIFIER  SYSTEMs.  In 1995, after 22 semesters, he finally gave up
     trying to break Chris Langton's semester record, and dropped  out  of
     the  academic  circus.  Amazingly, he's a fellow at EUnet Deutschland
     GmbH's Fun & Games division  (in  more  traditional  words:  the  R&D
     department), currently working on various things in parallel. You may
     visit   his   homepage   for    a    mostly    complete    list    at
     http://alife.santafe.edu/~joke/

     His  electronic  publications  range  from  a voluntary job as senior
     editor of the FAQ in Usenet's comp.ai.genetic newsgroup, entitled The
     Hitch-Hiker's  Guide  to  Evolutionary  Computation,  over many other
     projects he helped bootstrapping, for example Howard Gutowitz' FAQ on
     Cellular  Automata, available on USENET via comp.theory.cell-automata
     ,to about a dozen of so-called ``multimediagrams'' written  in  HTML,
     the  language  that  builds  the World-Wide Web. The most useful ones
     being ENCORE, the EVOLUTIONARY COMPUTATION  Repository  Network  that
     today,  after  several years of weekend hacking, is accessible world-
     wide. And the latest additions: Zooland, the definite  collection  of
     pointers to ARTIFICIAL LIFE resources on the 'net; And Webland a KISS
     model of the Internet at large, including ``guided tours'' across the
     myriads of info-bits out there.

     With  Adam  Gaffin, a former senior newspaper reporter from Middlesex
     News, Boston, MA, who is now with Networks World, he edited the  most
     read book on Internet, that was launched by a joined venture of Mitch
     Kapor's Electronic Frontier Foundation (EFF) and the  Apple  Computer
     Library,  initially  called  Big Dummy's Guide to the Internet it was
     later renamed to EFF's (Extended) Guide to the Internet: A round trip
     through  Global  Networks,  Life in Cyberspace, and Everything...  If
     you want to find it, just fire  up  Netscape  and  select  About  the
     Internet  from  the  Directory  menu...voila! (which means: there you
     go!)

     Since a very special event, he  has  severe  problems  to  take  life
     seriously,   and   consequently   started   signing  everything  with
     ``-joke'', while developing a liquid  fixation  on  all  flavours  of
     whiskey.  He  continues to write short stories, novels and works on a
     diary-like lyrics collection  of  questionable  content,  entitled  A
     Pocketful  of Eloquence, which recently was remaned to Heartland, and
     published on the web as: http://surf.de.uu.net/heartland/

     He likes Mickey Rourke's movies (especially Rumblefish  and  Barfly),
     Edmund  Spenser's medieval poetry, McDonald's Hamburgers, diving into
     the analysis of complex systems of any kind, (but prefers  the  long-
     legged  ones) and the books by Erasmus of Rotterdam, Robert Sheckley,
     Alexei Panshin, and, you name it, Douglas Adams.

     Due to circumstances he leads a life on the edge, is usually  single,
     would love to have children, but has none (he'd know of), has no time
     to get married, and still doesn't live in Surrey.   He  is  known  to
     reject  job  offers that come bundled with Porsches and still doesn't
     own a BMW Z3 roadster, for he recently purchased a 1996 MGF.

  NOTABLE WRITINGS
     Nothing really worth listing here.

 David Beasley,
     was born in London, England in 1961. He was educated  at  Southampton
     University where he read (for good reasons) Electronic Engineering.

     After  spending  several  years  at sea, he went to the Department of
     Computing Mathematics of the University of Wales, Cardiff,  where  he
     studied  ARTIFICIAL INTELLIGENCE for a year. He then went on to write
     a thesis on GAs applied to Digital Signal Processing,  and  tried  to
     break Joke's publications record.

     Since  a  very special event, he has taken over writing this FAQ, and
     consequently started signing everything with ``The FAQmaster''  (He's
     had  severe problems taking life seriously for some time before that,
     however.) He likes Woody Allen's movies, English clothing of medieval
     times, especially Marks and Spencer, hates McDonald's Hamburgers, but
     occasionally dives into the analysis of complex systems of any  kind,
     (but  prefers  those with pedals and handlebars) and the books by (of
     course) Douglas Adams.
     He is not married, has no children, and also  also  doesn't  live  in
     Surrey.

     He  now  works  for a (mostly interesting) software company, Praxis (
     http://www.praxis.co.uk/ ) in Bath, England.

  NOTABLE WRITINGS
     A number of publications related to  GENETIC  ALGORITHMs.   The  most
     notable ones being:

     A  Sequential  Niche  Technique for Multimodal Function Optimization,
     Evolutionary Computation, 1(2)  pp  101-125,  1993.   Available  from
     ralph.cs.cf.ac.uk:/pub/papers/GAs/seq_niche.ps

     Reducing  Epistasis in Combinatorial Problems by Expansive Coding, in
     S. Forrest (ed), Proceedings of the Fifth International Conference on
     Genetic  Algorithms,  Morgan-Kaufmann,  pp  400-407, 1993.  Available
     from ralph.cs.cf.ac.uk:/pub/papers/GAs/expansive_coding.ps

     An Overview of Genetic Algorithms: Part 1,  Fundamentals,  University
     Computing,  15(2)  pp 58-69, 1993.  Alailable from ENCORE (See Q15.3)
     in        file:        GA/papers/over93.ps.gz         or         from
     ralph.cs.cf.ac.uk:/pub/papers/GAs/ga_overview1.ps

     An   Overview   of  Genetic  Algorithms:  Part  2,  Research  Topics,
     University Computing, 15(4) pp 170-181, 1993.  Available from  Encore
     (See    Q15.3)    in    file:    GA/papers/over93-2.ps.gz   or   from
     ralph.cs.cf.ac.uk:/pub/papers/GAs/ga_overview2.ps

			       THAT'S ALL FOLKS!

	"And all our yesterdays have lighted fools the way to dusty death;
		       out, out brief candle; life's but a walking shadow;
	      a poor player that struts and frets his hour upon the stage;
					       and then is heared no more;
					   it is a tale; told by an idiot,
						   full of sound and fury,
						      signifying nothing."

						  --- Shakespeare, Macbeth

------------------------------

     Copyright (c) 1993-1997 by J. Heitkoetter and D. Beasley, all  rights
     reserved.

     This  FAQ  may be posted to any USENET newsgroup, on-line service, or
     BBS as long as it  is  posted  in  its  entirety  and  includes  this
     copyright  statement.   This FAQ may not be distributed for financial
     gain.  This FAQ may not be  included  in  commercial  collections  or
     compilations without express permission from the author.

End of ai-faq/genetic/part6
***************************



Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - MultiPage

------------------------------------------------
[ By Archive-name | By Author | By Category | By Newsgroup ]
[ Home | Latest Updates | Archive Stats | Search | Usenet References | Help ]

------------------------------------------------

Send corrections/additions to the FAQ Maintainer:
David.Beasley@cs.cf.ac.uk (David Beasley)

Last Update December 18 1997 @ 02:12 AM

faq-admin@faqs.org