Genetic Algorithms

Genetic Algorithms (GAs) are a class of methods that search for solutions by iteratively improving the set of solutions at hand.  The initial set of solutions can be randomly generated.  GAs can be applied to a large class of problems.

GA is motivated by natural genetics.  It "evolves" a set of solutions by "selecting" good solutions and "recombining" them to create better
solutions.  Selection is guided by a "fitness" measure.

The pseudocode of a GA is as follows:

p = initial population (set of solutions)
while not terminating
  p' = select good solutions from p
  p'' = recombine p'
  p = p''


Mutation can be introduced after the recombination.

To illustrate a concrete algorithm, here is a version of GA called the Simple Genetic Algorithm:

1)  each solution is represented by a binary string over {0,1}*
2)  selection is fitness-proportionate: a fitter solution has a higher chance of being selected.
3)  recombination is by single-point crossover.
4)  mutation is by flipping a random bit.
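
The three operators of the Simple GA listed above can be sketched in Python as follows.  This is a minimal illustration, not a reference implementation: solutions are assumed to be strings of '0' and '1' characters, fitness is assumed to be non-negative with higher meaning better, and the function names are chosen here for illustration only.

```python
import random

def select_proportionate(population, fitnesses, rng):
    """Fitness-proportionate (roulette-wheel) selection of one individual.

    Assumes higher fitness is better and all fitness values are >= 0.
    """
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)       # no information: pick uniformly
    pick = rng.random() * total             # a point on the wheel, in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running > pick:
            return individual
    return population[-1]                   # guard against rounding error

def single_point_crossover(a, b, rng):
    """Cut both parents at the same random point and swap the tails."""
    point = rng.randint(1, len(a) - 1)      # cut strictly inside the string
    return a[:point] + b[point:], b[:point] + a[point:]

def flip_random_bit(bits, rng):
    """Mutation: flip one randomly chosen bit of the string."""
    i = rng.randrange(len(bits))
    flipped = '1' if bits[i] == '0' else '0'
    return bits[:i] + flipped + bits[i + 1:]
```

Passing a random.Random instance as rng (for example random.Random(42)) keeps a run reproducible.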


GA searches by jumping between hyperplanes of the search space, each determined by a "schema".  A schema represents a group of solutions.  A schema is a string over {0,1,*} where * is a "don't care" symbol that matches both 0 and 1.  For example, the set of solutions of length 4 that begin with 1 is written as 1***.
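
To make the schema notation concrete, here is a small Python sketch that tests whether a solution belongs to a schema and enumerates all members of a schema.  The function names matches and members are illustrative choices, not standard terminology.

```python
def matches(schema, solution):
    """Return True if the binary string `solution` belongs to `schema`."""
    if len(schema) != len(solution):
        return False
    return all(s == '*' or s == bit for s, bit in zip(schema, solution))

def members(schema):
    """Enumerate every binary string covered by `schema`."""
    results = ['']
    for s in schema:
        options = '01' if s == '*' else s   # '*' stands for both 0 and 1
        results = [prefix + bit for prefix in results for bit in options]
    return results

print(matches('1***', '1010'))   # True
print(matches('1***', '0010'))   # False
print(len(members('1***')))      # 8
```

A schema with k don't-care symbols covers 2^k solutions, which is why 1*** covers 8 strings.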

Using GA to solve a problem involves these steps:

1)  Determine how a solution is encoded into a binary string.
2)  Choose a selection method and a recombination method.
3)  Set the following parameters:
    3.1  the size of the population (the set of solutions)
    3.2  the percentage of the population that will be recombined
    3.3  the percentage of mutation

To run a GA, the "fitness" function must be defined.  This is problem-dependent.  This function is a measure that distinguishes the better
solutions in the population.  The following example demonstrates how to use GA.

Example.  Find a solution of f(x) = x^3 + 3x^2 + 1 = 0.

Let us use a 32-bit binary string to represent a solution.  Use the Simple GA with population size = 1000, recombination 80%, mutation 1%, and 200 generations before giving up.  This is a minimization problem (find x such that f(x) = 0): the lower the fitness, the better.  The fitness function can be derived from f(x), but f(x) itself can be negative, so its magnitude |f(x)| should be used; it is zero exactly at a solution.
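
The example can be sketched in Python as below.  Several details are not specified in the text and are assumptions made here: the 32-bit string is decoded linearly onto the interval [-10, 10]; the fitness is |f(x)|; because this is a minimization problem, the selection weight is taken as 1 / (1 + fitness); "recombination 80%" is read as the probability that a pair of parents is crossed; and "mutation 1%" is read as the probability that a child has one random bit flipped.  The population size is assumed to be even.

```python
import random

BITS = 32
LO, HI = -10.0, 10.0          # assumed search interval (not given in the text)

def f(x):
    return x**3 + 3 * x**2 + 1

def decode(chrom):
    """Map a 32-bit integer chromosome linearly onto [LO, HI]."""
    return LO + (HI - LO) * chrom / (2**BITS - 1)

def fitness(chrom):
    """|f(x)|: zero exactly at a root, so lower is better."""
    return abs(f(decode(chrom)))

def crossover(a, b, rng):
    """Single-point crossover on the bit patterns of two integers."""
    point = rng.randint(1, BITS - 1)
    low_mask = (1 << point) - 1               # the `point` lowest bits
    high_mask = ~low_mask & (2**BITS - 1)     # the remaining high bits
    return (a & high_mask) | (b & low_mask), (b & high_mask) | (a & low_mask)

def run_ga(pop_size=1000, p_cross=0.8, p_mut=0.01, generations=200, seed=None):
    rng = random.Random(seed)
    pop = [rng.getrandbits(BITS) for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        # Minimization: turn "low is good" into a positive selection weight.
        weights = [1.0 / (1.0 + fitness(c)) for c in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < p_cross:
                a, b = crossover(a, b, rng)
            children.extend([a, b])
        # Mutation: with probability p_mut, flip one random bit of a child.
        pop = [c ^ (1 << rng.randrange(BITS)) if rng.random() < p_mut else c
               for c in children]
        best = min(pop + [best], key=fitness)  # remember the best seen so far
    return decode(best), fitness(best)

if __name__ == "__main__":
    x, err = run_ga(seed=1)
    print(f"x = {x:.6f}, |f(x)| = {err:.3e}")
```

The only real root of f lies near x = -3.104, so a successful run should report a value close to it.  How close depends on the random seed and on the assumptions above; the weight 1 / (1 + fitness) gives only weak selection pressure once most of the population is near the root.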

Genetic operations

    Selection

The simplest kind of selection is fitness-proportionate: the chance of a solution being selected depends on its fitness.  There are many other methods.  In ranking selection, the whole population is sorted according to fitness and the selection probability is assigned according to rank.  Tournament selection is another method: two candidates are sampled from the population and the fitter one is selected.  The size of the tournament can be larger than 2; a larger tournament exerts more "selection pressure".  Diversity can also be taken into account in the selection method.  By combining a diversity measure with fitness, the selection picks solutions that are both highly fit and different from the others in the set.
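
Tournament and ranking selection can be sketched in Python as follows.  The sketch assumes that higher fitness is better and that fitness is passed as a function.  The linear weighting used in rank_select (worst gets weight 1, best gets weight n) is one possible choice made here for illustration; the text does not prescribe how probabilities are assigned to ranks.

```python
import random

def tournament_select(population, fitness, rng, size=2):
    """Sample `size` distinct candidates and return the fittest of them.

    Assumes higher fitness is better.  A larger `size` means stronger
    selection pressure.
    """
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)

def rank_select(population, fitness, rng):
    """Ranking selection: probability depends on rank, not on raw fitness.

    Assumption: linear ranking, where the worst individual gets weight 1
    and the best gets weight len(population).
    """
    ranked = sorted(population, key=fitness)            # worst first
    weights = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=weights, k=1)[0]
```

With this tournament (sampling without replacement), the best individual in a population of n wins whenever it is drawn, which happens with probability size / n.  This makes the effect of tournament size on selection pressure explicit.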

    Recombination

Besides single-point crossover, multiple-point crossover is also possible.  Multiple parents can also be used.  Any recombination method must produce only "valid" offspring.  For example, in the Travelling Salesman Problem (TSP) a "tour" is represented as a string of unique numbers (no duplicates in the same string).  This representation reflects the "combinatorial" nature of the problem (that is, a solution is a permutation of these numbers).  Therefore a normal single-point crossover can produce "invalid" strings.  See the following illustration:

tour 1:   12345678
tour 2:   45673218


If they are crossed in the middle (single-point crossover), the offspring will be

tour 1':  1234|3218
tour 2':  4567|5678


and they are both invalid.  This kind of problem needs special crossover operators, such as OCX or PMX, that preserve the validity of the tour (search the literature for more details).
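
As an illustration of a tour-preserving operator, here is a Python sketch of a simplified order crossover (commonly abbreviated OX), applied to the two tours above.  It is an assumption that this corresponds to the OCX operator mentioned in the text, and the fill rule used here (left to right) is a simplification of the classic version, which fills starting after the second cut point.

```python
def order_crossover(p1, p2, start, end):
    """Simplified order crossover for permutations given as lists.

    Copies p1[start:end] into the child at the same positions, then fills
    the remaining positions, left to right, with the missing cities in the
    order in which they appear in p2.
    """
    segment = p1[start:end]
    kept = set(segment)
    fill = [city for city in p2 if city not in kept]
    return fill[:start] + segment + fill[start:]

def is_valid_tour(tour, cities):
    """A tour is valid if it visits every city exactly once."""
    return sorted(tour) == sorted(cities)

tour1 = list('12345678')
tour2 = list('45673218')

child1 = order_crossover(tour1, tour2, 2, 6)
child2 = order_crossover(tour2, tour1, 2, 6)
print(''.join(child1), is_valid_tour(child1, tour1))   # 72345618 True
print(''.join(child2), is_valid_tour(child2, tour1))   # 14673258 True

# For comparison, the single-point offspring from the text are invalid.
print(is_valid_tour(list('12343218'), tour1))          # False
print(is_valid_tour(list('45675678'), tour1))          # False
```

Each child keeps a contiguous block from one parent and the relative order of the remaining cities from the other, so no city is duplicated or lost.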

Application of GAs

(coming soon)

last update 28 Nov 2012