Introduction to Genetic Algorithms
Lecture 10 Design and Analysis of Algorithms 2002
Prabhas Chongstitvatana 12 September 2002
Motivation
What are Genetic Algorithms
Representation
Evaluation
Selection
Recombination
Mutation
Schema theory
Summary
References
Motivation
How do you write a program to solve a hard problem? For a large class
of problems there are known algorithms, and many of these algorithms have
complexity analyses, so we can choose the appropriate algorithm for the
problem. However, many interesting problems have no known algorithm that
solves them. Sometimes an obvious algorithm exists but is not effective
(it takes too long for problem sizes of interest). In this case, if the
problem is well studied, knowledge about its structure can be exploited
to solve it. This class of problems is suitable for heuristics. Problems
in artificial intelligence (the science of studying how to make a machine
think) belong to this class. Designing heuristics is not easy (as anybody
who works in AI will tell you), and a good heuristic is often not obvious.
It is frequently the case that only experts can come up with good
heuristics for a problem. If you are not an expert in a particular
problem which has no known algorithm to solve it, what will you do?
A class of algorithms named Evolutionary Algorithms (EA) is suitable
for such situations. EAs have been evolving to solve many hard real-world
problems since the 1960s, and a large body of work has accumulated
in the past 20 years. This class of algorithms employs randomization.
The best-known method in this class is the Genetic Algorithm (GA), invented
by John Holland of the University of Michigan in the early 1970s. The idea behind
GA is very elegant, and the whole process can be explained quite easily
to any non-expert. However, it is not easy to explain why it
works.
EAs belong to the class of weak search methods. The algorithm (in its general
form) does not depend on the problem. That means it will work for
any problem. (You should be skeptical: it cannot work well
for all of them! That is the reason why it is called a weak search method.)
Strong search methods, by contrast, are problem dependent. In
general, for a problem that has a strong search method, a weak search
method will not be as efficient (but the weak search method should be easier
to set up). A result called the No Free Lunch (NFL) theorem states that
no search algorithm can outperform all others across all problems
(it seems obvious, but it is worth stating).
What are Genetic Algorithms
GA is a search method inspired by natural evolution. Charles Darwin
argued in his book "On the Origin of Species" (1859) that individuals better
adapted to their environment are more likely to survive and leave offspring,
while the poorly adapted die out. < *** must
use the exact quote > The GA search method is based on a population of
solutions. Initially these solutions are random; the solutions are
evaluated for their "fitness" (their merit, i.e. how close they are to the answer).
The "good" solutions are selected to be parents of the next generation.
These selected parents are varied using so-called genetic operators,
and the results become the next generation's population. This cycle
is repeated until the answer is found or some stopping criterion is met (the best
solution is good enough, or the time limit is exceeded).
  +--> solutions
  |        |
  |        v
  |    evaluation
  |        |
  |        v
  +-- improvement
The genetic operators are selection, recombination, and mutation. There
are many variations of these operators. The evaluation depends on the problem.
Sometimes the evaluation is accurate; sometimes it can only give an approximation,
or merely a guide to how one solution compares with others. This is the
only part of the algorithm that depends on the problem.
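As a sketch of the whole cycle, here is a minimal GA in Python. All the numbers (population size, string length, operator rates) are illustrative assumptions, the toy fitness is OneMax (counting 1 bits), and it assumes at least one initial solution has nonzero fitness:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def run_ga(fitness, length=20, pop_size=30, generations=100,
           p_cross=0.7, p_mut=0.01):
    """Minimal Simple-GA loop: evaluate, select, recombine, mutate."""
    # Initial population: random bit lists.
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(s) for s in pop]
        nxt = []
        while len(nxt) < pop_size:
            # Fitness-proportionate selection of two parents (copied).
            a, b = random.choices(pop, weights=scores, k=2)
            a, b = a[:], b[:]
            if random.random() < p_cross:        # one-point crossover
                cut = random.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                 # single-point mutation
                if random.random() < p_mut:
                    i = random.randrange(length)
                    child[i] ^= 1
            nxt += [a, b]
        pop = nxt[:pop_size]
    scores = [fitness(s) for s in pop]
    return pop[scores.index(max(scores))]

best = run_ga(sum)   # OneMax: fitness = number of 1 bits
```

With these settings the loop reliably drives the population toward the all-ones string, though nothing in the algorithm guarantees it.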
To see how GA solves a problem, the starting point is how to encode
solutions in the population. I will explain a particular method in GA
called the Simple Genetic Algorithm (SGA) (as described in Goldberg 1989),
which is a reference method in this field.
Representation
A solution is represented by a binary string in {0,1}*. The length of the
string depends on the problem. For example, for a sorting problem,
let each element be in the range 0..255 and the number of elements be 100;
then a solution can be represented by a binary string of 8 bits x 100 =
800 bits. For the 8-queens problem, which encodes a solution as a vector
of column numbers (such as 1,3,7,6,2,5,4,8), a solution can be encoded
as 3 bits x 8 = 24 bits. For a subset-selection problem, a solution
is already a binary string (1 = selected, 0 = not selected). The length of
the solution affects the time to find the answer. It is conjectured that
for SGA the running time is exponential in the length of the solution.
< *** must check this fact and give reference>.
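The 24-bit encoding of the 8-queens vector can be sketched as follows (the helper names encode/decode are mine, not a standard API; each column 1..8 is stored as 3 bits by subtracting 1):

```python
def encode(queens):
    """Encode an 8-queens solution (columns 1..8) as a 24-bit string."""
    return ''.join(format(q - 1, '03b') for q in queens)

def decode(bits):
    """Recover the column vector from a 24-bit string."""
    return [int(bits[i:i + 3], 2) + 1 for i in range(0, len(bits), 3)]

s = encode([1, 3, 7, 6, 2, 5, 4, 8])   # a 24-character 0/1 string
```

The same idea (fixed-width bit fields per element) gives the 800-bit encoding for the sorting example.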
Evaluation
The evaluation of the goodness of a solution depends on the problem.
For sorting into ascending order, the number of adjacent pairs that are
already in ascending order can be used; the higher the better,
and for n elements the highest score is n - 1. The number of pairs of
attacking queens can be used for the 8-queens problem. In this case, a lower
number is better; the best is 0.
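The two evaluation functions just described can be sketched as (function names are illustrative):

```python
def sortedness(xs):
    """Count adjacent pairs already in ascending order; max is len(xs) - 1."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a <= b)

def attacking_pairs(cols):
    """Count pairs of queens that attack each other; 0 is best.
    cols[i] is the column (1..n) of the queen in row i."""
    n = len(cols)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if cols[i] == cols[j]                      # same column
               or abs(cols[i] - cols[j]) == j - i)        # same diagonal

sortedness([1, 2, 3])                    # -> 2 (both pairs in order)
attacking_pairs([1, 3, 7, 6, 2, 5, 4, 8])
```

Note the opposite polarities: `sortedness` is maximized, `attacking_pairs` is minimized, so one of them must be inverted before fitness-proportionate selection.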
Selection
The selection in SGA is called fitness-proportionate selection. Each
solution is selected with a probability that depends on its fitness.
There are many ways to scale this probability distribution.
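Fitness-proportionate ("roulette-wheel") selection can be sketched as follows, assuming non-negative fitnesses with a positive total:

```python
import random

def fitness_proportionate(pop, scores):
    """Select one solution with probability proportional to its score."""
    total = sum(scores)
    r = random.uniform(0, total)   # a random point on the "wheel"
    acc = 0.0
    for sol, score in zip(pop, scores):
        acc += score
        if acc >= r:
            return sol
    return pop[-1]                 # guard against float rounding

random.seed(1)
picks = [fitness_proportionate(['a', 'b'], [1.0, 3.0]) for _ in range(1000)]
```

Here 'b' has three times the fitness of 'a', so it should be picked roughly three quarters of the time.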
Recombination
The recombination operator combines two solutions to form two new
solutions (mixing the genes). In SGA, the recombination operator
is one-point crossover. It works by selecting a random point in the
string, cutting both parents at that point, and exchanging the substrings.
For example:
  123|4567          123|defg
            ==>
  abc|defg          abc|4567
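One-point crossover can be sketched as (assuming equal-length parent strings):

```python
import random

def one_point_crossover(p1, p2):
    """Cut both parents at the same random point and swap the tails."""
    cut = random.randrange(1, len(p1))   # cut strictly inside the string
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

random.seed(2)
c1, c2 = one_point_crossover("1234567", "abcdefg")
# e.g. with cut = 3: c1 == "123defg", c2 == "abc4567"
```

Because the same cut point is used for both parents, every symbol of the parents appears exactly once across the two children.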
It is conjectured that there are substrings of solutions that are good (they
appear in the answer); these are called Building Blocks. The recombination
operator "assembles" them together. This conjecture is called the Building
Block Hypothesis (BBH). <give reference>
Mutation
The mutation operator in SGA is called single-point mutation. It
works by flipping one random bit in a solution. Mutation introduces
new genes into the gene pool. This helps to avoid getting stuck at "local
minima", that is, points where the search cannot proceed any further because
no local improvement is possible; mutation effects a kind of "jump" out of
them. In nature, mutation is almost always fatal. However, nature
has protection against mutation through redundancy: not all genes are expressed
(take effect). When a mutation occurs in a non-expressed gene it
is not fatal.
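The single-point mutation described above can be sketched as (operating on a list of bits; the function name is mine):

```python
import random

def single_point_mutation(bits):
    """Flip one randomly chosen bit; a new list is returned."""
    out = bits[:]
    i = random.randrange(len(out))
    out[i] ^= 1
    return out

random.seed(3)
m = single_point_mutation([0, 0, 0, 0, 0, 0, 0, 0])
# exactly one position of m is now 1
```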
A model of behavior of SGA: Schema theory
To understand GA as a dynamic process, it can be modelled by considering
the change in the number of genes through time. Schema theory offers
such an explanation. A schema is a pattern in {0,1,*}*, where '*' means either
0 or 1. A schema captures the concept of a set of strings: for example,
the schema '1*****' is the set of all strings of length 6 that begin
with 1, and the schema '**100' is the set of all strings of length
5 that end with 100. A bit that is 0 or 1 is called "fixed". The
number of fixed bits is defined as the order of the schema, oh(.); for example
oh('1**0') = 2. The defining length of a schema is the distance
between the first fixed bit and the last fixed bit, del(.); for example
del('*1001') = 3 and del('0***') = 0. The number of
strings in the population that match a schema H at generation t is denoted by m(H,t).
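The order and defining length definitions can be sketched as, using the document's own examples:

```python
def order(schema):
    """oh(H): the number of fixed (0/1) positions in the schema."""
    return sum(c != '*' for c in schema)

def defining_length(schema):
    """del(H): distance between the first and last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

order('1**0')             # -> 2
defining_length('*1001')  # -> 3
defining_length('0***')   # -> 0
```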
Schema theory captures how the genetic operators change m(H,t).
First, consider selection alone:
m(H,t+1) >= m(H,t) f(H)/f*                              (1)
where f(H) is the average fitness of the strings matching H and f* is the
average fitness of the whole population. If the fitness of a schema is higher
than the average, its count will increase in the next generation, and vice versa.
To incorporate the recombination operator into the equation, consider the
destructive effect of one-point crossover. If the crossover point
falls "inside" the schema, the schema is destroyed. The probability
of this happening is:
del(H)/(l-1)
where l is the length of the string.
For example, if all bits are fixed then del(H) = l-1 and the probability of
destruction is 1 (always); if only one bit is fixed then del(H) = 0 and the
probability is 0 (never). A longer schema (higher del) has more chance of being
destroyed. Of course, it is possible that the recombination operator
creates a new string matching the schema, but this is a second-order
effect and we ignore it here. (A more exact analysis is possible but
outside the scope of this course.) Incorporating this into eq. 1,
with Pc the probability of applying the recombination operator:
m(H,t+1) >= m(H,t) f(H)/f* [1 - Pc del(H)/(l-1)]        (2)
Now consider the effect of mutation. For single-bit mutation, the
probability of a schema being destroyed depends on its order oh(H):
the more fixed bits, the higher the chance that the schema is destroyed by
mutation. With Pm the probability of applying the mutation operator:
m(H,t+1) >= m(H,t) f(H)/f* [1 - Pc del(H)/(l-1) - Pm oh(H)]    (3)
This final eq. 3 shows that a short (small del), low-order (small oh),
fitter-than-average schema will survive and increase its count exponentially.
(It is exponential because the increase is similar to compound interest: it
grows by (1 + p)^t, where p is the proportion of increment, 0..1, and
t is the number of generations.) Such a schema is a Building Block.
There is a related analysis in game theory showing that exponentially increasing
trials is the best policy for the problem of exploration vs. exploitation.
In the context of GA, the effort is initially spent on exploration,
to find out which schemata are good. Exploitation then follows,
by using those schemata over and over to gain fitness.
Please note that eq. 3 is an inequality, not an exact equation, because there
are second-order effects (such as the creation of new schema instances) that
are not considered. An exact schema theory for SGA now exists. <ref. Vose text book>
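To see the growth numerically, one can plug illustrative numbers into the schema-theorem bound (written here in its standard form, with Pc and Pm scaling the two disruption terms; all values below are made up for illustration):

```python
def schema_bound(m, f_H, f_avg, d, o, l, p_c, p_m):
    """Lower bound on m(H, t+1): selection gain times survival probability.
    d = del(H), o = oh(H), l = string length."""
    return m * (f_H / f_avg) * (1 - p_c * d / (l - 1) - p_m * o)

# A short, low-order, above-average schema grows in expectation:
b = schema_bound(m=10, f_H=1.2, f_avg=1.0, d=2, o=3, l=20,
                 p_c=0.7, p_m=0.01)
# b is about 10.76 > 10, so the schema's count is expected to increase
```

Repeating this gain over generations gives the compound-interest growth described above.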
Summary
In summary, GA is one answer for solving a hard problem for which no known
algorithm exists. It is also appropriate when no expert is available
to tune the complex parameters required by heuristic algorithms for that
problem. GA is also easy to implement; in its general form it is so
simple that it can be understood by a non-computer-science person. However,
GA usually consumes large computational resources and is not guaranteed to give
an answer. GA is suitable for two kinds of task:
- innovation -- for design problems, to find the best combination
of components given constraints.
- optimization -- largely in domains where many optimization techniques
already exist for the standard problems, but there are always new problems
for which no other technique is available.
The innovation aspect is quite intriguing and will be the main driving force
for the use of GA, and EA in general, in the future.
Future revision: To be added
hyperplane explanation of the search behaviour of GA, some graphical interpretation,
implicit parallelism, k-armed bandit theory.
References
(to be added)
14 September 2002
P. Chongstitvatana