An Improved Genetic Algorithm for the Inference of Finite State Machine

N. Niparnan and P. Chongstitvatana
Chulalongkorn University

Present at Int. Conf. on System, Man and Cybernetics, Tunisia, 6-9 October, 2002.

time 15 min. question & answer 5 min.

<intro>
Good afternoon everybody.  I came from Chulalongkorn University, Thailand.  I am going to describe my work on the inference of FSMs.  I will explain what the problem is:

<page 1>
Here is an unknown FSM, only its input and output can be observed.  "Inference" means a learning process that "infer" the structure of FSM from its input/output.  Because the structure of the target machine is not known, we can never be sure that the inferred machine is the same as the target.  Only that its input/output is consistent with the target machine.

<page 2>
This problem is a hard problem.  In fact, it is proven to be NP-complete.  Genetic Algorithms are among the popular method to solve this problem.  We propose a new and efficient GA for this problem.  The problem statement is as follows:

Given the input/output sequence set which is randomly generated from a target machine, find a Mealy machine that is consistent with this set.

A Mealy machine is the machine that its output depends on its state.

<page 3>
The basic approach is to use GA to evolve the machine.  We need to encode the transition function and the output function.  The fitness function for GA is the number of similar bit of output between the evolved machine and the target machine.  The input sequence is fed into the FSM under evolution and its output is compared to the output of the target machine.

The highest score occurs when every bits of both machines are the same.  This scheme is reasonable but it is not very effective.

<page 4>
See this simple example:  these two machines have similar structures, however the output of one machine is compliment of the other.  Using the simple fitness function that compare these two machines will result in the fitness value zero.  This observation suggests that inferring the structure should be separated from inferring the output.

<page 5>
We propose a new method as follows:  
Only the structure of the machine is evolved, the output function is assigned.  Therefore the encoding contains only the transition function.  The method to define the output function will be explained along with the modified fitness function.

<page 6>
Assignment of the output is best illustrated by a simple example:  here is a machine with 2 states.  The input is labelled zero, one.  The output of this machine is assigned to be most consistent with the given input/output sequence. We count the number of occurrence of each output, for example, there are three occurrences of input 0 / output 0 at state X.

<page 7>
Now let us label the output of each transition a, b, c, d and from this count we assign the output to the maximum count.

a)  no conflict between output 0 and output 1, assign 0
b)  no count, any value can be assigned
c)  assign 1
d)  there is a conflict, the max. count is at the output 0, so assign 0.

Assigning the output this way we will have a machine that is most similar to the target.  The sum of the assigned count becomes the fitness value of that machine.  In this example the fitness value is 6.

This fitness function reflects the fact that for non-conflicting transition the frequency of use is rewarded.  For conflicting transition, the degree of conflict affects the fitness as the count is shared among the conflicts, it is rewarded lower score.  

<page 8>
In summary the improvement of the new method is two folds:  
one, it improves the evaluation, the fitness function is more accurate in guiding the search.
two, it reduces the effort in evolution as the output is not evolved.

<page 9>
We performed the experiment on three sets of randomly generated FSMs.  Shown in this table.  They are divided according to their size into three groups:  small, medium and large with different number of outputs.  The size of evolving FSM is given to be slightly larger than the target machine.

Each FSM has 20 input/output sequences of length 30.  Result in 600 pairs of input/output sequences.
To run GA we use the following parameters: population size 100, we use single point crossover with probability 0.5, the mutation rate is 0.1.  Each experiment is repeated 10 times.  We are comparing the simple GA with the proposed method.

<page 10>