Code Generation


First, the parser converts the input source program into an intermediate form, called intermediate code.  Then the code generator takes this intermediate code and generates a machine code.  For our purpose, we define S-code as the target intruction set.

S-code

S-code is a stack-based instruction set.  It is somewhat different from common processor instruction (typify by Intel instruction set).  While a common instruction set today is a three-address instruction, S-code is zero-address instruction.  The main working storage for S-code is a stack.  It is called "evaluation stack" and is comparable to the registers in a processor. A stack has two operations on its data: push and pop.  So, it is not necessary to "address" the stack (hence the name "zero-address"). 

For example, to perform an add operation, two arguments are pushed to the stack and the operator "add" is executed.  The operator "add" takes two arguments from the stack, adding them and pushes the result back. To store local variables, S-code employs a data structure called "activation record". It is a structure that exists only when a function is called and it is deleted when a function is terminated (that is why it is a "local" data).  An evaluation stack is "global", it exists from the beginning of the execution of a program until the program terminates.

Read detailed description of S-code here:   http://www.cp.eng.chula.ac.th/faculty/pjw/project/som/s-code.htm

Code generation scheme

the input is the parse tree
the output is S-code  (object file display on the screen)

To "run" the S-code, we need to have a virtual machine (Som virtual machine) or an actual processor (SX chip).  We will not concern how to "execute" S-code at the moment.

Table 4.1  Mapping between the parse tree and S-code

parse tree                  s-code

number                  Lit.a
load local              Get.a
store local             Put.a
load global             Ld.A
store global            e St.A
(+ e1 e2)               e1 e2 Add              
(vec a e)               e Get.a Ldx
(= (vec a e) v)         e v Get.a Stx
(vec A e)               e Ld.A Ldx
(= (vec A e) v)         e v Ld.A Stx
(fun.a.v e)             Fun.k e Ret.m
                        where k = v-a+1, g = v+1
(call a e...)           e ... Call.a
(if e1 e2 e3)
                        e1
                        Jf F
                        e2
                        Jmp E
                     F: e3
                     E:
(while e1 e2)   
                     L: e1

                        Jf E
                        e2
                        Jmp L
                     E:
    or better
                        Jmp I
                     L: e2
                     I: e1
                        Jt L
(do e1 e2 ...)         e1  e2


Here are the examples of generating S-code. 

1  Simple function

sq(x){ return x * x; }

(fun sq (return (* #1 #1 )))   parse tree  

1 fun.1.1                     S-code
2 get.1
3 get.1
4 mul
5 ret

2   Simple assignment

assign(){ a = b + 1; }

(fun assign (= #1 (+ #2 1 )))

1 fun.0.2
2 get.2
3 lit.1
4 add
5 put.1
6 ret

3  Control flow

control(){
   while (i < 10)
       i = i + 1;
   if (j == 2)
       k = 20;
   else
       k = 10;
}

(fun control (do (while (< #1 10 )(= #1 (+ #1 1 )))(else (== #2 2 )
  (= #3 20 )(=
#3 10 ))))

1 fun.0.3
2 get.1
3 lit.10
4 lt
5 jf.11 <out>
6 get.1
7 lit.1
8 add
9 put.1
10 jmp.2 <loop>
11 get.2
12 lit.2
13 eq
14 jf.18 <else>
15 lit.20
16 put.3
17 jmp.20 <exit>
18 lit.10
19 put.3
20 ret

4  Global variables and indexing
ax[10], g;
simple(){
    g = 1000;              // global
    a = g;
    b = 11;
    ax[1] =  20;
    b = ax[1];
}


(fun simple (do (= g 1000 )(= #1 g )(= #2 11 )(= (vec ax 1 )20 )
  (= #2 (vec ax 1))))

ax    at 2000
g     at 2010

  fun.0.2
  lit.1000
  st.1
  ld.2010 
  put.1
  lit.11
  put.2

  lit.1
  lit.20
  ld.2000
  stx
  ld.2000
  lit.1
  ldx
  put.2
  ret


last update 10 Aug 2011