Code Generation


< read my textbook Chapter 4>

First, the parser converts the input source program into an intermediate form, called intermediate code.  Then the code generator takes this intermediate code and generates a machine code.  For our purpose, we define S-code as the target intruction set.

S-code

S-code is a stack-based instruction set.  It is somewhat different from common processor instruction (typify by Intel instruction set).  While a common
instruction set today is a three-address instruction, S-code is zero-address instruction.  S-code is chosen because it is well-matched to N-code which simplify the task of code generation.The main working storage for S-code is a stack.  It is called "evaluation stack" and is comparable to the registers in a
processor. A stack has two operations on its data: push and pop.  So, it is not necessary to "address" the stack (hence the name "zero-address"). 

For example, to perform an add operation, two arguments are pushed to the stack and the operator "add" is executed.  The operator "add" takes two
arguments from the stack, adding them and pushes the result back. To store local variables, S-code employs a data structure called "activation record".
It is a structure that exists only when a function is called and it is deleted when a function is terminated (that is why it is a "local" data).  An evaluation stack is "global", it exists from the beginning of the execution of a program until the program terminates.

Read detailed description of S-code here:   http://www.cp.eng.chula.ac.th/faculty/pjw/project/som/s-code.htm

Code generation scheme

the input is  N-code  (object file of the parser)
the output is S-code  (object file display on the screen)

To "run" the S-code, we need to have a virtual machine (Som virtual machine) or an actual processor (SX chip).  We will not concern how to "execute" S-code at the moment.

Table 4.1  Mapping between N-code and S-code

n-code                  s-code

xLIT.a                  icLit.a
xGET.a                  icGet.a
xPUT.a                  icPut.a
xLD.a                   icLd.a
(xADD e1 e2             e1 e2 icAdd
(xST.a e)               e icSt.a
(xLDX.a e)              e icGet.a icLdx
(xSTX.a e v)            e v icGet.a icStx
(xLDY.a e)              e icLd.a icLdx
(xSTY.a e v)            e v icLd.a icStx
(xFUN.a.v e)            icFun.k e icRet.m
                        where k = v-a+1, g = v+1
(xCALL.a e...)          e ... icCall.a
(xIF e1 e2 e3)
                        e1
                        icJf F
                        e2
                        icJmp E
                     F: e3
                     E:
(xWHILE e1 e2)   
                     L: e1

                        icJf E
                        e2
                        icJmp L
                     E:
    or better
                        icJmp I
                     L: e2
                     I: e1
                        icJt L
(xDO e1 e2 ...)         e1  e2


Here are the examples of generating S-code from N-code (nut object).  It uses the same program from the "parsing-example":

1  Simple function
(def sq x () (* x x ))        Nut program

(fun.1.1 (* get.1 get.1 ))    N-code

1 fun.1.1                     S-code
2 get.1
3 get.1
4 mul
5 ret

2   Simple assignment
(def assign () (a b) (set a (+ b 1)))

(fun.0.2 (put.1 (+ get.2 lit.1 )))

1 fun.0.2
2 get.2
3 lit.1
4 add
5 put.1
6 ret

3  Control flow
(def parseControl () (i j k)
  (do
    (while (< i 10)
       (set i (+ i 1)))
    (if (= j 2)
       (set k 20)
       ; else
       (set k 10))))

(fun.0.3 (do (while (< get.1 lit.10 )
(put.1 (+ get.1 lit.1 )))
(if (= get.2 lit.2 )
(put.3 lit.20 )(put.3 lit.10 ))))

1 fun.0.3
2 get.1
3 lit.10
4 lt
5 jf.11 <out>
6 get.1
7 lit.1
8 add
9 put.1
10 jmp.2 <loop>
11 get.2
12 lit.2
13 eq
14 jf.18 <else>
15 lit.20
16 put.3
17 jmp.20 <exit>
18 lit.10
19 put.3
20 ret

4  Global variables and indexing
(let arrayA g)
(def simple () (a b)
  (do
    (set g 1000)              ; global
    (set a g)
    (set b 11)
    (set arrayA (new 10))
    (setv arrayA 1 20)        ; arrayA[1] = 20
    (set b (vec arrayA 1))))  ; b = arrayA[1]

arrayA    at 2000
g         at 2001
(fun.0.2 (do (st.1 lit.1000 )
(put.1 ld.1 )(put.2 lit.11 )
(st.0 (new lit.10 ))
(sty.0 lit.1 lit.20 )
(put.2 (ldy.0 lit.1 ))))

1 fun.0.2
2 lit.1000
3 st.1
4 ld.2001   ; ld.1  load g
5 put.1
6 lit.11
7 put.2
8 lit.10

9 array
10 st.2000
11 lit.1
12 lit.20
13 ld.2000
14 stx      ; sty.2000 lit.1 lit.20
15 ld.2000
16 lit.1
17 ldx      ; ldy.2000 lit.1
18 put.2
19 ret


End
last update 30 June 2009