Code Generation
< read my textbook Chapter 4>
First, the parser converts the input source program into an
intermediate form, called intermediate code. Then the code
generator takes this intermediate code and generates a machine
code. For our purpose, we define S-code as the target intruction
set.
S-code
S-code is a stack-based instruction set. It is somewhat different
from common processor instruction (typify by Intel instruction
set). While a common
instruction set today is a three-address instruction, S-code is
zero-address instruction. S-code is chosen because it is
well-matched to N-code which simplify the task of code generation.The
main working storage for S-code is a stack. It is called
"evaluation stack" and is comparable to the registers in a
processor. A stack has two operations on its data: push and pop.
So, it is not necessary to "address" the stack (hence the name
"zero-address").
For example, to perform an add operation, two arguments are pushed to
the stack and the operator "add" is executed. The operator "add"
takes two
arguments from the stack, adding them and pushes the result back. To
store local variables, S-code employs a data structure called
"activation record".
It is a structure that exists only when a function is called and it is
deleted when a function is terminated (that is why it is a "local"
data). An evaluation stack is "global", it exists from the
beginning of the execution of a program until the program terminates.
Read detailed description of S-code here: http://www.cp.eng.chula.ac.th/faculty/pjw/project/som/s-code.htm
Code generation scheme
the input is N-code (object file of the parser)
the output is S-code (object file display on the screen)
To "run" the S-code, we need to have a virtual machine (Som virtual
machine) or an actual processor (SX chip). We will not concern
how to "execute" S-code at the moment.
Table
4.1 Mapping between N-code and S-code
n-code
s-code
xLIT.a
icLit.a
xGET.a
icGet.a
xPUT.a
icPut.a
xLD.a
icLd.a
(xADD e1 e2
e1 e2 icAdd
(xST.a e)
e icSt.a
(xLDX.a e)
e icGet.a icLdx
(xSTX.a e
v) e v
icGet.a icStx
(xLDY.a e)
e icLd.a icLdx
(xSTY.a e
v) e v
icLd.a icStx
(xFUN.a.v
e) icFun.k
e icRet.m
where k = v-a+1, g = v+1
(xCALL.a
e...) e ... icCall.a
(xIF e1 e2 e3)
e1
icJf F
e2
icJmp E
F: e3
E:
(xWHILE e1 e2)
L: e1
icJf E
e2
icJmp L
E:
or better
icJmp I
L: e2
I: e1
icJt L
(xDO e1 e2 ...)
e1 e2
Here are the examples of generating S-code from N-code (nut
object). It uses the same program from the "parsing-example":
1 Simple function
(def
sq x () (* x x )) Nut program
(fun.1.1 (* get.1 get.1
)) N-code
1
fun.1.1
S-code
2 get.1
3 get.1
4 mul
5 ret
2 Simple assignment
(def
assign () (a b) (set a (+ b 1)))
(fun.0.2 (put.1 (+ get.2 lit.1 )))
1 fun.0.2
2 get.2
3 lit.1
4 add
5 put.1
6 ret
3 Control flow
(def
parseControl () (i j k)
(do
(while (< i
10)
(set i (+ i 1)))
(if (= j 2)
(set k 20)
; else
(set k 10))))
(fun.0.3 (do (while (< get.1
lit.10 )
(put.1 (+ get.1 lit.1 )))
(if (= get.2 lit.2 )
(put.3 lit.20 )(put.3 lit.10 ))))
1 fun.0.3
2 get.1
3 lit.10
4 lt
5 jf.11 <out>
6 get.1
7 lit.1
8 add
9 put.1
10 jmp.2 <loop>
11 get.2
12 lit.2
13 eq
14 jf.18 <else>
15 lit.20
16 put.3
17 jmp.20 <exit>
18 lit.10
19 put.3
20 ret
4 Global variables and indexing
(let
arrayA g)
(def simple () (a b)
(do
(set g
1000)
; global
(set a g)
(set b 11)
(set arrayA
(new 10))
(setv arrayA 1
20) ; arrayA[1] = 20
(set b (vec
arrayA 1)))) ; b = arrayA[1]
arrayA at 2000
g
at 2001
(fun.0.2 (do (st.1 lit.1000 )
(put.1 ld.1 )(put.2 lit.11 )
(st.0 (new lit.10 ))
(sty.0 lit.1 lit.20 )
(put.2 (ldy.0 lit.1 ))))
1 fun.0.2
2 lit.1000
3 st.1
4 ld.2001 ;
ld.1 load g
5 put.1
6 lit.11
7 put.2
8 lit.10
9 array
10 st.2000
11 lit.1
12 lit.20
13 ld.2000
14
stx ; sty.2000 lit.1 lit.20
15 ld.2000
16 lit.1
17
ldx ; ldy.2000 lit.1
18 put.2
19 ret
End
last update 30 June 2009