code generator < read my textbook Chapter 4> First, the parser converts the input source program into an intermediate form, called intermediate code. Then the code generator takes this intermediate code and generates a machine code. For our purpose, we define S-code as the target intruction set. S-code S-code is a stack-based instruction set. It is somewhat different from common processor instruction (typify by Intel instruction set). While a common instruction set today is a three-address instruction, S-code is zero-address instruction. S-code is chosen because it is well-matched to N-code which simplify the task of code generation. The main working storage for S-code is a stack. It is called "evaluation stack" and is comparable to the registers in a processor. A stack has two operations on its data: push and pop. So, it is not necessary to "address" the stack (hence the name "zero-address"). For example, to perform an add operation, two arguments are pushed to the stack and the operator "add" is executed. The operator "add" takes two arguments from the stack, adding them and pushes the result back. To store local variables, S-code employs a data structure called "activation record". It is a structure that exists only when a function is called and it is deleted when a function is terminated (that is why it is a "local" data). An evaluation stack is "global", it exists from the beginning of the execution of a program until the program terminates. Read detailed description of S-code here: http://www.cp.eng.chula.ac.th/faculty/pjw/project/som/s-code.htm Code generation scheme the input is N-code (object file of the parser) the output is S-code (object file display on the screen) To "run" the S-code, we need to have a virtual machine (Som virtual machine) or an actual processor (SX chip). We will not concern how to "execute" S-code at the moment. Table 4.1 Mapping between N-code and S-code n-code s-code xLIT.a icLit.a xGET.a icGet.a xPUT.a icPut.a xLD.a icLd.a (xADD e1 e2 e1 e2 icAdd (xST.a e) e icSt.a (xLDX.a e) e icGet.a icLdx (xSTX.a e v) e v icGet.a icStx (xLDY.a e) e icLd.a icLdx (xSTY.a e v) e v icLd.a icStx (xFUN.a.v e) icFun.k e icRet.m where k = v-a+1, g = v+1 (xCALL.a e...) e ... icCall.a (xIF e1 e2 e3) e1 icJf F e2 icJmp E F: e3 E: (xWHILE e1 e2) L: e1 icJf E e2 icJmp L E: or better icJmp I L: e2 I: e1 icJt L (xDO e1 e2 ...) e1 e2 Here are the examples of generating S-code from N-code (nut object). It uses the same program from the "parsing-example": (def sq x () (* x x )) (fun.1.1 (* get.1 get.1 )) 1 fun.1.1 2 get.1 3 get.1 4 add 5 ret (def assign () (a b) (set a (+ b 1))) (fun.0.2 (put.1 (+ get.2 lit.1 ))) 1 fun.0.2 2 get.2 3 lit.1 4 add 5 put.1 6 ret (def parseControl () (i j k) (do (while (< i 10) (set i (+ i 1))) (if (= j 2) (set k 20) ; else (set k 10)))) (fun.0.3 (do (while (< get.1 lit.10 ) (put.1 (+ get.1 lit.1 ))) (if (= get.2 lit.2 ) (put.3 lit.20 )(put.3 lit.10 )))) 1 fun.0.3 2 get.1 3 lit.10 4 lt 5 jf.11 6 get.1 7 lit.1 8 add 9 put.1 10 jmp.2 11 get.2 12 lit.2 13 eq 14 jf.18 15 lit.20 16 put.3 17 jmp.20 18 lit.10 19 put.3 20 ret (let arrayA g) (def simple () (a b) (do (set g 1000) ; global (set a g) (set b 11) (set arrayA (new 10)) (setv arrayA 1 20) ; arrayA[1] = 20 (set b (vec arrayA 1)))) ; b = arrayA[1] ] in file "global.txt" e:>nut32 < global.txt arrayA AT 2000 g at 2001 (fun.0.2 (do (st.1 lit.1000 ) (put.1 ld.1 )(put.2 lit.11 ) (st.0 (new lit.10 )) (sty.0 lit.1 lit.20 ) (put.2 (ldy.0 lit.1 )))) 1 fun.0.2 2 lit.1000 3 st.1 4 ld.2001 ; ld.1 load g 5 put.1 6 lit.10 7 array 8 st.2000 9 lit.1 10 lit.20 11 ld.2000 12 stx ; sty.2000 lit.1 lit.20 13 ld.2000 14 lit.1 15 ldx ; ldy.2000 lit.1 16 put.2 17 ret