Beginning
with Parse Tree
Abstract Interpretation
Interpreter (take a parse tree and "run" it)
Machine Abstraction
Machine Instructions (what kind of machine to "run" the
abstract program)
Code Generation
Parse Tree
Data structure: list consists of
operator and operand*
Operator
binary, unary, n-ary, function call
Operand
constant, local variable, global variable (include vector)
Illustrative example
from the source language a = b + c
an output parse tree looks like:
= / \ a + / \ b c
Assume we use a list to represent a tree, the first element is an operator,
the rest is the operands.
(= a (+ b c ))
This data structure is built from a number of cells. A cell contains two
fields, head and tail. There are two types of cell: atom and
dot-pair. An atom is the end of a branch (a kind of leaf node).
A dot-pair is a pointer to another list. The above example can be draw
like this:
The cell that pointed to = and a are atoms. The cell pointed to
the list (+ b c) is a dot-pair. How to distinguish a cell and a
dot-pair depends on the choice of implementation of the data structure.
To help you to understand this structure a bit more, we will write a program
to copy a parse tree.
where isatom() checks whether a is an atom. newatom() creates a new
atom. head() extracts the head field of a cell. tail() extracts
the tail field of a cell. cons() is a constructor, it creates a new dot-pair
with the head and tail instantiated by the first and second arguments.
Interpreter
virtual machine
traverse a parse tree and execute the operator. The function
eval() traverses the parse tree (evaluate).
eval(e) if (e == nil) stop if isatom(e) do_atom(e) // then e is a
list (operator operand*) a = head(e) b = tail(e) switch typeof(a) PLUS:
eval(head(b)) + eval(second(b)) MINUS: ...
Machine Code
The main runtime data structure to support function call is
"activation record". Activation record stores the state of
computation: Instruction pointer, Old activation record, Local
variables (registers). Frame pointer (FP) stores a pointer to current
activation record. Stack pointer (SP) stores the pointer to the current
stack.
Machine Instructions
S2
is a typical instruction set in modern processors. It has three-address
instruction format.
Example of compiling a source to S2 instructions
source (Rz 3.6)
// factorial
fac(n)
if( n == 0 ) return 1
else return n * fac(n-1)
main()
print(fac(7))
Run the compiler: (output parse tree and assembly code)
C:\rz36-3\test>rz36 fac.txt
Parse tree
(fun fac (else (== #1 0 )(return 1 )(return (* #1 (call fac (- #1 1
))))))
(fun main (print (call fac 7 )))
Assembly of S2 .symbol
...
.code 0
mov fp #3500
mov sp #3000
jal rads main
trap r0 #0
:fac
st r1 @1 fp
...
pop sp r1
eq r2 r1 #0
jf r2 L102
mov retval #1
jmp L101
:L102
sub r3 r1 #1
push sp r3
jal rads fac
mul r2 r1 r28
mov retval r2
jmp L101
:L101
...
ld r1 @1 fp
ret rads
:main
st r1 @1 fp
...
mov r1 #7
push sp r1
jal rads fac
trap r28 #1
...
ld r1 @1 fp
ret rads