Beginning
with Parse Tree
Abstract Interpretation
Interpreter (take a parse tree and "run" it)
Machine Abstraction
Machine Instructions (what kind of machine to "run" the
abstract program)
Code Generation
Parse Tree
data struction: list consists
of operator and operand*
Operator
binary, unary, n-ary, function call
Operand
constant, local variable, global variable (include vector)
Illustrative example
from the source language a = b + c
an output parse tree looks like:
= / \ a + / \ b c
Assume we use a list to represent a tree, the first element is an operator,
the rest is the operands.
(= a (+ b c ))
This data structure is built from a number of cells. A cell contains two
fields, head and tail. There are two types of cell: atom and
dot-pair. An atom is the end of a branch (a kind of leaf node).
A dot-pair is a pointer to another list. The above example can be draw
like this:
The cell that pointed to = and a are atoms. The cell pointed to
the list (+ b c) is a dot-pair. How to distinguish a cell and a
dot-pair depends on the choice of implementation of the data structure.
To help you to understand this structure a bit more, we will write a program
to copy a parse tree.
where isatom() checks whether a is an atom. newatom() creates a new
atom. head() extracts the head field of a cell. tail() extracts
the tail field of a cell. cons() is a constructor, it creates a new dot-pair
with the head and tail instantiated by the first and second arguments.
Interpreter
virtual machine
traverse a parse tree and execute the operator. The function
go() traverses the parse tree.
go(e) if (e == nil) stop if isatom(e) do_atom(e) // then e is a
list (operator operand*) a = head(e) b = tail(e) switch typeof(a) PLUS: go(head(b))
+ go(second(b)) MINUS: ...
Machine
stack-based (First In First Out, FIFO)
the main runtime data structure to support function call is
"activation record"
activation record stores the state of computation: Instruction
pointer, Old activation record, Local variables (registers)
Machine Instructions
S-code
is a stack-based instruction set. It is somewhat different from
common processor instruction (typify by Intel instruction set).
While a common instruction set today is a three-address instruction,
S-code is zero-address instruction. The main working storage for
S-code is a stack. It is called "evaluation stack" and is
comparable to the registers in a processor. A stack has two operations
on its data: push and pop. So, it is not necessary to "address"
the stack (hence the name "zero-address").
For
example, to perform an add operation, two arguments are pushed to the
stack and the operator "add" is executed. The operator "add" takes
two arguments from the stack, adding them and pushes the result back. To
store local variables, S-code employs a data structure called
"activation record". It is a structure that exists only when a function
is called and it is deleted when a function is terminated (that is why
it is a "local" data). An evaluation stack is "global", it exists
from the beginning of the execution of a program until the program
terminates.