Parsing a Nut program to N-code
First, the parser converts the source program into an intermediate
form, called intermediate code. In Nut, the intermediate code is called
N-code. The structure of N-code resembles that of the source program.
One important rule of thumb for a compiler is that whatever can be done
at compile time should be done then, to save effort at run time (to make
running a program as fast as possible). This is a reasonable strategy
because a program, once it has been compiled, is typically run many
times.
N-code is explained in detail in chapter 2 of my textbook, pages 41-45.
Here is a summary of its instruction set:
-----------------------
N-code instruction set
The N-code instruction set defines the internal representation of the
Nut language. The instructions follow from the Nut language, plus some
extra instructions needed to implement the precise operational semantics
of the language. The instruction set is divided into four groups:
control, value, arithmetic and system. Each instruction has the form of
an atom with a 7-bit opcode and a 24-bit argument.
opcode encoding ("--" marks an unused encoding)
xIF    1    xWHILE 2    xDO    3    --          xNEW   5
xADD   6    xSUB   7    xMUL   8    xDIV   9    xEQ    10
xLT    11   xGT    12   xCALL  13   xGET   14   xPUT   15
xLIT   16   xLDX   17   xSTX   18   xFUN   19   xSYS   20
--          --          --          --          xLD    25
xST    26   xLDY   27   xSTY   28   --          --
--          xSTR   32   xBAND  33   xSHR   34   xSHL   35
In total there are 27 instructions in the N-code instruction set. Only
value instructions have arguments, denoted by “op.arg”. “fun” has
special arguments (to be explained later). “call” takes as its argument
a pointer to the body of the function (its N-code).
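As a rough illustration of the atom format, here is a minimal Python sketch that packs an opcode and an argument into one integer. The opcode numbers are copied from the table above. The bit positions (opcode directly above a 24-bit argument), the unsigned argument, and the names OPCODES, make_atom and split_atom are my assumptions for this sketch; they are not taken from the Nut implementation.

```python
# Encodings copied from the opcode table; gaps in the numbering are unused.
OPCODES = {
    "if": 1, "while": 2, "do": 3, "new": 5,
    "add": 6, "sub": 7, "mul": 8, "div": 9, "eq": 10,
    "lt": 11, "gt": 12, "call": 13, "get": 14, "put": 15,
    "lit": 16, "ldx": 17, "stx": 18, "fun": 19, "sys": 20,
    "ld": 25, "st": 26, "ldy": 27, "sty": 28,
    "str": 32, "band": 33, "shr": 34, "shl": 35,
}

ARG_BITS = 24                      # width of the argument field
ARG_MASK = (1 << ARG_BITS) - 1
OP_MASK = (1 << 7) - 1             # 7-bit opcode

def make_atom(op, arg=0):
    """Pack an opcode name and an unsigned argument into one integer.

    Assumed layout: opcode in the bits just above the 24-bit argument.
    """
    if not 0 <= arg <= ARG_MASK:
        raise ValueError("argument does not fit in 24 bits")
    return (OPCODES[op] << ARG_BITS) | arg

def split_atom(atom):
    """Return (opcode number, argument) from a packed atom."""
    return (atom >> ARG_BITS) & OP_MASK, atom & ARG_MASK
```

For example, make_atom("lit", 10) builds the atom for lit.10, and split_atom recovers the pair (16, 10) from it.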
-----------------------
Note: N-code has no instructions for != <= >= and or not.
To use these operators, they must be defined in Nut:
(def != (a b) () (if (= a b) 0 1))
(def >= (a b) () (if (< a b) 0 1))
(def <= (a b) () (if (> a b) 0 1))
(def and (a b) () (if a b 0))
(def or (a b) () (if a 1 b))
(def not a () (if a 0 1))
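To check that these definitions behave as expected, the same logic can be mirrored in Python. This is only a sketch: it assumes that Nut treats 0 as false and any other value as true, and the Python function names (eq, lt, gt, ne, ge, le, and_, or_, not_) are mine.

```python
# The three comparisons that N-code provides directly, returning 1 or 0.
def eq(a, b): return 1 if a == b else 0
def lt(a, b): return 1 if a < b else 0
def gt(a, b): return 1 if a > b else 0

# Derived operators, each mirroring one Nut definition above.
def ne(a, b): return 0 if eq(a, b) else 1    # (if (= a b) 0 1)
def ge(a, b): return 0 if lt(a, b) else 1    # (if (< a b) 0 1)
def le(a, b): return 0 if gt(a, b) else 1    # (if (> a b) 0 1)
def and_(a, b): return b if a else 0         # (if a b 0)
def or_(a, b): return 1 if a else b          # (if a 1 b)
def not_(a): return 0 if a else 1            # (if a 0 1)
```

Note that, as written, "and" returns its second argument rather than a normalised 1, and "or" returns its second argument when the first is false.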
xFUN has two values encoded into its argument field, written fun.a.s,
where a is the arity of the function and s is the size of its activation
record, counted as the number of slots it holds: the function's
arguments plus its local variables.
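The computation of a and s can be sketched in Python as follows. The function name frame_layout is hypothetical, and numbering the arguments before the local variables is my assumption (the examples in these notes never mix the two).

```python
def frame_layout(args, local_vars):
    """Return the fun.a.s header and a name -> slot map for one function.

    a is the number of arguments; s counts arguments plus local variables.
    Slots are numbered from 1, arguments first (an assumption of this sketch).
    """
    names = list(args) + list(local_vars)
    slots = {name: i for i, name in enumerate(names, start=1)}
    return f"fun.{len(args)}.{len(names)}", slots
```

For (def sq x () ...) this gives fun.1.1, and for (def assign () (a b) ...) it gives fun.0.2 with a in slot 1 and b in slot 2.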
Here are some examples of parsing Nut programs into N-code.
1) Simple function definition. Source code in file "sq.txt":
(def sq x () (* x x))
e:>nut32 < sq.txt
sq
(fun.1.1 (* get.1 get.1 ))
2) Assignment statement. Source code in file "assign.txt":
(def assign () (a b) (set a (+ b 1)))
e:>nut32 < assign.txt
assign
(fun.0.2 (put.1 (+ get.2 lit.1 )))
3) Control statement. Source code in file "control.txt":
(def parseControl () (i j k)
  (do
    (while (< i 10)
      (set i (+ i 1)))
    (if (= j 2)
      (set k 20)
      ; else
      (set k 10))))
e:>nut32 < control.txt
parseControl
(fun.0.3 (do
  (while (< get.1 lit.10 )(put.1 (+ get.1 lit.1 )))
  (if (= get.2 lit.2)(put.3 lit.20 )(put.3 lit.10 ))))
4) Global variable declaration. Source code in file "global.txt":
(let arrayA g)
(def simple () (a b)
  (do
    (set g 1000)              ; global
    (set a g)
    (set b 11)
    (set arrayA (new 10))
    (setv arrayA 1 20)        ; arrayA[1] = 20
    (set b (vec arrayA 1))))  ; b = arrayA[1]
e:>nut32 < global.txt
arrayA
g
simple
(fun.0.2 (do
  (st.1 lit.1000 )
  (put.1 ld.1 )
  (put.2 lit.11 )
  (st.0 (new lit.10 ))
  (sty.0 lit.1 lit.20 )
  (put.2 (ldy.0 lit.1 ))))
Note: the compiler replaces local variable names with the numbers 1..n
and global variable names with static addresses 0..m. The actual
addresses will be determined when the real machine code is generated.
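The renaming described in this note can be sketched as a very small translator in Python. This is an illustration only, not the nut32 parser: it handles just the forms that appear in the four examples, it handles vec and setv only on global vectors, and the function names (tokenize, read, gen, compile_program) are mine. Whitespace in its output may differ from what nut32 prints.

```python
import re

# Heads that are copied through unchanged, as in the listings above.
PLAIN_HEADS = {"+", "-", "*", "/", "=", "<", ">", "do", "while", "if", "new"}

def tokenize(src):
    src = re.sub(r";[^\n]*", "", src)            # strip ; comments
    return re.findall(r"[()]|[^\s()]+", src)

def read(tokens):
    """Read one s-expression into nested Python lists of strings."""
    tok = tokens.pop(0)
    if tok != "(":
        return tok
    items = []
    while tokens[0] != ")":
        items.append(read(tokens))
    tokens.pop(0)                                 # drop the closing ")"
    return items

def as_list(x):
    # A single parameter may be written without parentheses: (def sq x () ...)
    return x if isinstance(x, list) else [x]

def gen(e, slots, addrs):
    """Translate one expression to N-code text."""
    if isinstance(e, str):
        if e.lstrip("-").isdigit():
            return f"lit.{e}"                     # integer literal
        if e in slots:
            return f"get.{slots[e]}"              # local: slot number 1..n
        return f"ld.{addrs[e]}"                   # global: static address 0..m
    head, rest = e[0], e[1:]
    if head == "set":
        name, value = rest
        op = f"put.{slots[name]}" if name in slots else f"st.{addrs[name]}"
        return f"({op} {gen(value, slots, addrs)})"
    if head in ("setv", "vec"):
        # Only global vectors, the case shown in example 4, are handled.
        op = "sty" if head == "setv" else "ldy"
        operands = " ".join(gen(x, slots, addrs) for x in rest[1:])
        return f"({op}.{addrs[rest[0]]} {operands})"
    if head in PLAIN_HEADS:
        operands = " ".join(gen(x, slots, addrs) for x in rest)
        return f"({head} {operands})"
    raise ValueError(f"form not handled in this sketch: {head}")

def compile_program(src):
    """Return {function name: N-code text} for a whole source text."""
    tokens, addrs, out = tokenize(src), {}, {}
    while tokens:
        form = read(tokens)
        if form[0] == "let":                      # globals get addresses 0..m
            for name in form[1:]:
                addrs[name] = len(addrs)
        elif form[0] == "def":
            _, name, args, local_vars, body = form
            args, local_vars = as_list(args), as_list(local_vars)
            names = args + local_vars             # locals get slots 1..n
            slots = {n: i for i, n in enumerate(names, start=1)}
            code = gen(body, slots, addrs)
            out[name] = f"(fun.{len(args)}.{len(names)} {code})"
    return out
```

Calling compile_program on the text of "sq.txt" returns a dictionary whose "sq" entry is the fun.1.1 listing of example 1, up to whitespace.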
last update 23 June 2010