Build your own Nut compiler

The goal is to compile the nut-compiler (nut.txt) using the nut32-compiler (my compiler, written much like nut.txt but in C so that it can be executed on a PC without any other virtual machine).

What do we expect?

    source -> compiler -> output

The source input is "nut.txt".
The output (shown on the screen) is an object file, which should be executable by the nut-virtual machine.  The object file includes two sections: the first is the machine code, the second is the symbol table.

Let's try it!  Compile the nut-compiler:

> nut32 < nut.txt

You can see the following output on your console.

-------------------
!=
(fun.2.2 (if (= get.1 get.2 )lit.0 lit.1 ))
>=
(fun.2.2 (if (< get.1 get.2 )lit.0 lit.1 ))
and
(fun.2.2 (if get.1 get.2 lit.0 ))
or
(fun.2.2 (if get.1 lit.1 get.2 ))
not
(fun.1.1 (if get.1 lit.0 lit.1 ))
nop
...

main
(fun.0.1 (do (call.87 )(call.134 )(st.5 (sys.9 ))(sys.11 )(call.157 )(call.167 )(put.1 (sys.9 ))(call.169 ld.5 get.1 )))
-------------------

Compare it to the source (nut.txt) and see if you recognise any correspondence between the source code and the output.  Remember that the output shown here is the intermediate data structure, not the final machine code.  This so-called "intermediate code" is subsequently processed by a code generator to produce the output machine code (which can be executed by the nut-virtual machine).
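
To make the comparison easier, the listing can be read into a nested structure.  The following is only a minimal sketch in Python (not part of the nut tools; the names tokenize and parse are my own for illustration).  It assumes nothing about the listing beyond what is visible above: parentheses group things, and everything else is an atom such as "get.1" or "lit.0".

```python
import re

def tokenize(text):
    # Parentheses are their own tokens; any other run of
    # non-space, non-parenthesis characters is one atom.
    return re.findall(r"[()]|[^\s()]+", text)

def parse(tokens):
    """Turn a flat token list (one expression) into nested Python lists."""
    tok = tokens.pop(0)
    if tok != "(":
        return tok              # a plain atom, e.g. "get.1"
    node = []
    while tokens[0] != ")":
        node.append(parse(tokens))
    tokens.pop(0)               # drop the closing ")"
    return node

# The line printed for "!=" in the listing above.
line = "(fun.2.2 (if (= get.1 get.2 )lit.0 lit.1 ))"
tree = parse(tokenize(line))
print(tree)
```

Seen this way, the body of "!=" is an "if" wrapped around an "=" test of get.1 and get.2, followed by lit.0 and lit.1.  Reading that as "0 when equal, otherwise 1" is my interpretation of the listing; check it against the definition in nut.txt.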

You can redirect the output from the console into a file to inspect it.

> nut32 < nut.txt > nut-out.txt

and look into the "nut-out.txt" file.

Now try to run the output object file.  What is it?  It is a nut-compiler!  But instead of running it directly on a PC, we have to use a nut-virtual machine to run it, because this machine code is not in the Intel instruction set.  Let's call this compiler the new-nut-compiler.

The object file is named "a.obj" by default.  Use this compiler to compile a simple program (from our previous exercise, "test-tok.txt").

Compile test-tok.txt using new-nut-compiler.

> nsim32 a.obj < test-tok.txt

The new-nut-compiler does not produce any output file.  All its output goes to the console (the screen).  What is it outputting?  An object file, of course, for the source input.  Let's see:

--------------
176 176
2 1 14 1 0
4 1 14 2 2
6 1 10 0 4
8 1 16 1 0
10 1 16 0 8
12 0 0 6 10
14 1 1 0 12
16 0 0 14 0
18 1 19 514 16
20 1 16 0 0
22 0 0 20 0
24 1 19 0 22
...
tok 8 0 0 0
tokenise 3 118 0 0
testtok 3 162 0 0
main 3 176 0 0
-------------------
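
If you redirect this listing to a file, the two sections can be pulled apart for inspection.  Below is a rough sketch in Python (again, not part of the nut tools; split_object_listing is a hypothetical helper).  It assumes only what the listing shows: a first line of numbers, then rows made entirely of integers (taken here as the machine code section), then rows that begin with a name (taken as the symbol table).  The meaning of the individual columns is not interpreted.

```python
def split_object_listing(text):
    """Split a console listing into (header, code rows, symbol rows)."""
    header, code, symbols = None, [], []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0] == "...":
            continue                      # skip blanks and the elision marker
        if fields[0].lstrip("-").isdigit():
            numbers = [int(f) for f in fields]
            if header is None:
                header = numbers          # first numeric line, kept apart
            else:
                code.append(numbers)
        else:
            # A row starting with a name: assumed to be a symbol entry.
            symbols.append((fields[0], [int(f) for f in fields[1:]]))
    return header, code, symbols

# A few rows copied from the listing above.
sample = """176 176
2 1 14 1 0
4 1 14 2 2
...
tok 8 0 0 0
main 3 176 0 0"""

header, code, symbols = split_object_listing(sample)
print(header)
print(code)
print(symbols)
```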

That's it!  Now you have completed the build-your-own-compiler exercise.  Have fun.

A final thought: do you think that this object file (the output on the console screen) is executable by the nut-virtual machine?  I challenge you to try to run it.  Remember what test-tok.txt does: it scans an input file (nut source) and outputs the tokens (lexemes) one by one.

(I have not tried this last trick, so you are on your own.  Be brave and experiment.)

Prabhas Chongstitvatana
23 June 2009