n-code to s-code generator
The s-code virtual
machine
S-code is a stack-based instruction set. It is based on
the intermediate language for Som language. The virtual machine
used in Som v 2.0. This version contains only the s-code
VM. The Som compiler itself is in the form of s-code (som.obj)
which we do not use.
Explain s-code (from
som/report/s-code.htm)
Mapping from n-code to
s-code
xADD
e1 e2 e1 e2 icAdd
xLIT.a
icLit.a
xGET.a
icGet.a
xPUT.a
icPut.a
xLD.a
icLd.a
xST.a e
e icSt.a
xLDX.a e
e icGet.a icLdx
xSTX.a e v
e v icGet.a icStx
xLDY.a e
e icLd.a icLdx
xSTY.a e v
e v icLd.a icStx
xFUN.a.v e
icFun.d e icRet.g d =
v-a+1, g = v+1
xCALL.a
icCall.a
xIF e1 e2 e3
e1
icJf <F>
e2
icJmp <E>
F: e3
E:
xWHILE e1 e2
L: e1
icJf <E>
e2
icJmp <L>
E:
or better
icJmp <I>
L: e2
I: e1
icJt <L>
xDO e1 e2 ...
e1
e2
format of s-code
(som-v2) object file
magic
start end (end
inclusive)
code*
(code segment)
start end
data*
(data segment)
for som-v2 som-in-som magic = 5678920
The nut-som code
generator: n-code to s-code generator (gen.txt)
Input of the code generator is an n-code object. Output is the
s-code object.
Let give one example
the source
(def
add1 x () (+ x 1))
(def main () ()
(sys 1 (add1 22)))
is compiled into n-code object
add1
(fun.1.1 (+ get.1 lit.1 ))
main
(fun.0.0 (sys.1 (call.80 lit.22
)))
22 22
2 1 16 1 0
4 1 14 1 2
6 1 6 0 4
8 0 0 6 0
10 1 19 257 8
12 1 16 22 0
14 1 13 10 12
16 0 0 14 0
18 1 20 1 16
20 0 0 18 0
22 1 19 0 20
0
The s-code generator takes this n-code object and outputs s-code object.
5678920
1 12
2080 23 294 280 287 1 532 294
5663 800 292 276
1000 999
Which means the following:
1
Call 8
2
End
3
Fun 1
4
Get 1
5
Lit 1
6
Add
7
Ret 2
8
Fun 1
9
Lit 22
10 Call 3
11 Sys 1
12 Ret 1
Example session
The above example comes from the following sequence of commands:
compile the nut-compiler:
c:>nut31
< nut.txt
and use nut-compiler to compile the example, "t3.txt", the output goes
to "t3.obj":
c:>nsim31
< t3.txt > t3.obj
edit t3.obj to get rid of the listing at the begining. Now
compile the s-code generator, "gen.txt":
c:>nut31
< gen.txt
and use it to generate the final s-code object:
c:>nsim31
< t3.obj > t3s.obj
We can use the s-code virtual machine, somv2.exe to run it and to
generate a "readable" s-code.
to generate a readable s-code from a s-code object:
c:>somv2
-l < t3s.obj
to run it:
c:>somv2
-x < t3s.obj
It is the "pure" s-code virtual machine. This VM is exactly as
appear on the Som website. I did some editing of the C source so
that somv2 can output the s-code listing. If you want to see the
s-code virtual machine, you can find it <here>.
Because of the text editing, this version is 100 lines shorter than the
original version posted, however the meaning is exactly the same.
som-v2.zip (the edited
version)
How the code
generator work?
It uses "eval". In other words, the generator reads the input
n-code object and traverses the n-code. Instead of "executing"
it, the generator outputs the corresponding s-code. The mapping
between n-code and s-code is simple. Most of the code is
one-to-one mapping. However, the "address" is different.
This is handled using the "associative" list of n-code address to
s-code address.
eval
; s-code generator
...
DO
while e1
eval
head e1
e1 =
tail e1
ADD
eval head e1
eval arg2 e1
out icAdd
LIT
out icLit arg
GET
out icGet arg
FUN
insertLable ads
XP ; ads is n-code address
; XP is s-code address
lv = arg & 255
arity = arg >>
8 ; decode a.v
out icFun
(lv-arity+1)
eval head e1
out icRet (lv+1)
CALL
while
e1
; generate all arguments
eval head e1
e1
= tail e1
out icCall (assoc
arg) ; map address to s-code
The associative list has two operations:
insertLable
n s
insert (n-ads, s-ads)
assoc
n
output s-ads associated
with n-ads, 0 if no match
For the control flow, n-code has the next link but s-code is a linear
code so it requires the "jump" instruction and explicit addresses.
IF e = (cond true false)
eval head
e ; gen cond
out icJf
0 ;
<1>
ads = XP -
1 ; mark s-code
ads
eval arg2
e ; gen true
if (arg3 e) =
NIL
patch ads (XP-ads) ; patch jf at <1>
else
out icJmp 0 ; <2>
patch ads (XP-ads)
ads = XP - 1 ; mark s-code ads
eval arg3 e ; gen false
patch ads (XP-ads) ; path jmp at <2>
As all jumps in s-code are relative addressing, the only instruction
that need to relocate its argument is "call". The "xSTR" is also
relative to data segment hence needs not to be relocated.
29 June 2006