U-code
This is for som 4.0 virtual machine. The aim
- compact instruction set
- fast
- clean semantic
Toward these goals, the instruction set is:
- less than 50 instructions
- one-addressing with a few two-addressing
- fully decoded format
- accumulator based
The instruction set
bop
:
add
sub mul div band bor bxor mod
eq
ne
lt le gt ge
bim :
addi subi shli shri
data : ld st
get put lit ads
vector : ldx/ stx/ ldy/
sty/
control: jmp jt jf jle/
case/ call callt ret
extra : fun/ sys inc
dec not push pusha
total 43 instructions
(but fun is not really an instruction. It is a marker) The
two-address instructions are marked with /. To stop the
execution, use "sys 13"
instead of "end".
instruction encoding
1
add sub mul div band bor bxor mod eq
10 ne lt le gt ge addi subi shli
shri inc
20 dec lit ads sys not push pusha
get put ld
30 st call ret callt jmp jt jf
jle case ldx
40 stx ldy sty fun
Format
one-address
op:32 arg:32
two-address
op:32 a:24,b:8
Semantic
v
is M[fp+v] Note the positive order of local
variables
bop
v
:: AC = AC op v
bim a
:: AC = AC op a
data
lit
a :: AC = a
ads a :: AC =
a use to mark a as an address
(pointer)
ld a :: AC = M[a]
st a :: M[a] = AC
get v :: AC = v
put v :: v = AC
vector
ldy a,v :: AC
=
M[M[a]+v]
sty a,v ::
M[M[a]+v]
= AC
ldx v1,v2 :: AC =
M[v1+v2]
stx v1,v2 :: M[v1+v2]
= AC
control
jt a
:: if AC != 0 pc = a
jf a
:: if AC == 0 pc = a
jle a,v :: if
AC
<= v pc = a
call a :: if
arity(a) > 0 pusha passing the last parameter if any
create a new activation record
ret size ::
delete current activation record
case lo,v :: if AC
>= v >= lo skip 2+2*(v-lo)
extra
inc
v :: v++, AC = v
dec
v :: v--, AC = v
push v
:: sp++, M[sp] = v
pusha
::
sp++,
M[sp] = AC
fun arity,size a
place holder for arity,size
usage
for loop
...
inc i
jle loop,v
case
lit
hi
case lo,v
jmp else
jmp case1
...
jmp casen
comment
Two-addressing alleviates the need for Base register in vector
instructions. "jle"
is the variation of "efor".
It is
simpler but need a modified "inc v"
to work. It does not required to
"decrement" the initial index and there is no "hidden" adjacent
variable. "case" is a long
process of refinement. I think I got a
good compromise here, using AC for "hi" value and keep "jmp"
instruction in the jump table. This design trades off the size of
the code for simplicity in semantic.
Activation record
U-code stack frame has "positive" order of local variables (v is
M[fp+v] ) unlike
S-code. The "usual" (s-code style) structure of
activation record is as follows:
hi
... <-
sp
retads
fp' <-
fp
v1
...
vn
lo
v is M[fp-v]. This
backward order required renaming of local
variables. The rename process scans the code after the code body
is completely generated as there may be some additional local variable
allocated during the code generation. If the order is forward then
renaming is not necessary.
hi
... <-
sp
retads
fp'
vn
...
v1
<-
fp
lo
Note that the order of local variables is from 1 to n (instead of the
reverse n to 1 in S-code). To know where fp' and retads are, the
size of the activation record
must be known. It is record as the argument of the "ret"
instruction. To create a new activation record, two arguments are
used: arity,
and the number of local variables. They are recorded as two arguments
of the "fun" instruction.
Code generation for
an accumulator machine
Without a stack, a temporary register must be allocated to store the
intermediate result. For example
a = b + c - d
Going from left to right, b + c must be stored in t, then t - d and put
a. This is done in genbop.
Parameter passing
Passing parameters to a function requires special treatment. The space
in SS[] (where it stores the activation records), is used as a "virtual
stack" to pass parameters to a new frame. The new instructions
are created for this task: push, pusha. With pushing parameters
to a virtual stack, a tail-call instruction (callt) is revived as it is
appropriate. It is far simpler than trying to generate codes to
pass parameters back to the old frame and do a jump. Callt is
faster too (it is another "big" instruction according to our philosophy
of trying to create big instruction, a "big" instruction does more in
one instruction). The last parameter is passed through AC,
occassionally
saving one push instruction.
last update 24 June 2010