U-code improved

This is a gentle refinement of u-code in som v.4.0. A few more immediate mode instructions are added. Two load index instructions with argument on AC are added. A bit of change of "case" format to one argument. Read the original u-code specification here. The improved u-code has 50 instructions.

The instruction set

bop    : add sub mul div band bor bxor mod
          eq ne lt le gt ge shl shr
bim    : addi subi bandi bori eqi lti lei shli shri
data   : ld st get put lit
vector : ldx/ stx/ ldy/ sty/ ldxa ldya
control: jmp jt jf jle/ case call callt ret
extra : fun/ sys inc dec not push

Total 50 instructions. The additional instructions are : shl shr bandi bori eqi lti lei ldxa ldya. The suffix / indicates the instruction with two arguments.

The ldxa ldya are added because it can shorten "putting" the result into a temporary register (called cascading). Not all the immediate mode is added. That will make the instruction set too fat. The logical one: nei, gti, gei can be emulate by the inverse. The others are rarely used: muli divi xori modi. shl and shr are added to make the instruction set complete. They have been omitted from the original u-code. ads is retired. It is used only to relocate the data segment. pusha is removed, it can be replaced by push 0 where 0 signifies that AC is the argument.

Format

one-address op:32 arg:32
two-address op:32 a:24,b:8

Semantic

v is M[fp+v]

bop v      :: AC = AC op v
bim a    :: AC = AC op a

data

lit a      :: AC = a
ld a       :: AC = M[a]
st a :: M[a] = AC
get v    :: AC = v
put v    :: v = AC

vector

ldy a,v    :: AC = M[M[a]+v]
sty a,v    :: M[M[a]+v] = AC
ldx v1,v2 :: AC = M[v1+v2]
stx v1,v2 :: M[v1+v2] = AC
ldxa v :: AC = M[v+AC]
ldya a     :: AC = M[M[a]+AC]

control

jt a     :: if AC != 0 pc = a
jf a     :: if AC == 0 pc = a
jle a,v    :: if AC <= v pc = a
call a     :: if arity(a) > 0 pusha passing the last parameter if any
               create a new activation record
ret size   :: delete current activation record
case lo; lit hi :: if lo <= AC <= hi skip 4+2*(AC-lo)

extra

inc v      :: v++, AC = v
dec v      :: v--, AC = v
push v     :: sp++, if(v == 0) M[sp] = AC else M[sp] = v
fun arity,size    a place holder for arity,size

usage

for loop

...
inc i
jle v loop

case

<index in AC>
case lo
lit hi
jmp else
jmp case1
...
jmp casen

u-code encoding

1 add sub mul div band bor bxor mod eq
10 ne lt le gt ge shl shr addi subi bandi
20 bori eqi lti lei shli shri inc dec lit sys
30 not push get put ld st ldxa ldya call ret
40 callt jmp jt jf case ldx/ stx/ ldy/ sty/ jle/
50 fun/

Comment

Upon analysing v4 vs v31 compiler (see som-v4-bm/doc/v4-vs-v31-compile.txt) many possible improvements to u-code come to mind. The first is the immediate mode. The second is to use "cascade" AC to reduce "put" (using AC as argument in some instruction). However, adding everything will make u-code unattractively large. The aim to include more instruction into u-code is to improve the performance without undue increase in complexity to the instruction set.

For "case", the previous form is correct in sense of being one instruction without any continuation so its semantic is clear. VM does not behave differently from executing other instruction. VM does not fetch any argument from the next instruction. The proposed "case":

case lo, lit hi, jump ...

is faster but its semantic is not consistent with the rest of the instruction set. It fetches the next instruction to be used in deciding the transfer of control. So, the question is, what do we prefer, clean semantic or performance?

last update 24 June 2010