som v5  t-code design

The aim is to release "the fastest vm" (tx-code style, v19 series) to public. In particular, this release will reconsider t-code and put effort to write a fairly clean code.  I think because of 3-address t-code style, the noi will be smallest amongst all previous som releases.  Because running time is directly varied with noi, it will also be "the fastest som" too.

The date of release will be for this coming King's birthday, 5 December 2552 (2009).  I will emphasis on the correct code with ample measurement on efficiency (plenty of benchmark and profile running time etc.).  This will be the basis of Som in 2010 (including future planned changes to the language).

T-code design

I have two main designs: 

1) one with no restriction on addressing mode, so no distinction between local/global/immediate.  The instruction format is large (3 big addresses, upto 128 bits).  The call instruction needs to save/restore locals.  It also will have the smallest noi.

2) the other makes distinction between local and global/immediate.  It has several modes: r = r op m/i, r = r op r etc.  Because locals are distinct, a stack frame can be used to create new set of locals.  It will not need to save/restore locals. Global and immediate can be the same.  The instruction size will be 64 bits.

There are many variations between these two designs (see som v5 t-code design in hand-writing, 6 oct 2009).

Design 1 is simpler and easier to make a correct code (code generator is simple).  The size of code segment is not the main concern. The "call" will be a bit slower (than alternative design).  I will leave more complex optimisation (design 2) for a future release.

Instruction format

to have 3 big addresses, this format is necessary:  24:8, 32, 32  (d:op,a,b)
It is not small but not the largest.  The "decoding" is not excessive.

(the instruction set is improved from v19v)

bop:      add sub mul div mod and or xor shl shr eq ne lt le gt ge
control:  jmp jt jf jeq jne jlt jle jgt jge fun call callt ret efor case
data:     ldx stx mov push push2
etc:      not sys
(total 38)

bop d a b     ==   M[d] = M[a] bop M[b]
jmp d         ==   goto d
jt d a        ==   if a != FALSE goto d
jf d a        ==   if a == FALSE goto d
jxx d a b     ==   if a op b goto d
ldx d a b     ==   M[d] = M[ M[a] + M[b] ]
stx d a b     ==   M[ M[a] + M[b] ] = M[d]
call d a ads  ==
                  get arity and numlocal (framesize)
                  1. move parameters to temp (param: d,a and from stack)
                  2. save locals (numlocal), pc,sp,fp  (at stack)
                  3. update fp, sp
                  4. move temp to locals (arity)
ret d fs      ==
                  1. move return value d to M[retval]
                  2. restore locals , pc,sp,fp
callt d a ads ==
                  1. move parameters to temp
                  2. move temp to locals
efor d a b    ==  M[a]++; if M[a] <= M[b] goto d
case d lo hi  ==  
                  jump table is set of displacements
                  size of table is hi-lo+1
                  if lo <= d <= hi goto entry[lo-d+1]
                     else          goto hi-lo+1+1
                  where current pc is at the "case"
not d a       ==  M[d] = not M[a]
sys d a b     ==  syscall d, optional param: a,b
push d        ==  sp++; M[sp] = M[d]
push2 d a     ==  sp++; M[sp] = M[d]; sp++; M[sp] = M[a]

others

mov d a       ==  M[d] = M[a]  equiv. to  add d a #0
set&jmp d a b ==  M[d] = M[a]; goto b.  use when enter for-loop
fun d a       ==  it stores arity (d), and numlocal (a)
                  can be encoded into 24 bits 

comments

call uses 'b' as address to get 32 bits (other control use 'd').  Some instructions are redundant: gt/ge (reverse arg of le/lt) but it makes code generation a bit more complicate.  Some inst. are equiv.  
  mov == add 0 
  jmp == jf 0
but distinct opcode will make analysis easier. set&jmp is used when entering for-loop (set initial index value and goto efor).  It is also useful in other situation such as while-loop.

To simplify code generation of "case" because the "else" goes to the end of jump table, putting "jmp .." there. The jump table makes "case" inst. a strange object of variable length.  If "jmp" is used instead the size will be quite large (3 words per entry).

size of instructions  (32-bit word)

bop  3     call  3    push   1
jmp  1     ret   2    push2  2
jt   2     efor  3    mov    2
jf   2     setj  3    fun    1
jxx  3     case  3      
ldx  3     not   2
stx  3     sys   3    

To save space in code segment, the size of instruction can be variable (without slowing down the execution).

A lot of discussion on instruction decoding and constants as globals can be found in "som\som-v19\doc\som-v19v-explain.txt".

6 Oct 2009

PS  som-v19v  has the following instructions (gleen from interp.c)
  ldx stx jmp jt jf jeq jne jlt jle jgt jge
  call callt ret call1 callt1
  add sub mul div and or xor shl shr eq ne lt le gt ge
  not mov ads pass efor case sys end

end

activation record in vm v5

hi

         <- sp
 .. param
--------
 fp'     <- fp
 retads
  lv     v_n
  ..
  pv     v_1
--------
         <- sp'
lo

current frame stores: fp', retads
no need to store sp as it tracks fp when ret
pv+lv  is saved registers (nlocal)

parameter passing

two parameters can be passed via "call" instruction. The extra are pushed to stack (via SS[sp]).

inside v5-vm, an array param[.] is used to collect actual parameters to be instantiated to registers.

step of call/ret

call

1.  pass parameters to param[.]
    1.1  the first two are collected from "call"
    1.2  the rest are collected from stack
2.  save registers (M[1]..M[nlocal]) to next 
    activation record.
3.  instantiate registers from param[.]
4.  update sp, fp

ret

1.  return value via M[retval]
2.  restore registers from the current act. rec.
3.  restore sp, fp

push2 is not implemented as there are already two parameters passing via "call". push2 will be rarely used.

15 Oct 2009


