Design of t2-code

The first t-code is v17 used in som v 1.7.  v 1.8 is almost the same.

op:8 a1:8 a2:8 a3:8           a1 = a2 op a3
op:8 n:24
op:8 a1:8 d:16

bop:  add sub mul div mod and or xor eq ne lt le gt ge shl shr (16)
bopi: addi subi muli divi modi andi ori xori eqi nei lti lei gti gei shli shri (16)
other: not sys end
data: ldx stx mov movi movd pass passi lit ads
control: jmp jt jf call callt callf fun ret efor case

v17  total 54 instructions
v18  eliminate: movd

This instruction set makes distinction between local/global/immediate so it has two different sets of op: bop and bopi.  T2-code uses 32-bit address and does not make distinction between local/global/immediate (it does not have to because 32-bit is enough for all types and addressing is uniform).  It also combines jump and logical (conditional jump).

Instruction format

To have 3 big addresses, this format is necessary:  24:8, 32, 32 (d:op,a,b). It is not small but not the largest.  The decoding is not excessive.

bop:      add sub mul div mod and or xor shl shr eq ne lt le gt ge
control:  jmp jt jf jeq jne jlt jle jgt jge fun call callt ret efor case
data:     ldx stx mov push
etc:      not sys
        (total 37)

bop d a b     ==   M[d] = M[a] bop M[b]
jmp d         ==   goto d
jt d a        ==   if a != FALSE goto d
jf d a        ==   if a == FALSE goto d
jxx d a b     ==   if (a op b) != FALSE goto d
ldx d a b     ==   M[d] = M[ M[a] + M[b] ]
stx d a b     ==   M[ M[a] + M[b] ] = M[d]
call d a ads  ==
                  get arity and numlocal (framesize)
                  1. move parameters to temp (param: d,a and from stack)
                  2. save locals (numlocal), pc,fp  (at stack)
                  3. update fp, sp
                  4. move temp to locals (arity)
ret d fs      ==
                  1. move return value d to M[retval]
                  2. restore locals , pc,sp,fp
callt d a ads ==
                  1. move parameters to temp
                  2. move temp to locals
efor d a b    ==  M[a]++; if M[a] <= M[b] goto d
case d lo hi  ==  
                  jump table is set of displacements
                  size of table is hi-lo+1
                  if lo <= d <= hi goto entry[lo-d+1]
                     else          goto end of table
                  where current pc is at the "case"
not d a       ==  M[d] = not M[a]
sys d a b     ==  syscall d, optional param: a,b
push d        ==  sp++; M[sp] = M[d]
mov d a       ==  M[d] = M[a]  equiv. to  add d a #0
fun d a       ==  it stores arity (d), and numlocal (a)
                  can be encoded into 24 bits
optional

set&jmp d a b ==  M[d] = M[a]; goto b.  use when enter for-loop

Opcode encoding

0..9   nop add sub mul div mod and or xor eq
10..19 ne lt le gt ge shl shr not mov ldx
20..29 stx ud push call callt fun ret efor case jmp
30..38 jt jf jeq jne jlt jle jgt jge sys

Activation record in vm v5

hi

         <- sp
 .. param
--------
 fp'     <- fp
 retads
  lv     v_n
  ..
  pv     v_1
--------
         <- sp'
lo

The current frame stores: fp', retads. There is no need to store sp as it tracks fp when ret.  The pv+lv is saved registers (nlocal). Now that a local is in an absolute place (M[0]...M[256]), no renaming of registers during compilation is necessary.

Parameter passing

Two parameters can be passed via "call" instruction. The extras are pushed to stack (via SS[sp]). Inside v5-vm, an array param[.] is used to collect actual parameters to be instantiated to registers.

Constants as globals

To reduce the number of distinc instruction, the "immediate" mode can be eliminated using constants as globals.  It means all constants will be stored in globals.  For small constant, -10..300, it can be stored  permanently (immutable) in M[a]..M[b] so that the code generator can be simplified. There is a direct relation between the value and its address. For larger constants, they will be allocated and stored as globals using the symbol table (like enum).

M[a]..M[b] that are initialised by compiler are used to store small constants m..n.  For large constants (out of range m..n) allocate M[c] which can be retrieved by search symbol table.  This way, the access to symbol table will not be overwhelm as I expect 90% of constants will be small and not require symbol table to retrieve them during the compilation.

What should be m and n?  -1 is used often, 256 seems to be the largest small number (as least in our benchmark).  So the range -10..300 should cover most small numbers without being too many.  300 is used to signify a constant so a > 300.  Let a = 390, b will be 700.
  -10..300  is  M[390]..M[700]  or
  c is represented by M[400+c]

For large constants, DS is used to store them so they start at 1000 (the same as all other globals allocated by the compiler).

Comments

The call instruction uses 'b' as address to have 32 bits (other control use 'd').  Some instructions are redundant: gt/ge (reverse arg of le/lt) but it makes code generation a bit more complicate.  Some instruction are equivalent:  
  mov == add 0
 jmp == jf 0
However making them distinct will make analysis easier. set&jmp is used when entering for-loop (set initial index value and goto efor).  It is also useful in other situation such as while-loop.

To simplify code generation of "case" we put "jmp .." there, because the "else" goes to the end of jump table. The jump table makes "case" instruction a strange object of variable length in the code segment.  If "jmp" is used instead the size will be quite large (3 words per entry).

5 December 2009
Long live the King