Design of t2-code
The first t-code is v17 used in som v 1.7. v 1.8 is almost the
same.
op:8
a1:8 a2:8
a3:8 a1 =
a2 op a3
op:8 n:24
op:8 a1:8 d:16
bop: add sub mul div mod
and or xor eq ne lt le gt ge shl shr (16)
bopi: addi subi muli divi modi
andi ori xori eqi nei lti lei gti gei shli shri (16)
other: not sys end
data: ldx stx mov movi movd pass
passi lit ads
control: jmp jt jf call callt
callf fun ret efor case
v17 total 54 instructions
v18 eliminate: movd
This instruction set makes distinction between local/global/immediate
so it has two different sets of op: bop and bopi. T2-code uses
32-bit address and does not make distinction between
local/global/immediate (it does not have to because 32-bit is enough
for all types and addressing is uniform). It also combines jump
and logical (conditional jump).
Instruction format
To have 3 big addresses, this format is necessary: 24:8, 32, 32 (d:op,a,b). It is
not small but not the largest. The decoding is not excessive.
bop:
add sub mul div mod and or xor shl shr eq ne lt le gt ge
control: jmp jt jf jeq jne
jlt jle jgt jge fun call callt ret efor case
data: ldx
stx mov push
etc:
not sys
(total 37)
bop d a b
== M[d] = M[a] bop M[b]
jmp
d == goto d
jt d
a == if a !=
FALSE goto d
jf d
a == if a ==
FALSE goto d
jxx d a b
== if (a op b) != FALSE goto d
ldx d a b
== M[d] = M[ M[a] + M[b] ]
stx d a b
== M[ M[a] + M[b] ] = M[d]
call d a ads ==
get arity and numlocal (framesize)
1. move parameters to temp (param: d,a and from stack)
2. save locals (numlocal), pc,fp (at stack)
3. update fp, sp
4. move temp to locals (arity)
ret d
fs ==
1. move return value d to M[retval]
2. restore locals , pc,sp,fp
callt d a ads ==
1. move parameters to temp
2. move temp to locals
efor d a b
== M[a]++; if M[a] <= M[b] goto d
case d lo hi ==
jump table is set of displacements
size of table is hi-lo+1
if lo <= d <= hi goto entry[lo-d+1]
else goto end of
table
where current pc is at the "case"
not d
a == M[d] = not M[a]
sys d a b
== syscall d, optional param: a,b
push
d == sp++; M[sp] = M[d]
mov d
a == M[d] = M[a] equiv.
to add d a #0
fun d
a == it stores arity (d), and
numlocal (a)
can be encoded into 24 bits
optional
set&jmp d a b == M[d] =
M[a]; goto b. use when enter for-loop
Opcode encoding
0..9
nop add sub mul div mod and or xor eq
10..19 ne lt le gt ge shl shr not
mov ldx
20..29 stx ud push call callt fun
ret efor case jmp
30..38 jt jf jeq jne jlt jle jgt
jge sys
Activation record in
vm v5
hi
<- sp
.. param
--------
fp'
<- fp
retads
lv
v_n
..
pv
v_1
--------
<- sp'
lo
The current frame stores: fp', retads. There is no need to store sp as
it tracks fp when ret. The pv+lv is saved registers (nlocal). Now
that a local is in an absolute place (M[0]...M[256]), no renaming of
registers during compilation is necessary.
Parameter passing
Two parameters can be passed via "call" instruction. The extras are
pushed to stack (via SS[sp]). Inside v5-vm, an array param[.] is used
to collect actual parameters to be instantiated to registers.
Constants as globals
To reduce the number of distinc instruction, the "immediate" mode can
be eliminated using constants as globals. It means all constants
will be stored in globals. For small constant, -10..300, it can
be stored permanently (immutable) in M[a]..M[b] so that the code
generator can be simplified. There is a direct relation between the
value and its address. For larger constants, they will be allocated and
stored as globals using the symbol table (like enum).
M[a]..M[b] that are initialised by compiler are used to store small
constants m..n. For large constants (out of range m..n) allocate
M[c] which can be retrieved by search symbol table. This way, the
access to symbol table will not be overwhelm as I expect 90% of
constants will be small and not require symbol table to retrieve them
during the compilation.
What should be m and n? -1 is used often, 256 seems to be the
largest small number (as least in our benchmark). So the range
-10..300 should cover most small numbers without being too many.
300 is used to signify a constant so a > 300. Let a = 390, b
will be 700.
-10..300 is M[390]..M[700] or
c is represented by M[400+c]
For large constants, DS is used to store them so they start at 1000
(the same as all other globals allocated by the compiler).
Comments
The call instruction uses 'b' as address to have 32 bits (other control
use 'd'). Some instructions are redundant: gt/ge (reverse arg of le/lt) but it makes code
generation a bit more complicate. Some instruction are
equivalent:
mov == add 0
jmp == jf 0
However making them distinct will make analysis easier. set&jmp is used when
entering for-loop (set initial index value and goto efor). It is also useful
in other situation such as while-loop.
To simplify code generation of "case" we put "jmp .." there, because
the "else" goes to the end of jump table. The jump table makes "case"
instruction a strange object of variable length in the code
segment. If "jmp" is used instead the size will be quite large (3
words per entry).
5 December 2009
Long live the King