converter
Conversion from S-code to SM-code is mostly straight forward as Som4
already generates a suitable code for SM. Only a few instructions
that exist on Som are different from SM because SM is more
restricted.
Conversion
*na not applicable
S-code
SM-code
1
add xadd
2
sub xsub
3
mul *na (1)
4
div *na (1)
5 band
xand
6
bor xor
7
bxor xxor
8
not xnot
9
eq xeq
10 ne
xne
11
lt xlt
12
le xle
13
ge xlt, xnot
14
gt xle, xnot
15
shl xshl* (2)
16 shr
xshr* (2)
17
mod *na (1)
18
ldx *na (3)
19
stx *na (3)
20
ret xret or xretv (4)
[21 -
] -
22 array
*na
[23 -
] -
24
get xget
25
put xput
26 ld
*na (5)
27
st *na (5)
28 jmp
xjmp
29
jt xjt
30
jf xjf
31
lit xlit
32
call xcall
33 callt
xjmp loop (6)
34
inc xget v, xinc, xput v (7)
35
dec xget v, xdec, xput v (7)
36
sys xsys
37
case *na (8)
38
fun xfun
42
ads xlit
43
lda xld
44
sta xst
Differences of
SM-code from S-code
- Arithmetic is more restricted in SM. There are no: mul,
div, mod.
- Shift operation in SM is one bit only. In S-code shift
takes the number of bit to be shifted from TOS. That is checked
by the converter in case of known, the shift instruction is immediately
preceed with "lit k" into a serie of k shifts. Otherwise it is an error.
- No ldx, stx are generated by Som4 compiler as they are converted
into Ads.., get index, add, lda/sta. The converter will never see ldx,
stx.
- Converter needs to generate proper ret or retv. See more
detail in the next section.
- Similar to 3, ld/st is not generated by Som4. Som4
generates Ads .., lda/sta
- tail-call, the converter must generates explicit parameter
passing and jump to the beginning of the code. See the next section.
- inc, dec of S-code are difference from SM-code. in S-code,
inc/dec operates on a local variable. The SM inc/dec operates on
TOS. So the conversion is:
inc
v => xget v, xinc, xput v
dec v => xget v, xdec,
xput v
- "case" is not supported by SM chip. Conversion is possible
but too troublesome.
ret/retv
To realise SM, complicate instructions such as "ret" is decomposed into
"xret" and "xretv", so that each instruction is simple enough to have a
short control sequence for it. The converter must analyse the
call chain to find out whether a function returns a value or not.
This is done in "analyse( )" (see "what does cv3 do?" below). The
deduction is based on examining the instruction preceeding the "ret",
if it is: st, put, jmp, jt, jf, ret, inc, dec, sys then the function
returns a value, otherwise it does not. Once the status whether
the function returns a value or not of the terminal function is known,
all functions in the call chain can be deduced.
tail-call
A recursive tail-call will be converted into passing parameters and
then instead of a recursive call, use "jump" to the beginning of
function reusing the old activation record. The following example shows
how a tail-call is converted.
[from hanoi.txt]
to
mov n from t2 | other =
if n == 1
num[from] =
num[from] - 1
num[t2] =
num[t2] + 1
print from
printc 58 print t2 nl
else
other = 6 -
from - t2
mov n-1 from
other
mov 1 from t2
mov n-1 other
t2
in S-code:
...
68 Lit
1 // 1
69 Get
3 // from
70 Get
2 // t2
71 Call
mov
72 Get
4 // n - 1
73 Lit 1
74 Sub
75 Get
1 // other
76 Get
2 // t2
77 Callt
mov
78 Ret 5
In SM-code, the last "mov" (77 Callt mov) is converted to:
38 Fun mov
40 Get 4
42 Lit 1
...
121 Lit 1
124 Get 3
126 Get 2
128 Call mov
131 Get 4
133 Dec
// n - 1
134 Get 1
// other
136 Get 2
// t2
138 Put 2
// -> t2
140 Put 3
// -> other
142 Put 4
// -> n
144 Jmp 40
// goto begin
147 Ret 5
Som4 code generator
generate code that is suitable to be converted for smc
ld
x
=> ads x, lda
value, st
x => ads x, value,
sta
ads,idx,ldx
=> ads, idx, add, lda
ads,idx,value,stx
=> ads, idx, add, value, sta
for static array, som4 eval() that expression to know the size of array
and the actual pointer of that array and generate:
=> ads array, ads
pointer, sta
it also outputs the symbol table which include ref to static array in
the form:
field
:name,type,ref,arg,lv
value
name GVAR ref 1 pointer
This information is used by cv3 to generate an efficient code to access
the static array.
what does cv3 do?
loadobj()
loadsymtab()
which
contains sym_entry symtab[] with field:
name,type,ref,arg,lv #nsym
load object code into
aop[], aarg[],
with length #alen
analyse()
collect function into
flist[] with field:
ref,retv,callee #nfunc
deduce whether function is
retv by
findRet() check if
this func is ret, retv, unknown
if the
code before ret is st,put,jmp,jt,jf,ret,inc,dec,sys
then it
is ret
if it is
a call then unknown
else it
is retv
callchain() follows
its callee, check if it is ret/retv then
set 1) its
caller, 2) itself
finalise by setting all
unknown to ret
cvA2SM()
generate sm code
relocate data segment from
som4 to 0, and reloc to a new address
set by DS
update reloc
ads[] ip is address of A-code, pc of sm
at code Ads x, it checks
if x is a static array, chkarray() and
generate
Lit pointer for efficient access.
shr/shl only in the case
of known number of bit to be shifted
and generate
shr/shl^n
doretv()
scan the body of function
and change ret to retv (in sm)
reloc()
do all CS change address
of call,jmp,jt,jf in cbuf[] to sm
address uses
ads[].
listing()
prcode()
call, fun use
ref to find symbolic name from symtab[]
prlit()
print symbolic
name of literal, GVAR and GVAR static array
GVAR its ref
will be matched
GVAR static
array has ARG == 1 and ref matches with LV
SM code optimisation
improve performance sm3x
Observing the profile of running aes2.txt
xLd
2400
xSt 1950
xshr 820
xshl 1260
xinc 1410
xdec 80
xadd 5591
xsub 40
xand 584
xxor 1004
xeq 40
xlt 199
xle 1260
xRet 84
xHalt 1
xGet 10419
xPut 3282
xSys 39
xLit 7047
xJmp 224
xJt 1459
xJf 40
xCall 84
total 39317 instruction 480719 cycles
inst. fetch 259649 cycles execution 221070 cycles
read CS 70933 times DS read 30470 write 22950 total
53420
to improve performance, we concentrate on ld/st indexing (ld, st, add,
get) as they are used most frequently in aes2.txt. for sm3x we
add two instructions:
ldxv
load index local var (ads -- value)
idx = M[fp+v]
ts = M[ts+idx]
stxv
store index local var (ads, value --)
idx = M[fp+v]
a = M[sp] + idx
sp = sp - 1
M[a] = ts
popts
example of use:
state[i] = state[k]
lit &state, get i, add, lit
&state, get k, add, ld, st
becomes
lit &state, lit &state, ldxv
k, stxv i
these new instructions will save four instructions in the above example.
One alternative that we have tried is to substitute a base address of
an array (which is a global variable) with a value in a local variable.
state[i] = state[k]
becomes, where ss is a local variable
ss
= state
ss[i] = ss[k]
state[.] is "lit &state", takes 14 clocks. "get ss" takes 13
clocks, not a significant improvement (but it is easy).
As compilation for stxv affects large number of programs, back to som4,
we will consider a simpler modification. from the example:
state[i] = state[k]
lit &state, get i, add, lit
&state, get k, add, ld, st
the load is easy, just recognise
get k, add, ld => ldxv k
not so easy for the store, so just do what can be done simply:
get i, add => addv i
which is possible by just looking back one instruction. modify cv3 to
do this is easy. the above example becomes:
lit &state, addv i, lit
&state, ldxv k, st
the saving is not as much as stxv but it still saves one "add".
addv v,
ff = M[fp+v]
ts = ts + ff
Result
not counting function "initial" (ads 1335-4636)
sm3
total 38854 instruction 474550 cycles (t2)
sm3x with addv and ldxv
total 34720 instruction 426760 cycles (t1)
speedup 1 - (t1/t2) = 10%
Object code
The object file format (S-code) is :
<no. of symbol>
<symbol table>
<number of code> each is an integer
<code code .....>
[from hanoi.obj]
8
...
num 8 1000 1 1001
mov 6 19 3 4
run 6 79 0 1
_main 6 111 0 0
133
30239 21 286 1 1 7 278 305 17 542
1 1 7 278 561 17 798 0 0 7 . . .
To read an object code use the following:
void
readCode(FILE *fi){
int i, n, c;
fscanf(fi,"%d",&n);
for(i=1;
i<=n; i++){
fscanf(fi,"%d",&c);
CS[i] = c;
}
}
where fi = fopen("filename","rt");
. Remember that CS starts at 1 (not 0).
The object of SM-code is:
<start ads lenght of code>
<code code ...>
<0 0>
0
223
132 0 209 63 133 1 64 255 66 1 15
254 133 1 64 255
66 2 15 254 133 1 128 0 32 66 2
15 255 133 1 128
...
0 0
8 October 2004
P. Chongstitvatana