converter


Conversion from S-code to SM-code is mostly straight forward as Som4 already generates a suitable code for SM.  Only a few instructions that exist on Som are different from SM because SM is more restricted. 

Conversion

*na not applicable

S-code        SM-code

1  add         xadd    
2  sub         xsub
3  mul         *na (1)
4  div         *na (1)
5  band        xand
6  bor         xor
7  bxor        xxor
8  not         xnot
9  eq          xeq
10 ne          xne
11 lt          xlt
12 le          xle
13 ge          xlt, xnot
14 gt          xle, xnot
15 shl         xshl* (2)
16 shr         xshr* (2)
17 mod         *na (1)
18 ldx         *na (3)
19 stx         *na (3)
20 ret         xret or xretv (4)
[21 - ]        -
22 array       *na
[23 - ]        -
24 get         xget
25 put         xput
26 ld          *na (5)
27 st          *na (5)
28 jmp         xjmp
29 jt          xjt
30 jf          xjf
31 lit         xlit
32 call        xcall
33 callt       xjmp loop (6)
34 inc         xget v, xinc, xput v (7)
35 dec         xget v, xdec, xput v (7)
36 sys         xsys
37 case        *na (8)
38 fun         xfun
42 ads         xlit
43 lda         xld
44 sta         xst

Differences of SM-code from S-code

  1. Arithmetic is more restricted in SM.  There are no: mul, div, mod.
  2. Shift operation in SM is one bit only.  In S-code shift takes the number of bit to be shifted from TOS.  That is checked by the converter in case of known, the shift instruction is immediately preceed with "lit k" into a serie of k shifts. Otherwise it is an error.
  3. No ldx, stx are generated by Som4 compiler as they are converted into Ads.., get index, add, lda/sta. The converter will never see ldx, stx.
  4. Converter needs to generate proper ret or retv.  See more detail in the next section.
  5. Similar to 3, ld/st is not generated by Som4.  Som4 generates Ads .., lda/sta
  6. tail-call, the converter must generates explicit parameter passing and jump to the beginning of the code. See the next section.
  7. inc, dec of S-code are difference from SM-code.  in S-code, inc/dec operates on a local variable.  The SM inc/dec operates on TOS.  So the conversion is:
inc v =>  xget v, xinc, xput v
dec v =>  xget v, xdec, xput v
  1. "case" is not supported by SM chip.  Conversion is possible but too troublesome.

ret/retv

To realise SM, complicate instructions such as "ret" is decomposed into "xret" and "xretv", so that each instruction is simple enough to have a short control sequence for it.  The converter must analyse the call chain to find out whether a function returns a value or not.  This is done in "analyse( )" (see "what does cv3 do?" below). The deduction is based on examining the instruction preceeding the "ret", if it is: st, put, jmp, jt, jf, ret, inc, dec, sys then the function returns a value, otherwise it does not.  Once the status whether the function returns a value or not of the terminal function is known, all functions in the call chain can be deduced.

tail-call

A recursive tail-call will be converted into passing parameters and then instead of a recursive call, use "jump" to the beginning of function reusing the old activation record. The following example shows how a tail-call is converted.

[from hanoi.txt]

to mov n from t2 | other =
  if n == 1
    num[from] = num[from] - 1
    num[t2] = num[t2] + 1
    print from printc 58 print t2 nl
  else  
    other = 6 - from - t2
    mov n-1 from other
    mov 1 from t2
    mov n-1 other t2

in S-code:

     ...

     68 Lit 1        // 1
     69 Get 3        // from
     70 Get 2        // t2
     71 Call mov
     72 Get 4        // n - 1
     73 Lit 1
     74 Sub
     75 Get 1        // other
     76 Get 2        // t2
     77 Callt mov
     78 Ret 5

In SM-code, the last "mov" (77 Callt mov) is converted to:

   38 Fun mov
   40 Get 4
   42 Lit 1

    ...

  121 Lit 1
  124 Get 3
  126 Get 2
  128 Call mov
  131 Get 4
  133 Dec        // n - 1
  134 Get 1      // other
  136 Get 2      // t2
  138 Put 2      // -> t2
  140 Put 3      // -> other
  142 Put 4      // -> n
  144 Jmp 40     // goto begin
  147 Ret 5

Som4 code generator

generate code that is suitable to be converted for smc

ld x              =>   ads x, lda
value, st x       =>   ads x, value, sta
ads,idx,ldx       =>   ads, idx, add, lda
ads,idx,value,stx =>   ads, idx, add, value, sta

for static array, som4 eval() that expression to know the size of array and the actual pointer of that array and generate:
                  =>   ads array, ads pointer, sta

it also outputs the symbol table which include ref to static array in the form:
    field :name,type,ref,arg,lv
    value  name GVAR ref 1 pointer

This information is used by cv3 to generate an efficient code to access the static array.

what does cv3 do?

loadobj()
  loadsymtab()
     which contains sym_entry symtab[] with field:
       name,type,ref,arg,lv   #nsym
  load object code into
    aop[], aarg[], with length #alen

analyse()

  collect function into flist[] with field:
    ref,retv,callee   #nfunc
  deduce whether function is retv by
  findRet()  check if this func is ret, retv, unknown
     if the code before ret is st,put,jmp,jt,jf,ret,inc,dec,sys
     then it is ret
     if it is a call then unknown
     else it is retv
  callchain()  follows its callee, check if it is ret/retv then
    set 1) its caller, 2) itself
  finalise by setting all unknown to ret  

cvA2SM()

  generate sm code
  relocate data segment from som4 to 0, and reloc to a new address
    set by DS
  update reloc ads[]   ip is address of A-code, pc of sm
  at code Ads x, it checks if x is a static array, chkarray() and
    generate  Lit pointer for efficient access.
  shr/shl only in the case of known number of bit to be shifted
    and generate shr/shl^n

doretv()

  scan the body of function and change ret to retv (in sm)

reloc()

  do all CS change address of call,jmp,jt,jf in cbuf[] to sm
    address uses ads[].

listing()

  prcode()
    call, fun use ref to find symbolic name from symtab[]
  prlit()
    print symbolic name of literal, GVAR and GVAR static array
    GVAR its ref will be matched
    GVAR static array  has ARG == 1 and ref matches with LV

SM code optimisation

improve performance  sm3x

Observing the profile of running aes2.txt

xLd 2400
xSt 1950
xshr 820
xshl 1260
xinc 1410
xdec 80
xadd 5591
xsub 40
xand 584
xxor 1004
xeq 40
xlt 199
xle 1260
xRet 84
xHalt 1
xGet 10419
xPut 3282
xSys 39
xLit 7047
xJmp 224
xJt 1459
xJf 40
xCall 84

total 39317 instruction 480719 cycles
inst. fetch 259649 cycles   execution 221070 cycles
read CS 70933 times  DS read 30470  write 22950  total 53420

to improve performance, we concentrate on ld/st indexing (ld, st, add, get) as they are used most frequently in aes2.txt.  for sm3x we add two instructions:

ldxv  load index local var (ads -- value)
  idx = M[fp+v]
  ts = M[ts+idx]

stxv  store index local var (ads, value --)
  idx = M[fp+v]
  a = M[sp] + idx
  sp = sp - 1
  M[a] = ts
  popts

example of use:

 state[i] = state[k]

   lit &state, get i, add, lit &state, get k, add, ld, st
becomes
   lit &state, lit &state, ldxv k, stxv i

these new instructions will save four instructions in the above example.

One alternative that we have tried is to substitute a base address of an array (which is a global variable) with a value in a local variable.

  state[i] = state[k]

becomes, where ss is a local variable
 ss = state
 ss[i] = ss[k]

state[.] is "lit &state", takes 14 clocks.  "get ss" takes 13 clocks, not a significant improvement (but it is easy).

As compilation for stxv affects large number of programs, back to som4, we will consider a simpler modification. from the example:

 state[i] = state[k]

   lit &state, get i, add, lit &state, get k, add, ld, st

the load is easy, just recognise
 
   get k, add, ld => ldxv k

not so easy for the store, so just do what can be done simply:

   get i, add => addv i

which is possible by just looking back one instruction. modify cv3 to do this is easy.  the above example becomes:

   lit &state, addv i, lit &state, ldxv k, st

the saving is not as much as stxv but it still saves one "add".

addv v,
  ff = M[fp+v]
  ts = ts + ff

Result

not counting function "initial" (ads 1335-4636)

sm3
total 38854 instruction 474550 cycles (t2)

sm3x with addv and ldxv
total 34720 instruction 426760 cycles (t1)

speedup   1 - (t1/t2) = 10%

Object code

The object file format (S-code) is :

<no. of symbol>
<symbol table>
<number of code>  each is an integer
<code code .....>

[from hanoi.obj]
8
...
num 8 1000 1 1001
mov 6 19 3 4
run 6 79 0 1
_main 6 111 0 0

133
30239 21 286 1 1 7 278 305 17 542
1 1 7 278 561 17 798 0 0 7 . . .

To read an object code use the following:

void readCode(FILE *fi){
    int i, n, c;
    fscanf(fi,"%d",&n);
    for(i=1; i<=n; i++){
        fscanf(fi,"%d",&c);
        CS[i] = c;
    }
}
where fi = fopen("filename","rt"); .  Remember that CS starts at 1 (not 0).

The object of SM-code is:

<start ads  lenght of code>
<code code ...>
<0 0>

0 223
132 0 209 63 133 1 64 255 66 1 15 254 133 1 64 255
66 2 15 254 133 1 128 0 32 66 2 15 255 133 1 128
...
0 0

8 October 2004
P. Chongstitvatana