sx2 chip

Data path

sx2 is sx with a stack frame cache of size 4.  The additional units are: v[], spu.  v[] has two wires: vin and vout.  At vin the latch signals are (lv1 lv2 lv3 lv4 lvarg); at vout the mux "w" has the selectors: w.v1 w.v2 w.v3 w.v4 w.arg.  
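The latch/select behaviour of v[] can be sketched in a few lines of Python. This is only an illustrative model, not the chip's implementation: the class name FrameCache, the method names, and the initial register value 0 are assumptions; only the signal and selector names come from the text above.

```python
class FrameCache:
    """Toy model of v[]: registers written from vin, read through mux w onto vout."""

    # Latch signal -> register and mux selector -> register (names from the text).
    LATCHES = {"lv1": "v1", "lv2": "v2", "lv3": "v3", "lv4": "v4", "lvarg": "arg"}
    SELECTORS = {"w.v1": "v1", "w.v2": "v2", "w.v3": "v3", "w.v4": "v4", "w.arg": "arg"}

    def __init__(self):
        # Assumption: all registers start at 0.
        self.regs = {name: 0 for name in self.LATCHES.values()}

    def latch(self, signal, vin):
        """Assert one latch signal: the matching register captures vin."""
        self.regs[self.LATCHES[signal]] = vin

    def vout(self, selector):
        """Assert one selector of mux w: return the value driven onto vout."""
        return self.regs[self.SELECTORS[selector]]


if __name__ == "__main__":
    v = FrameCache()
    v.latch("lv2", 7)          # store 7 in v2
    print(v.vout("w.v2"))      # read v2 back through mux w
```

A dictionary per direction keeps the model close to the wiring: one table for what vin can be latched into, one for what w can select onto vout.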

The spu (sp unit) has the following controls: sp.inc sp.dec sp.bus.  The output wire is "spout".  When sp.inc, the output is sp+1 at phi1 and sp is updated with sp+1 at phi2 (pre-increment).  When sp.dec, the output is sp at phi1 and sp is updated with sp-1 at phi2 (post-decrement).  When sp.bus, sp is updated at phi2 with bus.  Normally spout is sp.  The internal wire in spu is: spin.  The internal muxes are sin, sout.  The sin has two inputs: incrementor/decrementor and bus.  The sout has two inputs: sp+1 and sp.  

sx2 data path
<insert the data path diagram of sx2 here>

For each sp control, the following events are asserted:  
sp.inc
$1 sout.sp+1, sin.sp+1
$2 lsp

sp.dec
$1 sout.sp, sin.sp-1
$2 lsp

sp.bus
$1 sout.sp, sin.bus
$2 lsp
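The three event lists above can be checked with a small two-phase model. This is a sketch under assumptions, not the actual hardware description: it reads $1 as phi1 and $2 as phi2, treats lsp as the latch that loads spin into sp, and the class and method names (Spu, cycle) are hypothetical.

```python
class Spu:
    """Toy two-phase model of the sp unit (spu)."""

    def __init__(self, sp=0):
        self.sp = sp

    def cycle(self, control=None, bus=0):
        """Run one clock: return spout as seen at phi1, update sp at phi2."""
        # $1 (assumed phi1): sout selects spout, sin selects the next sp (spin).
        if control == "sp.inc":
            spout, spin = self.sp + 1, self.sp + 1   # sout.sp+1, sin.sp+1
        elif control == "sp.dec":
            spout, spin = self.sp, self.sp - 1       # sout.sp, sin.sp-1
        elif control == "sp.bus":
            spout, spin = self.sp, bus               # sout.sp, sin.bus
        elif control is None:
            return self.sp                           # no control: spout is sp, sp is held
        else:
            raise ValueError(f"unknown control: {control}")
        # $2 (assumed phi2): lsp latches spin into sp.
        self.sp = spin
        return spout


if __name__ == "__main__":
    u = Spu(10)
    print(u.cycle("sp.inc"), u.sp)   # pre-increment: output and sp are both sp+1
    print(u.cycle("sp.dec"), u.sp)   # post-decrement: output is old sp, sp becomes sp-1
```

Returning spout before committing sp mirrors the difference between sp.inc (the new value is visible in the same cycle) and sp.dec (the old value is visible, the update lands afterwards).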

14 July 2006
(Chulalongkorn Graduation Ceremony)

Performance

performance of sx1 vs sx2

sx2 uses fun.p.u.k, and if p=0
The compiler is som16, which does tail-call elimination, macros, and inc/dec.

noi = number of instructions, clk = number of clocks, cpi = clk/noi

                      sx1                       sx2
program         noi      clk   cpi       noi      clk   cpi

bubble 20     10068    44214  4.39     10262    32090  3.13
hanoi 6        2312    10092  4.37      2377     7544  3.17
matmul 4x4     3043    12880  4.23      3097     9348  3.02
perm 4         4868    20932  4.30      4935    14663  2.97
queen all    618665  2576210  4.16    620724  1717782  2.77
quick 20       3172    13539  4.27      3224     9551  2.96
sieve 1000    28026   124338  4.44     28029    75204  2.68
aes4 som16    30579   131560  4.29     30724    90498  2.95

average clock ratio sx2/sx1 = 0.70
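The derived figures can be recomputed from the raw counts. The sketch below assumes that cpi is clk/noi and that the average is the unweighted mean over the eight programs; the variable names are illustrative only.

```python
# (program, sx1 noi, sx1 clk, sx2 noi, sx2 clk), copied from the table above
RESULTS = [
    ("bubble 20",   10068,   44214,  10262,   32090),
    ("hanoi 6",      2312,   10092,   2377,    7544),
    ("matmul 4x4",   3043,   12880,   3097,    9348),
    ("perm 4",       4868,   20932,   4935,   14663),
    ("queen all",  618665, 2576210, 620724, 1717782),
    ("quick 20",     3172,   13539,   3224,    9551),
    ("sieve 1000",  28026,  124338,  28029,   75204),
    ("aes4 som16",  30579,  131560,  30724,   90498),
]


def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


# Per-program clock ratio sx2/sx1, then the unweighted average.
clock_ratios = [clk2 / clk1 for _, _, clk1, _, clk2 in RESULTS]
avg_clock_ratio = mean(clock_ratios)

# Per-program cycles per instruction for each machine.
avg_cpi_sx1 = mean(clk1 / noi1 for _, noi1, clk1, _, _ in RESULTS)
avg_cpi_sx2 = mean(clk2 / noi2 for _, _, _, noi2, clk2 in RESULTS)

if __name__ == "__main__":
    print(f"average clock ratio sx2/sx1 = {avg_clock_ratio:.2f}")
    print(f"average cpi: sx1 = {avg_cpi_sx1:.2f}, sx2 = {avg_cpi_sx2:.2f}")
```

An unweighted mean gives every benchmark the same say; a mean weighted by instruction count would be dominated by queen, which is far longer than the other programs.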

Conclusion

In terms of number of clocks, sx2 uses 30% fewer than sx1.
The average cpi is about 4.3 for sx1 and about 3.0 for sx2 (mean of the cpi columns: 4.31 and 2.96).

Comparing the size of the microprogram:

      width  length  width x length
sx1      38      62  38 x 62 = 2356
sx2      71      74  71 x 74 = 5254

The sx2 control size is about 2.2 times that of sx1 (5254/2356).
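The size comparison is a simple product of width and length. A short check, with the dictionary and function names chosen only for illustration:

```python
# Microprogram dimensions (width, length) as listed above.
MICROPROGRAM = {"sx1": (38, 62), "sx2": (71, 74)}


def control_size(width, length):
    """Size of the microprogram as width times length."""
    return width * length


sizes = {chip: control_size(*dims) for chip, dims in MICROPROGRAM.items()}
size_ratio = sizes["sx2"] / sizes["sx1"]

if __name__ == "__main__":
    print(sizes)
    print(f"sx2/sx1 = {size_ratio:.2f}")
```

Both dimensions grow in sx2, but most of the increase comes from the width (38 to 71), not the length (62 to 74).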

  8 August 2006