sx2 chip
Data path
sx2 is sx with stack frame caching of size 4. The
additional units are v[] and spu. v[] has two wires, vin and vout. At
vin the latch signals are lv1 lv2 lv3 lv4 lvarg; at vout the mux "w"
has selectors w.v1 w.v2 w.v3 w.v4 w.arg.
The spu (sp unit) has the following controls: sp.inc sp.dec
sp.bus. The output wire is "spout". When sp.inc, the output is
sp+1 at phi1 and sp is updated at phi2 (pre-increment). When sp.dec,
the output at phi1 is sp, and at phi2 sp is updated with sp-1
(post-decrement). When sp.bus, sp is updated at phi2 with bus. Normally
spout is sp. The internal wire in spu is spin. The internal muxes
are sin and sout. sin has two inputs: incrementor/decrementor
and bus. sout has two inputs: sp+1 and sp.

<insert the data path diagram of sx2 here>
The following events are asserted for each spu control:
sp.inc
$1 sout.sp+1, sin.sp+1
$2 lsp
sp.dec
$1 sout.sp, sin.sp-1
$2 lsp
sp.bus
$1 sout.sp, sin.bus
$2 lsp
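
The event lists above can be checked with a small behavioural sketch. It is not the actual sx2 hardware description: it assumes that $1 and $2 correspond to phi1 and phi2, that lsp latches the sin mux output (the spin wire) into sp, and it ignores the register width. The class name Spu and the method step are hypothetical.

```python
# Behavioural sketch of the spu (assumptions: $1 = phi1, $2 = phi2,
# lsp latches the sin mux output into sp, no wrap-around on overflow).

# Mux selections asserted at $1 for each control, copied from the event lists.
EVENTS = {
    "sp.inc": {"sout": "sp+1", "sin": "sp+1"},
    "sp.dec": {"sout": "sp", "sin": "sp-1"},
    "sp.bus": {"sout": "sp", "sin": "bus"},
}


class Spu:
    """Hypothetical model of the sp unit: one register, two muxes."""

    def __init__(self, sp=0):
        self.sp = sp

    def step(self, control=None, bus=0):
        """Run one clock and return the value seen on spout at phase 1."""
        if control is None:
            # No control asserted: spout is sp and sp is not latched.
            return self.sp
        ev = EVENTS[control]
        # $1: the sout mux drives spout, the sin mux drives spin.
        spout = self.sp + 1 if ev["sout"] == "sp+1" else self.sp
        spin = {"sp+1": self.sp + 1, "sp-1": self.sp - 1, "bus": bus}[ev["sin"]]
        # $2: lsp latches spin into sp.
        self.sp = spin
        return spout


if __name__ == "__main__":
    spu = Spu(10)
    print(spu.step("sp.inc"), spu.sp)           # pre-increment
    print(spu.step("sp.dec"), spu.sp)           # post-decrement
    print(spu.step("sp.bus", bus=100), spu.sp)  # load from bus
```

With sp.inc the new value is already visible on spout in the same clock, while with sp.dec the old value is visible and the decrement takes effect afterwards. This matches the pre-increment and post-decrement behaviour described in the text.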
14 July 2006
(Chulalongkorn Graduation Ceremony)
Performance
Performance of sx1 vs sx2
sx2 uses fun.p.u.k, and if p=0
The compiler is som16, which does tail call elimination, macros, and
inc/dec.
program         sx1                        sx2
                noi      clk       cpi     noi      clk       cpi
bubble 20       10068    44214     4.39    10262    32090     3.13
hanoi 6         2312     10092     4.37    2377     7544      3.17
matmul 4x4      3043     12880     4.23    3097     9348      3.02
perm 4          4868     20932     4.30    4935     14663     2.97
queen all       618665   2576210   4.16    620724   1717782   2.77
quick 20        3172     13539     4.27    3224     9551      2.96
sieve 1000      28026    124338    4.44    28029    75204     2.68
aes4 som16      30579    131560    4.29    30724    90498     2.95
Average clock ratio sx2/sx1 = 0.70
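
The summary figures can be recomputed from the measurements with a short sketch. It assumes that noi is the instruction count, that cpi = clk / noi, and that the averages are plain arithmetic means over the eight programs. The names RESULTS and summarize are hypothetical.

```python
# Recompute cpi and the sx2/sx1 clock ratio from the measurements.
# Assumptions: noi is the instruction count, cpi = clk / noi, and the
# averages are unweighted means over the eight programs.

# (program, sx1 noi, sx1 clk, sx2 noi, sx2 clk), copied from the results.
RESULTS = [
    ("bubble 20", 10068, 44214, 10262, 32090),
    ("hanoi 6", 2312, 10092, 2377, 7544),
    ("matmul 4x4", 3043, 12880, 3097, 9348),
    ("perm 4", 4868, 20932, 4935, 14663),
    ("queen all", 618665, 2576210, 620724, 1717782),
    ("quick 20", 3172, 13539, 3224, 9551),
    ("sieve 1000", 28026, 124338, 28029, 75204),
    ("aes4 som16", 30579, 131560, 30724, 90498),
]


def summarize(rows):
    """Return (mean sx2/sx1 clock ratio, mean sx1 cpi, mean sx2 cpi)."""
    n = len(rows)
    ratio = sum(clk2 / clk1 for _, _, clk1, _, clk2 in rows) / n
    cpi1 = sum(clk1 / noi1 for _, noi1, clk1, _, _ in rows) / n
    cpi2 = sum(clk2 / noi2 for _, _, _, noi2, clk2 in rows) / n
    return ratio, cpi1, cpi2


if __name__ == "__main__":
    ratio, cpi1, cpi2 = summarize(RESULTS)
    print(f"clock sx2/sx1 = {ratio:.2f}, cpi sx1 = {cpi1:.2f}, cpi sx2 = {cpi2:.2f}")
```

Under these assumptions the mean clock ratio rounds to 0.70 and the mean sx1 cpi to about 4.3. The mean sx2 cpi comes out just under 3.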
Conclusion
In terms of clock count, sx2 takes 30% fewer clocks than sx1.
The average cpi is 4.3 for sx1 and 2.9 for sx2.
Comparing the size of the microprogram (width x length):
sx1  38 x 62 = 2356
sx2  71 x 74 = 5254
The sx2 control size is about double that of sx1 (5254/2356 is roughly 2.2).
8 August 2006