Runtime performance of som 4.0

This note compares the running time of som 4.0 with that of som 3.1.

Baseline: som 3.1 (som16t, figures from fast-eval.txt)
Dell D500
som compiles itself once (with output)   222 ms.
(no output)
bubble.txt  x1000  134 ms.
queen2.txt  x100   557 ms.

Running time on a Toshiba Satellite (Windows 98),
with all optimisations

quick (x10000)      1626 ms.
hanoi (x10000)      1176 ms.
bubble (x10000)     4560 ms.
queen2 (x100)       1604 ms.
compiler             650 ms.
compiler (no output) 386 ms.

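Scale the Toshiba times to the Dell D500, using bubble, queen2 and the compiler
(factor = Dell time / Toshiba time).  The Dell bubble figure is for x1000, so the
Toshiba x10000 time is divided by 10 (4560/10 = 456).

bubble   134/456  = 0.29
queen2   557/1604 = 0.35
compiler 222/650  = 0.34
average             0.32  (0.327 truncated to two digits)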

So the som16t baseline, scaled to the Dell D500 (Toshiba time x 0.32), is:

quick (x10000)       520 ms.
hanoi (x10000)       376 ms.
bubble (x10000)     1459 ms.
queen2 (x100)        513 ms.
compiler             208 ms.
compiler (no output) 123 ms.
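The scaling above can be checked with a short script.  This is only a sketch in Python (not part of the original measurements); the dictionary names are made up for illustration, and the figures are copied from the tables in this note.  It assumes the scaled times were truncated to whole milliseconds, which is what reproduces the column above.

```python
# Times in ms, copied from this note.
dell_som31 = {"bubble": 134, "queen2": 557, "compiler": 222}
toshiba = {
    "quick": 1626,
    "hanoi": 1176,
    "bubble": 4560,
    "queen2": 1604,
    "compiler": 650,
    "compiler (no output)": 386,
}

# The Dell bubble run is x1000 and the Toshiba run is x10000,
# so the Toshiba bubble time is divided by 10 before comparing.
toshiba_same_work = {
    "bubble": toshiba["bubble"] / 10,
    "queen2": toshiba["queen2"],
    "compiler": toshiba["compiler"],
}

factors = {name: dell_som31[name] / toshiba_same_work[name] for name in dell_som31}
mean_factor = sum(factors.values()) / len(factors)

# The note uses 0.32.  Integer arithmetic (x 32 // 100) avoids floating-point
# noise and truncates to whole ms (assumption: the note truncated as well).
SCALE_PERCENT = 32
scaled = {name: t * SCALE_PERCENT // 100 for name, t in toshiba.items()}

for name, f in factors.items():
    print(f"{name:10s} factor {f:.2f}")
print(f"mean factor {mean_factor:.3f}")
for name, t in scaled.items():
    print(f"{name:22s} {t:5d} ms")
```

The mean factor comes out slightly above 0.32; the script keeps 0.32 so that the scaled column matches the one in this note.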

som 4.0 times, measured with clock() from time.h so that only run time is counted.
No optimisation in the C compiler.  Each line gives three runs and their mean.

bubble (x10000)  1132, 1142, 1131   mean 1135 ms.
quick (x10000)    401,  400,  401   mean  401 ms.
hanoi (x10000)    321,  321,  330   mean  324 ms.
queen2 (x100)     491,  521,  480   mean  497 ms.
compiler          281,  280,  291   mean  284 ms.
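The measurement pattern (time only the run itself, repeat three times, take the mean) can be sketched as follows.  This is a Python illustration, not the actual harness: time.process_time() stands in for C's clock(), and bubble_sort, time_ms and mean are hypothetical names with a stand-in workload.

```python
import time


def bubble_sort(values):
    """Plain bubble sort, used here only as a stand-in workload."""
    data = list(values)
    n = len(data)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data


def time_ms(workload, repeat, runs=3):
    """Return the CPU time in ms of each of `runs` runs of `repeat` calls."""
    samples = []
    for _ in range(runs):
        start = time.process_time()        # CPU time, like clock() in C
        for _ in range(repeat):
            workload()
        samples.append((time.process_time() - start) * 1000.0)
    return samples


def mean(xs):
    return sum(xs) / len(xs)


# Setup (building the input) happens outside the timed region.
unsorted = list(range(20, 0, -1))
samples = time_ms(lambda: bubble_sort(unsorted), repeat=1000)
print("runs:", ", ".join(f"{s:.0f}" for s in samples), "mean:", f"{mean(samples):.0f} ms")
```

Taking the timestamps immediately around the loop keeps loading and set-up cost out of the figure, which is the point of measuring "only run time".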

Note: when compiling itself, the som 4 compiler executes more instructions (noi) than the som 3.1 compiler: 5857067 vs. 5197696 (from som-v31-improve.txt).  It is therefore somewhat slower.  However, the som 4 and som 3.1 compilers are not the same program, and I have not yet analysed the differences between them.

Ratio of running times (time_som4/time_som31), with som 3.1 taken from the scaled baseline:

bubble      77.8%
quick       77.1%   
hanoi       86.1%
queen2      96.8%
compiler   136.4%

Excluding the compiler, the mean ratio over the four benchmarks is about 84.5%, i.e. the running time is reduced by (1 - ratio) = 15.5%.
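The ratios and the 15.5% figure can be recomputed from the two tables.  A small Python sketch follows, with the times copied from this note and illustrative variable names; the last digit of some ratios may differ from the table depending on rounding.

```python
# Mean som 4.0 times and scaled som 3.1 baseline, both in ms on the Dell D500.
som4_ms = {"bubble": 1135, "quick": 401, "hanoi": 324, "queen2": 497, "compiler": 284}
som31_ms = {"bubble": 1459, "quick": 520, "hanoi": 376, "queen2": 513, "compiler": 208}

ratios = {name: som4_ms[name] / som31_ms[name] for name in som4_ms}

# The compiler is left out of the average because the two compilers
# are different programs.
benchmarks = [name for name in ratios if name != "compiler"]
mean_ratio = sum(ratios[name] for name in benchmarks) / len(benchmarks)
reduction = 1.0 - mean_ratio

for name, r in ratios.items():
    print(f"{name:10s} {100 * r:6.1f}%")
print(f"mean ratio (no compiler) {100 * mean_ratio:.1f}%")
print(f"reduction in run time    {100 * reduction:.1f}%")
```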

In terms of the number of instructions executed (noi), som16y+ and som16y- are used as the baseline.  som 3.1 appears to be derived from this version.

            16y+  16y-   4.0
   
bubble     5434   6594   6172 
hanoi      1395   1647   1403
matmul2   10106  10106  10564
perm       2432   2826    --
queen2   223001 244927 236237
quick      2089   2406   2071
sieve      7209   8446   7997
aes4      18217  20559    --
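To see where som 4.0 falls relative to the two baselines, each row of the table can be classified.  This Python sketch uses the noi figures from the table above; the function name position is made up for illustration, and perm and aes4 are left out because they have no som 4.0 figure.

```python
# noi per benchmark: (som16y+, som16y-, som 4.0), copied from the table.
noi = {
    "bubble": (5434, 6594, 6172),
    "hanoi": (1395, 1647, 1403),
    "matmul2": (10106, 10106, 10564),
    "queen2": (223001, 244927, 236237),
    "quick": (2089, 2406, 2071),
    "sieve": (7209, 8446, 7997),
}


def position(plus, minus, v4):
    """Say whether the som 4.0 count lies between the two baselines."""
    lo, hi = min(plus, minus), max(plus, minus)
    if v4 < lo:
        return "below both"
    if v4 > hi:
        return "above both"
    return "between"


positions = {name: position(*counts) for name, counts in noi.items()}
for name, (plus, minus, v4) in noi.items():
    print(f"{name:8s} 4.0/16y+ = {v4 / plus:.3f}  4.0/16y- = {v4 / minus:.3f}  {positions[name]}")
```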

Conclusion

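In terms of running time, som 4.0 is about 15% faster than som 3.1 on the four benchmarks (the compiler itself is slower).  In terms of noi, it is comparable: on bubble, hanoi, queen2 and sieve it lies between som 16y+ and som 16y-, on matmul2 it is slightly above both, and on quick slightly below both.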

som v4.0 achieves its objectives:
1) a small number of instructions (close to s-code)
2) at least as fast as som v3.1 (about 15% faster)

Number of instructions (size of the instruction set)

s-code   40
sx-code  85
som 3.1  93
som 4.0  43

I should further analyse the noi of the compiler: why is it higher than that of the som 3.1 compiler (12.7% more, 5857067 vs. 5197696)?  The som 4.0 compiler is much less complex than the som 3.1 compiler: it has no rename, no converter, etc.

1 July 2008
