Som v4.0

The 2007 series of Som (v3.0 and v3.1) is very exciting. Both versions are fast, with a new instruction set and an improved compiler. With that performance comes complexity: the sx-code of Som v3.1 has 93 instructions and needs a complicated code conversion to make use of the tos (top-of-stack) register. I want to retain the performance of the 2007 series, but I also want to make the instruction set as simple as the original s-code (at least in terms of the number of instructions). To this end I designed an accumulator-based instruction set with a one-address format.

Som v4.0 uses a new vm based on u-code. It achieves two objectives:
1) The u-code instruction set is as simple as the original s-code. It has only 43 instructions with a consistent format.
2) The compiler is much simpler than that of Som v3.1 and it produces fast code. Its running time on the benchmark programs is faster than Som v3.0 and comparable to Som v3.1 (v3.1 is 10% faster).
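
To give a feel for what "accumulator-based, one-address" means, here is a minimal sketch of a toy accumulator machine in Python. It is not the real u-code: the opcode names (lit, get, put, add, mul) and the dictionary used as memory are hypothetical, chosen only for illustration. Each instruction names at most one operand; the other operand is always the accumulator.

```python
# Toy accumulator machine. NOT the real u-code: the opcode names
# (lit, get, put, add, mul) are hypothetical, for illustration only.

def run(program, memory):
    """Execute a list of (op, arg) pairs and return the final accumulator."""
    acc = 0
    for op, arg in program:
        if op == "lit":        # acc = constant
            acc = arg
        elif op == "get":      # acc = memory[arg]
            acc = memory[arg]
        elif op == "put":      # memory[arg] = acc
            memory[arg] = acc
        elif op == "add":      # acc = acc + memory[arg]
            acc += memory[arg]
        elif op == "mul":      # acc = acc * memory[arg]
            acc *= memory[arg]
        else:
            raise ValueError(f"unknown op: {op}")
    return acc

# Compute a*b + c*d. The first product must be saved in a temporary t
# with "put" before the accumulator can be reused for the second product.
memory = {"a": 2, "b": 3, "c": 4, "d": 5, "t": 0}
program = [
    ("get", "a"), ("mul", "b"), ("put", "t"),
    ("get", "c"), ("mul", "d"), ("add", "t"),
]
result = run(program, memory)
```

The "put" in the middle of the program is the kind of data movement through the accumulator that a one-address format requires whenever an intermediate value has to be kept.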

The Som v4.0 compiler resembles Som v3.0 more than Som v3.1, because v3.1 has complex handling of conversion and forward calls. The v4.0 symbol table is much better than in any previous version (see doc/som-v35-symtab.txt). Allocating temporary variables turned out to be much simpler than expected. Optimisation is done more thoroughly than in any previous version.

How to

The executable code is in the directory /test. To compile and run do:

> som40 source.txt

The compiler uses som40.obj (the som-compiler object code), and lib2.som must be present in the same directory as the source files. Here are the options:

> som40 -?
som40          interactive mode
som40 file     compile and run, output .lst and .obj
som40 -x file  execute obj file
som40 -?       this help

The som-compiler is in the directory /som. For a more detailed discussion of the code development see /doc.

Benchmark result

v4.0 is compared to v3.0 and v3.1, that is, u-code is compared with sx-code. bubble sorts 20 items. matmul multiplies 8x8 matrices. queen is an old version without macros. queen2 uses macros and finds all solutions (92). quick sorts 100 items.

    Comparing v3, v31 and v4 by noi (number of instructions executed)
            v3        v31       v4
bubble     6594      6152      6172
matmul     9345      8526     10982
queen    371190    353480    418401
queen2   242974    225264    236233
quick     32167     32066     26471

Normalised to v3

            v3  v31/v3   v4/v3
bubble    1.00    0.93    0.94
matmul    1.00    0.91    1.18
queen     1.00    0.95    1.13
queen2    1.00    0.93    0.97
quick     1.00    1.00    0.82
average   1.00    0.94    1.01
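
The normalised table can be reproduced from the raw counts. The following is a small sketch in Python (the names noi and normalise are mine, not part of Som); it divides each count by the v3 count and averages the ratios over the five benchmarks.

```python
# Raw noi figures from the table above: (v3, v31, v4) per benchmark.
noi = {
    "bubble": (6594, 6152, 6172),
    "matmul": (9345, 8526, 10982),
    "queen":  (371190, 353480, 418401),
    "queen2": (242974, 225264, 236233),
    "quick":  (32167, 32066, 26471),
}

def normalise(table):
    """Return {benchmark: (v31/v3, v4/v3)}."""
    return {name: (v31 / v3, v4 / v3) for name, (v3, v31, v4) in table.items()}

ratios = normalise(noi)

# Arithmetic mean of the per-benchmark ratios (the "average" row).
avg_v31 = sum(r[0] for r in ratios.values()) / len(ratios)
avg_v4 = sum(r[1] for r in ratios.values()) / len(ratios)

for name, (r31, r4) in ratios.items():
    print(f"{name:<8}  1.00  {r31:6.2f}  {r4:6.2f}")
print(f"{'average':<8}  1.00  {avg_v31:6.2f}  {avg_v4:6.2f}")
```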

In terms of the number of instructions executed, v3.0 and v4.0 are about the same, and v3.1 is slightly better than both. v3.0 and v3.1 share the same instruction set, except that v3.1 adds a few "enhanced" instructions which improve the speed of execution by using the top-of-stack. The difference in noi between v3.0 and v3.1 is due to the compiler. v4.0 has a different instruction set; it uses more "put" instructions to move things around the accumulator.

Comparing the running time of v3.0, v3.1 and v4.0 (ms on a Dell D500, 1.3GHz Pentium M):

                 v3    v31     v4
bubble x1000    251    101    110
matmul x1000    358    150    221
queen x100      861    701    871
queen2 x100     591    441    471
quick x1000     763    470    411

The speedup of v4.0 over v3.0 is (1 - v4/v3), with the ratio averaged over the five benchmarks: v4.0 is 32% faster than v3.0. However, v4.0 is slower than v3.1: the speedup of v3.1 over v4.0, (1 - v31/v4), is 10%.
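
As a check on the speedup figures, here is a short Python sketch. It assumes the figures are the arithmetic mean of the per-benchmark time ratios; the averaging method is not stated, but this assumption reproduces both 32% and 10% from the table. The names times and mean_ratio are mine.

```python
# Running times in ms from the table above: (v3, v31, v4) per benchmark.
times = {
    "bubble": (251, 101, 110),
    "matmul": (358, 150, 221),
    "queen":  (861, 701, 871),
    "queen2": (591, 441, 471),
    "quick":  (763, 470, 411),
}

def mean_ratio(pairs):
    """Arithmetic mean of numerator/denominator over (num, den) pairs."""
    pairs = list(pairs)
    return sum(num / den for num, den in pairs) / len(pairs)

# Speedup of v4 over v3: 1 - mean(v4/v3)
speedup_v4_over_v3 = 1 - mean_ratio((v4, v3) for v3, _, v4 in times.values())

# Speedup of v31 over v4: 1 - mean(v31/v4)
speedup_v31_over_v4 = 1 - mean_ratio((v31, v4) for _, v31, v4 in times.values())

print(f"v4 over v3:  {speedup_v4_over_v3:.0%}")
print(f"v31 over v4: {speedup_v31_over_v4:.0%}")
```

Note that a ratio of total times would give a different figure, because the benchmarks have very different absolute running times; the mean of ratios weights each benchmark equally.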

Object file format of Som v4.0

magic                             5678940
start end (op arg)*               code segment
start end data*                   data segment
size (name type ref arity lv)*    symbol table

Number of instructions

s-code     40
sx-code    85
som 3.1    93
u-code     43

2 July 2008
<correcting the speedup figures 5 Aug 2008>