Analysing Som v.4.1 compiler
Here is the profile of running the compiler benchmark. Only the
functions that use more than 70K instructions are shown, sorting by the
noi used. (for the full profile see /doc/som41-comp-rep.txt)
name (ref)
cnt noi
lex1
(2112)
9098 713828
fprints
(214)
7950 434521
newcell
(644) 20944
397936
strpack
(308)
5315 271787
outM
(6814)
2 203510
prCode
(6572)
5589 201848
token
(1818)
6195 177840
bop
(5728)
3747 174543
lex
(2778) 10019
162493
genex
(8676)
5526 148805
hash
(1064)
5129 147210
streq
(60)
5265 138446
reloc
(10338)
1 125774
cons
(746) 11200
123200
install
(1208)
5129 107172
outa
(6000)
5741 103338
showCode (6766)
1
95020
ypush
(3318)
8621 94831
ypop
(3346)
8621 94831
newatom
(708)
7770 85470
term
(5368)
4243 71156
compare to som v.3.1 compiler, v41 is faster (less noi) in:
lex1,
lex, fprints, strpack, token, genex, hash, install
newcell, ypush are slower
due to no-stack (must use extra "put").
The conclusion is that the code generator is quite good and the u2
instruction set is reasonable. The som v.4.1 code generator
produces as efficient code as som v.3.1. The other difference is
due to the way the compiler is written. Both compilers are
different because they produce different codes.
To improve the compiler, one must look into how it is written and
change the way the compiler work. The best place to start is to look at
the profile.
improvement
Start with the most frequently used function: (starting with noi
5236307)
1) fprints (string-s.txt), change to use nested if. noi 5231285
2) strpack (string-s.txt), change to use nested if. (1)+(2) noi 5215634
x) newcell (list-s.txt), no way to improve
3) lex1 (token-s.txt), use macro and/or in hot spots. 1..3 noi 5162367
4) outM (icode-s.txt), opt. eqi 0, jf -> jt. 1..4 noi 5133416
5) prCode (icode-s.txt), use macro to decode arg, 1..5 noi 5100835
x) token (token-s.txt), no way to improve
x) lex (token-s.txt), no way to improve
6) bop (parse.som), this is an interesting case. This function is
generated by a parser generator. Hand writing some part of it is not
advisable (as the new one can be generated from a new grammar).
However, to see the effect of using "case" to replace a long sequence
of "if tok == xxx ..." which should have a dramatic speed up, I hand
code just this function. A better way is to write a new parser
generator in Som to generate a "case" list for it. 1..6 noi 4957654
As expected, bop cnt 3747 noi 174543 is reduced to mere 31896, 5
times reduction!
x) genex (gencode-s.txt), is too complex to be tampered with.
x) hash (symtab-s.txt), no way to improve
7) streq (string-s.txt), opt. jmp in macro, jx to lit 0, jt, $z =>
jx.z, 1..7 noi 4969571. This is strange, it is slower than 1..6!
It should not be so as "improv" where additional code for opt. is
appended, is used only 294 times (noi 63343). However the profile
revealed that, streq is improved a bit (noi -1226) but improv is worsen
(noi +4582) so there is net loss. I rewrite "improv" (there is a bug on
cascade or), now noi is 4956131.
x) cons (list-s.txt), no way to improve
x) reloc (main-s.txt), no way to improve
x) install (symtab-s.txt), has a few put.x/get.x, can be
optimised. However, I am not certain that it is totally save to
do so. Therefore, the opt. is not done.
8) term (parse.som), rewrite to use case, its noi drops from (cnt 4243)
71156 to 34903 (1/2), 1..8 noi 4919878
Here is the profile after all improvements above: (see the full
text at /doc/som44-rep.txt)
total noi 4919789
name
(ref)
cnt noi
lex1 (2112)
9098 642395
fprints (214)
7944 417746
newcell (644)
20944 397936
strpack (308)
5315 256136
outM (6848)
2 190918
token (1818)
6195 177840
prCode (6606)
5583 169045
lex (2778) 10019
161959
genex (8736)
5526 148805
hash (1064)
5129 147210
streq (60)
5265 137220
reloc (10396)
1 125645
cons (746)
11200 123200
install (1208)
5129 107172
outa (5996)
5741 103338
showCode (6800)
1 94918
ypush (3306)
8621 94831
ypop (3334)
8621 94831
So the result is 4919878 instructions versus 5236396 in the
beginning. This is 6% reduction in noi. from the effort of
rewriting the source code (an afternoon job) and a bit of code
generator improvement.
Because the aim is to improve it until it is comparable to som v.3.1
(noi 5093134) , the effort is stopped here.
5 Aug 2008