to improve compiler

there are a lot of repeat scanning of object code for:
1)  rename
2)  convert to executable
3)  reloc

All of these tasks can be done on a single scan of object code provided that the object file and the listing file are slightly changed.  A function will be process to completion one-by-one including: rename, convert, reloc.  Because a forward function definition may defer a function to be processed, the order of completion may not follow the order of appearance in the source file.  Hence, the order of a function in the object file (and the listing file too) will be affected accordingly.  The new object code needs to be multiple blocks with a zero to indicate the end of code.

magic
[ start end (op arg)* ]*   -- code segment
0
start end data*            -- data segment
symtab

How a function is processed

loadfile
  parse
  showcode
  cvfun
  runimm

parse
  parse source and gencode one-by-one function
  at the end of a function
    if no forward ref then add to funlis
    else add to ufunlis

gencode a function
  genex
  patchbreak
  tailcall
  rename
  improve

cvfun
  convert all functions in funlis
  convert do
    if iccall process retv
    change op
    change arg
  
runim
  patchcalli
  genex
  convert
  eval

I will not consider runim.

step of work

1  convert one function at a time (do not use funlis)
2  cvx can be rolled into convert
3  rename and showcode can be done at convert
4  reloc may be difficult to roll into convert as it changes the content of M[] for static array and ads.

24 Aug 2007

What has been done?

1  gencode and convert one function at a time with output to the listing.  A defer function is kept on ufunlis to be converted at "patchcalli" at the end of a source file when runim (just before execution). No collecting convertible functions into funlis.  This is all done at "clean".

2  rename will also collect jplis, the list of jump addresses (previously done separately in "getJmp" to jvec[]). 

3  cvx is rolled into convert.  The most change is in "convert". It has been simplified and clarified, instead of "checking" the precondition TF, it "sets" the TF. Most codes use a look up table (TF_state) to set TF to the state after the instruction is executed. Only the special codes need processing.

Conclusion

The intension to "merge" rename, convert and reloc into one loop is not carried out.  I feel that the code will be unreadable.  The gain in speed from reducing the overhead of scanning the code is not much.  So, the "improvement" in this version is mostly cleaning up the "convert" and tidy up the code here and there.

Benchmark

using som-v3 som.txt (som-compiler) as a benchmark program.  Use
> som31 (-c som30.obj) som.txt

noi is 4998307

compare to what?  Compare to the object code version just before the public release (som-v16\som-v16s\som3\som30.obj  compiled on 21 Aug 2007 13:15).  This version is a correct working code for the vm som16s which becomes som v3.1 vm.  The som-compiler source has not been "improved" beyond just making it work with the new vm.  So, it should be a good comparison between a plain compiler and an improved version for the same vm.  (the object code 21 Aug has a different format of the symbol table. I modified som31 to accommodate that in order to run the experiment).

noi is 5174640
 
speed up (1- noi_plain/noi_improve) 3.5%

using the latest som-compiler (som-v31\som\) as the benchmark

plain       noi 5398190
improved    noi 5197696
speedup     3.9%

the size of the executable codes of compilers are also interesting:

plain      CS  5953   DS  1153
improved   CS  5578   DS  1229

the improved version has smaller object code but uses larger data (the static array, mostly table look up).

the latest compiled (no full macro) is noi 5093339

25 Aug 2007

Macro

Now it is time to come back to implement a full macro.  After a careful inspection of how macro work, I found that the meaning of a full macro is different from a normal macro.  Consider this example of swapping two variables (as in bubble sort):

to swap ar i j | t =
  t = ar[i]
  ar[i] = ar[j]
  ar[j] = t

it is probably called from:

  swap ax a b 

If swap is a full macro it will work as expect. However if it is a normal macro it will not work at all because there is no where for the local "t" to participate in substitution of variables.  A normal macro "swap" (as written above) will be expanded (incorrectly) to:

  ? = ax[a]
  ax[a] = ax[b]
  ax[b] = ?

where as if "swap" is written as a normal macro, it will be:

: swap i j t =
  t = i
  i = j
  j = t

If it is called with a supply of variables as:

  swap ax[a] ax[b] c

and by "textual" substitution (which is how a normal macro is expanded), the expanded expression will be:

  c = ax[a]
  ax[a] = ax[b]
  ax[b] = c

which will work correctly.

The meaning of a normal macro is therefore not the same as an expression in the language.  A full macro is supposed to be the "inline" of the expression. The meaning of a full macro is the same as the expression in the language.  So, the meaning of a full macro and a normal macro are different. It is not good to have a syntax construction that has two meanings in the same language.  I have to choose only one.  A normal macro is simpler to compile and very powerful. It is also proved to be much more useful (see the som-compiler itself never use a full macro).  Therefore I choose to adopt a normal macro and discard a full macro from the language.

A normal macro is similar to a macro in C language as it is a textual substitution.  I have not yet pondered their difference in full.  A macro in C has been considered by many language designers as "not recommend".  An expression written in a macro in C is different from a normal expression and can caused a subtle bug if it is used carelessly.  

26 Aug 2007

