som v1.5 revision 2

Retain the simplicity of som v1 and move flow analysis and code optimisation into a separate module.

Compiler

som v1.5 needs to do source-level optimisation, such as macro expansion.  It outputs the s-code core (the minimum s-code).  som v1.5 does the following:

1   eliminate callt
2   macro expansion (source-level, cannot be done elsewhere).

The compiler will not do peephole optimisation such as combination jumps (no jle).

Optimiser

The next module, called "som optimisation" or "somx", does code optimisation at the level of s-code plus s-extension (sx).  s-code -> s-code + sx

1   flow analysis to do ret retv
2   static array 
3   short cut jumps
4   customised for individual project (chip)

The customisations are numerous, such as:

1   combination jump (jle)
2   addi, addv
3   strength reduction
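
Of the optimiser tasks listed above, short cut jumps is the easiest to illustrate.  The following is only a sketch under assumptions that are not part of som: s-code is held as a list of (opcode, argument) tuples, jump arguments are absolute instruction indexes, and the function name shortcut_jumps is hypothetical.

```python
# Hypothetical s-code form: a list of (opcode, argument) tuples.
# Assumption: jump arguments are absolute instruction indexes.

def shortcut_jumps(code):
    """Retarget every jump that lands on an unconditional jmp."""
    out = []
    for op, arg in code:
        if op in ("jmp", "jt", "jf"):
            seen = set()
            # Follow the chain of unconditional jumps; 'seen' guards against cycles.
            while code[arg][0] == "jmp" and arg not in seen:
                seen.add(arg)
                arg = code[arg][1]
        out.append((op, arg))
    return out


code = [
    ("ld", 1),    # 0
    ("jf", 3),    # 1: lands on a jmp
    ("lit", 7),   # 2
    ("jmp", 5),   # 3: lands on another jmp
    ("lit", 8),   # 4
    ("jmp", 6),   # 5
    ("ret", 0),   # 6
]
short = shortcut_jumps(code)
```

The pass only changes jump targets; it does not delete the jumps that become unreachable, so instruction indexes stay valid.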

Strength reduction is powerful for structure access.  For example, a[0] compiles to "ads.a lit.0 ldx", which reduces to "ld.#a" when #a, the content (base address) of "a", is known at compile time.  This module can therefore be easily customised for each chip.
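
The a[0] reduction can be sketched as a three-instruction peephole pass.  The tuple form of s-code, the base table and the function name reduce_static_access are assumptions made for illustration.  Extending the rule from lit.0 to any constant index k (giving ld.(#a+k)) is also an assumption, not something stated above.

```python
# Hypothetical s-code form: a list of (opcode, argument) tuples.

def reduce_static_access(code, base):
    """Rewrite 'ads a; lit k; ldx' as one 'ld' when the base of a is known.

    base maps an array name to its base address (#a in the text).
    """
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if (len(window) == 3
                and window[0][0] == "ads" and window[0][1] in base
                and window[1][0] == "lit"
                and window[2][0] == "ldx"):
            # Base address plus constant index is known at compile time.
            out.append(("ld", base[window[0][1]] + window[1][1]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out


code = [
    ("ads", "a"), ("lit", 0), ("ldx", None),   # a[0], base of a known
    ("ads", "b"), ("lit", 2), ("ldx", None),   # b[2], base of b unknown
]
reduced = reduce_static_access(code, {"a": 100})
```

Because the pass shortens the code, a real implementation would also have to adjust any jump targets that cross a rewritten sequence; the sketch leaves that out.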

Code generation

The module after the optimiser is the code generator: s-code + sx ->  actual machine code.  The machine code has a different encoding for each chip, and some chips have different instructions (such as multibit-shift, crc).  Special instructions can be "included" in the s-code at the code optimisation module by creating a function; then, at code generation time, the "call fx" is replaced by the special instruction "fx" at machine code level (call -> primitive).
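
A minimal sketch of the call -> primitive replacement, under stated assumptions: the opcode numbers, the OPCODE and PRIMITIVES tables, the entry table and the function name generate are all hypothetical, and a call is assumed to carry the function name as its argument.

```python
# Hypothetical opcode numbers for one chip; real encodings differ per chip.
OPCODE = {"lit": 1, "ld": 2, "call": 3, "ret": 4}

# Functions that this (hypothetical) chip implements as single instructions.
PRIMITIVES = {"crc": 40}


def generate(code, entry, primitives):
    """Translate (opcode, argument) s-code into (machine opcode, argument)."""
    machine = []
    for op, arg in code:
        if op == "call" and arg in primitives:
            machine.append((primitives[arg], 0))           # call -> primitive
        elif op == "call":
            machine.append((OPCODE["call"], entry[arg]))   # ordinary call
        else:
            machine.append((OPCODE[op], arg))
    return machine


code = [("lit", 5), ("call", "crc"), ("call", "main"), ("ret", 0)]
machine = generate(code, {"main": 20}, PRIMITIVES)
```

A chip without the crc instruction would pass an empty primitives table and supply an entry address for "crc" instead, so the same s-code serves both targets.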

Chip simulator

The last module is the chip simulator, which accepts machine code from the code generator, then decodes and executes it.  As this module contains a decoder, it can generate a listing of the machine code to check the correctness of both the code and the decoder.  The functions shared between different chips are loadobj(), loadSymTab() and the s-core + sx header (icXXX); most of prCode() will also be quite similar.

5 Mar 2005

Work left to be done

1  Relocation of DS and string

Relocation of DS is simple, as the address is part of the ld/st instruction.  However, the dynamic allocation of string constants is still not clear.  A string space is allocated dynamically (see dostring() ).  Currently, the constant pointer to a string is generated as a "lit".  Perhaps a new instruction, "sc" (similar to "ads"), is required to indicate a string.
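
To make the difference concrete, here is a sketch of DS relocation.  It assumes that s-code is a list of (opcode, argument) tuples and that ld/st arguments are offsets from the start of DS; the function name relocate_ds is hypothetical.

```python
# Hypothetical s-code form: a list of (opcode, argument) tuples.
# Assumption: ld/st arguments are offsets from the start of DS.

def relocate_ds(code, ds_start):
    """Add ds_start to the address field of every ld/st instruction."""
    return [(op, arg + ds_start) if op in ("ld", "st") else (op, arg)
            for op, arg in code]


code = [("ld", 3), ("lit", 3), ("st", 4)]
moved = relocate_ds(code, 1000)
```

The "lit" is left untouched because nothing marks it as an address.  A string pointer generated as a "lit" would be skipped in the same way, which is the case a separate "sc" instruction would let the relocator recognise.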

2   static array 

Snapshot replaces the immediate line.  The allocation of a static array (an array whose size is known at compile time) is simple, and its access function can be strength reduced.  "ads" is required at this stage for relocation in the code generation phase.  Another kind of static array is the array of constants, such as a lookup table (for example, in the AES block cipher).  It is worth considering how to make a nice syntax for it.  The current experiment is:

a = array { 1 2 3 4 }

This is very nice, but it depends on the element size (an integer).  Should the language be made size independent?

3   layout of object file

There are several objects to be included in the object file.  Currently, there are the executable object and the symbol table.  Other necessary information is:

3.1  the start of DS and its size
3.2  constant array
3.3  string constant

A new layout needs to be designed.

6 Mar 2005



