v5 compiler code optimisation


The following rules are used in v5 compiler:

1.  jmp to ret -> ret
2.  jc to jmp.y -> jc.y

macro and (and (and x y) z)

3.  jc to mov v #0, jf.y v -> jc.y
4.  jc to mov v #0, jt.y v, $ -> jc.$

macro or

5.  mov v #1, jmp to jf.y,$ -> jmp.$
6.  mov v x, jmp to jf.y,$ -> jf.y x, jmp.$

7.  lop v, jmp to jf.y v, $ -> jinvlop.y, jmp.$

These rules proved to be crucial in making "queen2" benchmark as fast as it is now.  A general observation that lead to these rules are discussed below.

Observing the code generator performing expansion of macro and/or, one can notice that it leaves a lot of room for code optimisation.

macro and

and (and x y) z

   jf @1 x   <1> -> jf @3 rule 1
   mov rv y  <2> -> jf @3 y, jmp @$ rule 6
   jmp @2
@1 mov rv #0 <3> -> jmp @3 rule 2
@2 jf @3 rv
@$ mov rv z
   jmp @4
@3 mov rv #0
@4 ret rv

rule 1:  #0, jf -> nil
  jc to mov v #0, jf @1 v -> jc @1

rule 2:  #0, jf -> jmp
  mov v #0, jf @1 v -> jmp @1

------------------------------
and x (and y z)

   jf @1 x
   jf @2 y
   mov rv z
   jmp @3    <1> -> jmp @4 rule 3
@2 mov rv #0
@3 jmp @4
@1 mov rv #0
@4 ret rv

rule 3:  jc to jmp -> jc
  jc to jmp @1 -> jc @1

-----------------------------
and (or x y) z

   jf @1 x
   mov rv #1 <1> -> jmp @$ rule 4
   jmp @2
@1 mov rv y 
@2 jf @3 rv
@$ mov rv z
   jmp @4
@3 mov rv #0
@4 ret rv

rule 4: #1, jf -> nil
  mov v #1, jmp to jf v, $ -> jmp @$

-----------------------------
and x (or y z)

   jf @1 x
   jf @2 y
   mov rv #1
   jmp @3    <1> ->  jmp @4 rule 3
@2 mov rv z
@3 jmp @4
@1 mov rv #0
@4 ret rv

similar to (and x (and y z))

----------------------------

macro or

or (or x y) z

   jf @1 x
   mov rv #1 <1> -> jmp @$ rule 5
   jmp @2
@1 mov rv y
@2 jf @3 rv
@$ mov rv #1
   jmp @4
@3 mov rv z
@4 ret rv

rule 5: #1, jf -> jmp
  mov v #1, jmp to jf v,$ -> jmp @$

----------------------------
or x (or y z)

   jf @1 x
   mov rv #1 <1> -> jmp @$ rule 5
   jmp @2
@1 mov rv y
@2 jf @3 rv
@$ mov rv #1
   jmp @4
@3 mov rv z
@4 ret rv

----------------------------
or (and x y) z

   jf @1 x    <1> -> jf @3 rule 1
   mov rv y   <2> -> jf @3 y, jmp @$ rule 6
   jmp @2
@1 mov rv #0
@2 jf @3 rv
@$ mov rv #1
   jmp @4
@3 mov rv z
@4 ret rv

rule 6:
  mov v y, jmp to jc @1 v, $ -> jc @1 y, jmp @$

---------------------------
or x (and y z)

   jf @1 x
   mov rv #1
   jmp @2
@1 jf @3 y
   mov rv z
   jmp @2
@3 mov rv #0
@2 ret rv

nothing to do, already perfect
---------------------------

Summary

Discover 6 rules:

rule 1:  #0, jf -> nil
  jc to mov v #0, jf @1 v -> jc @1
rule 2:  #0, jf -> jmp
  mov v #0, jf @1 v -> jmp @1
rule 3:  jc to jmp -> jc
  jc to jmp @1 -> jc @1
rule 4: #1, jf -> nil
  mov v #1, jmp to jf v, $ -> jmp @$
rule 5: #1, jf -> jmp
  mov v #1, jmp to jf @1 v -> jmp @1
rule 6:
  mov v y, jmp to jc @1 v, $ -> jc @1 y, jmp @$

others in combination with ret called "diffuse ret"

rule 7:
  jmp to ret -> ret
rule 8:
  mov v x, ret v -> ret x
 
rule 8 should be used after rule 7, or rule 8 should be the last round of improv.  rule 2,4,5,6 are not totally safe as "mov v .." changes v that may be used later.  However, they are fairly safe if that v is rv.

15 Oct 2009