v5 compiler code optimisation
The following rules are used in v5 compiler:
1. jmp to ret -> ret
2. jc to jmp.y -> jc.y
macro and (and (and x y) z)
3. jc to mov v #0, jf.y v
-> jc.y
4. jc to mov v #0, jt.y v,
$ -> jc.$
macro or
5. mov v #1, jmp to jf.y,$
-> jmp.$
6. mov v x, jmp to jf.y,$
-> jf.y x, jmp.$
7. lop v, jmp to jf.y v, $
-> jinvlop.y, jmp.$
These rules proved to be crucial in making "queen2" benchmark as fast
as it is now. A general observation that lead to these rules are
discussed below.
Observing the code generator performing expansion of macro and/or, one
can notice that it leaves a lot of room for code optimisation.
macro and
and (and x y) z
jf @1 x
<1> -> jf @3 rule 1
mov rv y
<2> -> jf @3 y, jmp @$ rule 6
jmp @2
@1 mov rv #0 <3> -> jmp
@3 rule 2
@2 jf @3 rv
@$ mov rv z
jmp @4
@3 mov rv #0
@4 ret rv
rule 1: #0, jf -> nil
jc to mov v #0, jf @1 v
-> jc @1
rule 2: #0, jf -> jmp
mov v #0, jf @1 v ->
jmp @1
------------------------------
and x (and y z)
jf @1 x
jf @2 y
mov rv z
jmp
@3 <1> -> jmp @4 rule 3
@2 mov rv #0
@3 jmp @4
@1 mov rv #0
@4 ret rv
rule 3: jc to jmp -> jc
jc to jmp @1 -> jc @1
-----------------------------
and (or x y) z
jf @1 x
mov rv #1 <1>
-> jmp @$ rule 4
jmp @2
@1 mov rv y
@2 jf @3 rv
@$ mov rv z
jmp @4
@3 mov rv #0
@4 ret rv
rule 4: #1, jf -> nil
mov v #1, jmp to jf v, $
-> jmp @$
-----------------------------
and x (or y z)
jf @1 x
jf @2 y
mov rv #1
jmp
@3 <1> -> jmp @4 rule 3
@2 mov rv z
@3 jmp @4
@1 mov rv #0
@4 ret rv
similar to (and x (and y z))
----------------------------
macro or
or (or x y) z
jf @1 x
mov rv #1 <1>
-> jmp @$ rule 5
jmp @2
@1 mov rv y
@2 jf @3 rv
@$ mov rv #1
jmp @4
@3 mov rv z
@4 ret rv
rule 5: #1, jf -> jmp
mov v #1, jmp to jf v,$
-> jmp @$
----------------------------
or x (or y z)
jf @1 x
mov rv #1 <1>
-> jmp @$ rule 5
jmp @2
@1 mov rv y
@2 jf @3 rv
@$ mov rv #1
jmp @4
@3 mov rv z
@4 ret rv
----------------------------
or (and x y) z
jf @1
x <1> -> jf @3 rule 1
mov rv y
<2> -> jf @3 y, jmp @$ rule 6
jmp @2
@1 mov rv #0
@2 jf @3 rv
@$ mov rv #1
jmp @4
@3 mov rv z
@4 ret rv
rule 6:
mov v y, jmp to jc @1 v, $
-> jc @1 y, jmp @$
---------------------------
or x (and y z)
jf @1 x
mov rv #1
jmp @2
@1 jf @3 y
mov rv z
jmp @2
@3 mov rv #0
@2 ret rv
nothing to do, already perfect
---------------------------
Summary
Discover 6 rules:
rule
1: #0, jf -> nil
jc to mov v #0, jf @1 v
-> jc @1
rule 2: #0, jf -> jmp
mov v #0, jf @1 v ->
jmp @1
rule 3: jc to jmp -> jc
jc to jmp @1 -> jc @1
rule 4: #1, jf -> nil
mov v #1, jmp to jf v, $
-> jmp @$
rule 5: #1, jf -> jmp
mov v #1, jmp to jf @1 v
-> jmp @1
rule 6:
mov v y, jmp to jc @1 v, $
-> jc @1 y, jmp @$
others in combination with ret called "diffuse ret"
rule
7:
jmp to ret -> ret
rule 8:
mov v x, ret v -> ret x
rule 8 should be used after rule 7, or rule 8 should be the last round
of improv. rule 2,4,5,6 are not totally safe as "mov v .."
changes v that may be used later. However, they are fairly safe
if that v is rv.
15 Oct 2009