Control Flow

topics

assembly <--> HLL

if..then..else
simple function call
moving value between registers
HLL function call
accessing variables (not yet written)

If..Then..Else

Strategy to translate if..then..else

if <cond> then <true-act> else <false-act>

to assembly

<cond> which affect flags
jmp cond-false false
:true
<true-act>
jmp always endif
:false
<false-act>
:endif

Example

a program segment:

a = 3
if a == 3 then b = 1
else b = 2

is translated into:

;; let a, b are global variables, let r1 = a, r2 = b
ld r1 #3
st a r1 ;; a = 3
;; translate condition
ld r1 a
sub r0 r1 #3 ;; a == 3
jmp ne false
;; true-act
ld r2 #1
st b r2
jmp always endif
:false
ld r2 #2
st b r2
:endif

Simple function call

To call a subroutine (a function is called a subroutine in assembly language), two aspects must be considered: flow of control, and passing parameters. Let's examine the flow of control first.

This is an example:

1: add1(x) { return x + 1; }
2: main() {
3:    a = 2;
4:    b = add1(a);
5:    c = ...
    }

add1() is a subroutine. It is called from main() at the line 4. The program jumps to line 1 execute it and returns to line 4 to complete it (assign the return value to b) and continue with line 5.

The program counter (pc) which keeps track of the current instruction to be executed, must be saved upon jumping to a subroutine and must be restored upon returning from a subroutine. See what happen when an instruction being fetched from memory and executed: (:: means it is the operation of the processor, called, micro-step, or sometimes called Register-Transfer-Language (RTL))

fetch an intruction from memory:

:: ir = Mem[pc]
// fetch an instruction from memory pointed by pc to
// the instruction register
:: pc = pc + 1
// advance the program counter to the next instruction

Now that the instruction sits in the instruction register, the control unit decodes it and executes it. Note that the program counter now points to the next instruction.

To call a subroutine the instruction is:

jal rx ads

this is called "jump and link" instruction. The action is as follows, pc is saved into the link register "rx" (x can be any register, 1..31) and then "jump" to ads which is the beginning of the subroutine.

:: rx = pc // saved pc (next instruction)
:: pc = ads // jump to ads

At the subroutine, once it is finished, to return to the "caller" (the program that called the subroutine, the subroutine itself is called "callee"), the instruction is:

jr rx

this is called "jump register". This instruction restores pc from rx which affect the processor to "jump" back to the caller.

:: pc = rx

Here is how the above example is translated:

;; let r4 = x, let r5 = retvalue, let r31 be link reg.
:add1
    add r4 r3 #1   // return x + 1
    jr r31
...
;; let a,b,c are global variables
;; let r1 = a, r2 = b, r3 = c, let r31 be link reg.
:main
    ld r1 #2
    st a r1
    or r4 r1 r0    ;; pass parameter x = a (1)
    jal r31 add1   ;; call add1(a)
    or r2 r5 r0    ;; b = retvalue
    ...

moving value between registers

(1) moving a value between registers can be done several ways using r0 (because r0 always is 0), for example: r1 <- r2

1) or r1 r2 r0
2) add r1 r2 r0
3) xor r1 r2 r0

(Can you think of other way?)

Passing parameters is done by arranging the value to be passed in register(s). This must be arranged so that all registers are not collided. In this simple call, we arrange registers so that caller and callee use different registers to prevent the value from corruption by the callee. The caller (main) used r1 r2 r3, the callee (add1) used r4, r5. The link register is common, r31.

A more complex call will required saving and restoring registers to a data structure called "stack", because it is not possible to arrange the disjoint set of registers between caller and callee. In fact, a subroutine can be called from any program segment, it is not possible to know what registers are being used by the caller. So, arranging the subroutine to use a different set of registers from the caller is almost impossible (a clever compiler can do it, however).

<the following discussion is a bit above our level>
The act of saving and restoring registers to and from stack can be done both by the caller or by the callee. The "caller save" is done by the caller saving registers that it used to make sure they are saved from the callee. If it knows what registers the callee used it can save only those that are in collision. The "callee save" is done by the callee saving registers that it itself used. There is no way to assume what registers that caller used because it does not know who is the caller at the compile time (well, a compiler may do this). There are numerous number of research in the past studying both methods, which method is better depends on the circumstances.

HLL function call

The above explanation of a calling sequence is simplified. The actual translation from HLL to assembly language is a bit more complicate by the fact that when the call is occured, it creates a data structure called "activation record" to support local variables storage. An activation record is a data structure and keeps all local variables and the computation state. AR is created when the call occurs and is destroyed when the call finished (returning to the caller). So, it has the behavior of Last-In-First-Out data structure. AR is a dynamic data structure, it changes during the run-time. To keep track of AR, one special pointer is used FP (frame pointer). FP points to the current frame which also stored the previous FP'. The computation state includes: FP, SP, pc.

(to be continue)

July 2003

stack data structure
to store link register when nested call or recursively.
saving the registers (locals of function)

Stack

push(x) :
add sp sp #1
st @0 sp x
x = pop() :
ld x @0 sp
sub sp sp #1

when multiple values are pushed into a stack, the compiler can use displacement to give an offset to the stack pointer. The stack pointer can be adjusted at the end of the sequence:

st @1 sp first
st @2 sp second
st @3 sp third
add sp sp #3

and similary for poping multiple values:

ld third @0 sp
ld second @-1 sp
ld first @-2 sp
sub sp sp #3

Please note that the offset of sp started from 1 when push and 0 when pop due to asymmetric of two operations in terms of the initial position of sp, and the order of operands are reversed.

Callee-save

The subroutine takes responsible in saving and restoring link register and all registers local to it in order not to interfere with values of the caller.

:subroutine
push link
push local1
....
do body of subroutine
...
pop localx
pop link
jr link

follwing this discipline, the recursive call can be achieved without any additional action.

.s
sp 5
link 4
a 100
.a 0
.c
:main
ld sp #200 ;; initialise sp
ld r2 #3
jal link fac
st a r3
trap stop r0
:fac
;; let t1 = n to do
;; n * fac n - 1 is
;; t1 = n
;; n = n - 1
;; retvalue = call fac
;; retvalue = t1 * retvalue
;; save locals : r6 = t1
st @1 sp link ;; push link
st @2 sp r6 ;; push t1
add sp sp #2
sub r0 r2 r0 ;; n <= 0 ?
jmp GT else
ld r3 #1
jmp always ret
:else
or r6 r2 r0 ;; t1 = n
sub r2 r2 #1 ;; n - 1
jal link fac
mul r3 r6 r3 ;; n * fac n - 1
:ret
;; restore
ld r6 @0 sp ;; pop t1
ld link @-1 sp ;; pop link
sub sp sp #2
jr link
.e

(To elaborate more on how an expression can be transformed into sequence of "quad" instructions.
A quad is "op r1 r2 r3". )

29 August 2003