Assembly level programming



Why


1  Hardware is one limiting factor on the performance of a computer system.
2  In a resource-bounded system (limited memory, limited power source), such as a mobile device, assembly level programming is necessary to tune the performance.

Simple data path


    ---> Reg ---> alu --->
    |                    |
    |                    |
    <---------------------

           Control unit

Most programmable devices have this simple data path:

Reg  -- registers store the data
alu  -- performs the calculation
control  --  orchestrates the signals according to some "program" to achieve the desired sequence of computation.

You can regard the registers as storing the variables of the program.  The alu does the usual arithmetic and logic operations, such as add and sub.  The program of the data path is a sequence of "instructions" that activate the data path.  Examples of instructions are:

read registers
select alu functions
write registers

example:  adding two numbers  

To perform   A = B + C
let A, B, C be registers.  The program will be:

read B and C (to alu)
alu: add
write output of alu to A
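
The three steps above can be sketched as a small Python model of the data path. This is only an illustration: the function name run_micro_cycle, the dictionary used as a register file, and the sample values in B and C are assumptions, not part of the original notes.

```python
# Hypothetical model of the data path: a register file, an alu, and the
# three micro-steps issued by the control unit.
regs = {"A": 0, "B": 5, "C": 7}   # sample values, chosen arbitrarily

ALU_FUNCS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
}

def run_micro_cycle(dest, src1, src2, func):
    # step 1: read the two source registers (to the alu inputs)
    x, y = regs[src1], regs[src2]
    # step 2: select the alu function and compute
    out = ALU_FUNCS[func](x, y)
    # step 3: write the output of the alu to the destination register
    regs[dest] = out
    return out

run_micro_cycle("A", "B", "C", "add")   # A = B + C
print(regs["A"])
```

Changing the last argument to "sub" selects a different alu function while the read and write steps stay the same, which is the sense in which the control unit "programs" the data path.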

All complex instructions are built in a similar way from simple sequences like this one.

The instruction set of a processor is an abstraction of this "micro-cycle" in the control unit.  The example above can become a single instruction at the assembly level, "add r1 r2 r3", with the meaning r1 = r2 + r3.

Assembly language


We will study assembly level programming by relating a low level language to a high level language.  A high level language has data types plus control flow constructs such as if..then, while and for, which are translated into a low level language.  All high level abstractions, such as data types, are gone when the program is transformed into a low level language.

The instructions in an assembly language can be grouped into 4 types:
1  arithmetic group:  add, sub, and ...
2  control group:  jump, jump conditional, jump and link, jump back...
3  data group:  load, store to/from main memory
4  others that support I/O and operating systems:  trap...

Format of instructions

We choose a 3-address format for our hypothetical instruction set (an instruction has up to 3 operands).  It is suitable as a target for a modern high level language.  The formats are:

op  r1  r2  r3
jmp cond  ads
trap num  r1
ld  r1   ads
st  ads  r1

with the convention:  dest <-  source  such as r1 <- r2 op r3

We will start with the simplest program, an assignment.

1  a = b + c
2  a = b + c - d

Rules of thumb for assembly programming

1  Determine and write down the mapping between variables and registers.  We assume all variables are mapped to registers, that is, there are enough registers for all of them.
2  Sometimes you will need a temporary register to store an intermediate value.

Assignment

example 1  

a = b + c

let a = r1, b = r2, c = r3

p1:   add r1 r2 r3


example 2  

a = b + c - d

one intermediate value is needed.  The program becomes

t = b + c
a = t - d

This is because each instruction has only 3 operands.  In compiler vocabulary, this form is called a "quad", that is, 3 operands plus one operator.  For example, "t = b + c" is the quad "+ t b c".

let a = r1, b = r2, c = r3, d = r4, t = r5

p2:   add r5 r2 r3
      sub r1 r5 r4
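
The two rules of thumb (map variables to registers, introduce a temporary for each intermediate value) can be sketched in Python as follows. The helper names to_quads and to_assembly and the temporary name t1 are hypothetical. The sketch handles only a left-to-right chain of + and - with at least one operator, which is enough for examples 1 and 2.

```python
import re

def to_quads(stmt):
    """Split 'a = b + c - d' into quads (op, dest, src1, src2), left to right."""
    dest, expr = [s.strip() for s in stmt.split("=")]
    tokens = re.findall(r"[A-Za-z_]\w*|[+-]", expr)
    operands, operators = tokens[0::2], tokens[1::2]
    quads = []
    left = operands[0]
    for i, (op, right) in enumerate(zip(operators, operands[1:])):
        last = i == len(operators) - 1
        # the final quad writes the real destination, earlier ones a temporary
        target = dest if last else f"t{i + 1}"
        quads.append((op, target, left, right))
        left = target
    return quads

def to_assembly(quads, regmap):
    """Turn each quad into one 3-address instruction using the register map."""
    mnemonic = {"+": "add", "-": "sub"}
    return [f"{mnemonic[op]} {regmap[d]} {regmap[x]} {regmap[y]}"
            for op, d, x, y in quads]

# same mapping as example 2, with the temporary named t1 instead of t
regmap = {"a": "r1", "b": "r2", "c": "r3", "d": "r4", "t1": "r5"}
quads = to_quads("a = b + c - d")
for line in to_assembly(quads, regmap):
    print(line)
```

With this register map the sketch prints the same two instructions as p2. A real compiler would also handle parentheses, operator precedence and register allocation, which are left out here.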

Control flow

Let c, a, b be expressions.  In a high level language:

if c  then a else b

The strategy to translate this if..then..else is:

    c
    jmp-if-false L1
    a
    jmp exit
L1: b
exit: ...
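
As a rough sketch, the layout above can be produced mechanically once the code for c, a and b is known. The function name emit_if_else and its parameters are assumptions made for illustration. Each of c, a, b is a list of already translated lines, and b must not be empty because the label is attached to its first line.

```python
def emit_if_else(c, a, b, else_label="L1", exit_label="exit"):
    """Lay out 'if c then a else b' using the jump pattern shown above."""
    lines = list(c)                              # code for the condition
    lines.append(f"jmp-if-false {else_label}")   # skip the then-part when c is false
    lines += a                                   # then-part
    lines.append(f"jmp {exit_label}")            # skip over the else-part
    lines.append(f"{else_label}: {b[0]}")        # else-part starts at the label
    lines += b[1:]
    lines.append(f"{exit_label}: ...")           # whatever follows the if statement
    return lines

for line in emit_if_else(["c"], ["a"], ["b"]):
    print(line)
```

Nested if statements would need a fresh pair of labels for each level, which is why the label names are parameters.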

example  3


   if a == 0
     then  b = 1
     else  b = 2

let a = r1, b = r2, t = r3

We need a new addressing mode with an immediate value (a literal).  Let it be:
        op r1 r2 #imm    means   r1 = r2 op #imm
and an instruction to move an immediate value to a register:
        or r1 r0 #imm    means   r1 = r0 or #imm
where r0 always holds the value zero.

The assembly of the above program is:

    sub r3 r1 #0       ;;  test a == 0
    jmp nz L1
    or r2 r0 #1        ;;  b = 1
    jmp exit
L1: or r2 r0 #2        ;;  b = 2
exit: ...
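
To check the listing above, here is a minimal Python interpreter for the few instructions used so far (add, sub, or, jmp). It is a sketch under two assumptions that the notes do not state: "jmp nz" tests the result of the most recent alu operation, and a label on a "..." line marks the end of the program. The names run and run_example are hypothetical.

```python
def run(program, regs):
    """Execute a list of assembly lines; regs maps register names to integers."""
    # first pass: drop comments, record the position of each label
    labels, code = {}, []
    for line in program:
        line = line.split(";;")[0].strip()
        if ":" in line:
            label, line = [s.strip() for s in line.split(":", 1)]
            labels[label] = len(code)
        if line and line != "...":
            code.append(line.split())

    alu = {"add": lambda x, y: x + y,
           "sub": lambda x, y: x - y,
           "or":  lambda x, y: x | y}

    def value(operand):
        # '#n' is an immediate; anything else names a register
        return int(operand[1:]) if operand.startswith("#") else regs[operand]

    pc, last = 0, 0              # last = result of the latest alu operation (assumed)
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "jmp":
            if len(args) == 1:                   # jmp ads
                pc = labels[args[0]]
            elif args[0] == "nz":                # jmp nz ads
                if last != 0:
                    pc = labels[args[1]]
            else:
                raise ValueError(f"unsupported condition: {args[0]}")
        else:                                    # op r1 r2 r3  or  op r1 r2 #imm
            last = alu[op](value(args[1]), value(args[2]))
            if args[0] != "r0":                  # r0 always holds zero
                regs[args[0]] = last
    return regs

program = [
    "    sub r3 r1 #0       ;;  test a == 0",
    "    jmp nz L1",
    "    or r2 r0 #1        ;;  b = 1",
    "    jmp exit",
    "L1: or r2 r0 #2        ;;  b = 2",
    "exit: ...",
]

def run_example(a):
    regs = run(program, {"r0": 0, "r1": a, "r2": 0, "r3": 0})
    return regs["r2"]            # b is mapped to r2

print(run_example(0))   # a == 0, so the then-part runs
print(run_example(5))   # a != 0, so the else-part runs
```

The same function also runs p1 and p2 from the assignment examples, since they use only add and sub.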

Note:  the if..then..else can be considered as a multiplexor:

    a -->
    b -->  MUX --> out
            |
            c

that is, the output is a or b depending on c.

7 Feb 2005
P. Chongstitvatana