S2 version 3  Assembly language

Meta commands

Three sections must be present, in this order: symbol definitions, code section, data section. Each section starts with a meta command: .symbol, .code, .data. The end of the assembly file is denoted by .end.

.symbol
...
.code 0
...
.data 100
...
.end

Comment

A ';' starts a comment that runs to the end of the current line. Comments are not interpreted by the assembler.

Symbol definition

In the symbol definition section, symbols are defined with their associated values.

.symbol
  print 1
  base  100
  ...
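
The symbol section can be read into a table with a few lines of code. This is only a sketch, not the assembler's actual code: the function name parseSymbols is hypothetical, and it assumes each line holds one name followed by one decimal integer.

```javascript
// Build a symbol table from the lines of a .symbol section.
// Assumption: each non-empty line is "<name> <decimal integer>".
function parseSymbols(lines) {
  const table = new Map();
  for (const raw of lines) {
    const line = raw.split(';')[0].trim(); // drop comment and padding
    if (line === '') continue;
    const [name, value] = line.split(/\s+/);
    table.set(name, parseInt(value, 10));
  }
  return table;
}

const symbols = parseSymbols(['  print 1', '  base  100']);
console.log(symbols.get('base')); // 100
```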

Code section

The code section contains the assembly instructions of the program. Labels can be defined in the code section and referred to by assembly instructions. Labels are used as destinations of jumps or as symbolic names in the data section. A label is prefixed with ":". An assembly instruction is written as shown in the instruction set definition.

Addressing mode

For each instruction, the operands are written with a prefix that identifies the addressing mode. This keeps the assembler simple.

ld r1 @10 r2      displacement  M[10+R[r2]] 
ld r1 +r2 r3      index   M[R[r2]+R[r3]]
st r1 @2 r2
st r1 +r2 r3
add r1 r2 r3
add r1 r2 #20     constant value is prefixed by #
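
The prefix rule can be sketched as a small function that looks at the first character of an operand. The function name classifyOperand and the mode names it returns are hypothetical labels chosen for this illustration. They are not taken from the assembler.

```javascript
// Classify one operand token by its prefix.
// The mode names are illustrative labels, not assembler terms.
function classifyOperand(token) {
  switch (token[0]) {
    case '@': return { mode: 'displacement', text: token.slice(1) };
    case '+': return { mode: 'index', text: token.slice(1) };
    case '#': return { mode: 'constant', text: token.slice(1) };
    default:  return { mode: 'plain', text: token }; // register, label or symbol
  }
}

for (const tok of 'ld r1 @10 r2'.split(' ').slice(1)) {
  console.log(tok, classifyOperand(tok).mode);
}
// r1 plain
// @10 displacement
// r2 plain
```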


The assembler does not check for every illegal combination of opcode, addressing mode and operands.  Here is an example:
.code   0
        mov r1 #0
        mov r2 #1
:loop   le r3 r2 #10     ;  while i <= 10
        jf r3 exit
        add r1 r1 r2     ;    s = s + i
        add r2 r2 #1     ;    i++
        jmp loop
:exit   trap r0 #0
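
For comparison, the same loop can be written in JavaScript, following the comments in the assembly code (s is kept in r1, i in r2). The final console.log is added here only to show the result. The assembly version above does not print.

```javascript
let s = 0;          // mov r1 #0
let i = 1;          // mov r2 #1
while (i <= 10) {   // le r3 r2 #10  and  jf r3 exit
  s = s + i;        // add r1 r1 r2
  i++;              // add r2 r2 #1
}                   // jmp loop
console.log(s);     // 55
```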

Data section

The data section defines constant values.

.data 100
  11 22 33 44 0

S2 instruction format

S2 version 2 has three instruction formats. This version (3) simplifies the format by using a flat encoding (no extended op). This limits the address range a bit but makes the instruction encoding easier to understand.

    op:6 r1:5 r2:5 d:16

where d holds one of {r3, ads, disp}.
disp is sign extended to 32 bits.
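
The format can be illustrated by packing and unpacking one 32-bit word. This is a sketch under an assumption: the fields are taken to be laid out from the most significant bit in the order written (op in bits 31..26, r1 in 25..21, r2 in 20..16, d in 15..0). The function names encode and decode are hypothetical, and the field values in the example are arbitrary.

```javascript
// Pack the four fields into one 32-bit word.
// Assumed bit positions: op 31..26, r1 25..21, r2 20..16, d 15..0.
function encode(op, r1, r2, d) {
  return (((op & 0x3f) << 26) | ((r1 & 0x1f) << 21) |
          ((r2 & 0x1f) << 16) | (d & 0xffff)) >>> 0;
}

// Unpack a word; d is sign extended from 16 bits to 32 bits.
function decode(word) {
  return {
    op: word >>> 26,
    r1: (word >>> 21) & 0x1f,
    r2: (word >>> 16) & 0x1f,
    d: (word << 16) >> 16,
  };
}

const word = encode(1, 1, 2, -4);
console.log(decode(word)); // { op: 1, r1: 1, r2: 2, d: -4 }
```

Because d has only 16 bits, a displacement must fit in the range -32768..32767.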

The assembler works in one pass (it reads the source only once), in two phases:
  phase1
    input scanning, collect symbols
  phase2
    resolve forward references
    generate object code from the token list
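
The two phases can be sketched for labels only. This is an illustration, not the assembler's actual code: the function name resolveLabels is hypothetical, and it assumes that every instruction occupies one address and that label names do not clash with mnemonics or register names.

```javascript
// Phase 1: scan the code lines once, record the address of each label and
// keep the instructions as token lists.
// Phase 2: replace label names in the token lists by their addresses.
// Assumption: every instruction occupies one address.
function resolveLabels(lines, start) {
  const labels = new Map();
  const code = [];
  for (const raw of lines) {                      // phase 1
    const tokens = raw.split(';')[0].trim().split(/\s+/).filter(t => t !== '');
    if (tokens.length > 0 && tokens[0].startsWith(':')) {
      labels.set(tokens.shift().slice(1), start + code.length);
    }
    if (tokens.length > 0) code.push(tokens);
  }
  return code.map(tokens =>                       // phase 2
    tokens.map(t => (labels.has(t) ? labels.get(t) : t)));
}

const out = resolveLabels([
  '        jf r3 exit',
  ':loop   add r2 r2 #1',
  '        jmp loop',
  ':exit   trap r0 #0',
], 0);
console.log(out[0]); // [ 'jf', 'r3', 3 ]
console.log(out[2]); // [ 'jmp', 1 ]
```

The reference to exit in the first line is a forward reference: its address is not known until the whole source has been scanned, so it can only be filled in during phase 2.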

Symbol table

There are a number of predefined symbols, such as opcode mnemonics (add...) and register names (r0..r31). The assembler generates its output as a JavaScript file to be loaded by the S23 simulator, which is written in JavaScript.
The output file is suitable for the loader of the simulator.  Here is the format:

var obj = [
ads, len,         ; code segment
op, a1, a2, a3,   ; each instruction
...
ads, len,         ; data segment
num,              ; each 32-bit data
num,
0                 ; end of object
];
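
A loader for this layout can be sketched as follows. This is not the simulator's loader: the function name loadObject is hypothetical, and it assumes exactly one code segment followed by one data segment and a final 0, as in the layout above.

```javascript
// Split an object array into its code and data segments.
// Assumption: one code segment, then one data segment, then a final 0.
function loadObject(obj) {
  let p = 0;
  const codeAds = obj[p++], codeLen = obj[p++];
  const code = [];
  for (let i = 0; i < codeLen; i++, p += 4) {
    code.push(obj.slice(p, p + 4));          // [op, a1, a2, a3]
  }
  const dataAds = obj[p++], dataLen = obj[p++];
  const data = obj.slice(p, p + dataLen);
  p += dataLen;
  if (obj[p] !== 0) throw new Error('missing end-of-object marker');
  return { codeAds, code, dataAds, data };
}

// A short array in the same shape: two instructions and one data word.
const image = loadObject([
  0, 2,
  15, 1, 0, 0,
  51, 0, 0, 0,
  100, 1,
  0,
  0]);
console.log(image.code.length, image.dataAds, image.data); // 2 100 [ 0 ]
```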


Here is a real example. Assembling this program (add10-s.txt) writes the output to the file add10-s.obj.
; a simple example of assembly language programming
;    add one to ten
.symbol
    print 1
.code   0
    mov r1 #0
    mov r2 #1
:loop  
    le r3 r2 #10     ;  while i <= 10
    jf r3 exit
    add r1 r1 r2     ;      s = s + i
    add r2 r2 #1     ;      i = i + 1
    jmp loop
:exit  
    trap r1 #print   ;  print s
    trap r0 #0
.data 100
    0                ;  dummy
.end
c:> as23 add10-s.txt

The file add10-s.obj then contains:
var obj =[  // add10-s.obj
0, 9,
15, 1, 0, 0,
15, 2, 0, 1,
20, 3, 2, 10,
9, 3, 0, 7,
32, 1, 1, 2,
10, 2, 2, 1,
9, 0, 0, 2,
51, 1, 0, 1,
51, 0, 0, 0,
100, 1,
0,
0];

last update  13 Jan 2013