S2 version 3  Assembly language

Meta commands

An assembly file has three sections, which must appear in this order: symbol definitions, code, and data. Each section starts with a meta command: .symbol, .code, or .data. The end of the assembly file is marked by .end.

.symbol
...
.code 0
...
.data 100
...
.end
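
As an illustration of the layout above, the following Python sketch groups the lines of a source file by the meta command that opens them. It is not the S2 assembler's own code: the function name split_sections and the returned dictionary shape are assumptions made for this example, and it treats the text after .code or .data simply as an uninterpreted argument.

```python
def split_sections(source):
    """Group the lines of an assembly file by the meta command that opens them.

    Hypothetical helper for illustration; not part of the S2 tools.
    """
    sections = {}   # meta command -> {"arg": text after the command, "lines": [...]}
    current = None
    for raw in source.splitlines():
        # A ';' starts a comment that runs to the end of the line.
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        if line.startswith("."):
            parts = line.split()
            if parts[0] == ".end":
                break                      # nothing after .end is read
            current = {"arg": parts[1] if len(parts) > 1 else None, "lines": []}
            sections[parts[0]] = current
        elif current is not None:
            current["lines"].append(line)
    return sections
```

Calling split_sections on a file with the skeleton above would yield the keys .symbol, .code and .data in that order, with "0" and "100" kept as the arguments of .code and .data.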

Comment

A ';' starts a comment that runs to the end of the current line. Comments are not interpreted by the assembler.

Symbol definition

In the symbol definition section, each symbol is defined together with its associated value.

.symbol
  print 1
  base  100
  ...
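
A minimal Python sketch of how such definitions could be collected into a table is shown here. It assumes one "name value" pair per line with a decimal value; the function name parse_symbols is hypothetical and the real assembler may accept other forms.

```python
def parse_symbols(lines):
    """Build a name -> value table from '.symbol' section lines.

    Assumption: each non-empty line is 'name value' with a decimal value.
    """
    table = {}
    for line in lines:
        fields = line.split(";", 1)[0].split()   # ignore comments
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"expected 'name value', got: {line!r}")
        name, value = fields
        table[name] = int(value)
    return table
```

With the two definitions shown above, the table would map print to 1 and base to 100.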

Code section

The code section contains the assembly instructions. Labels can be defined in the code section and referred to by assembly instructions. Labels are used as jump destinations or as symbolic names in the data section. A label is prefixed with ":". An assembly instruction is written as shown in the instruction set definition.

Addressing mode

To keep the assembler simple, operands are written with a prefix that identifies the addressing mode.

ld r1 @10 r2      displacement  M[10+R[r2]] 
ld r1 +r2 r3      index   M[R[r2]+R[r3]]
st r1 @2 r2
st r1 +r2 r3
add r1 r2 r3
add r1 r2 #20     a constant value is prefixed by #
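
The prefix convention can be sketched in Python as a small classifier. This only illustrates why the prefixes make the assembler simple (one character decides the mode); the function name classify_operand and the kind names it returns are assumptions for this example.

```python
def classify_operand(token):
    """Return (kind, text) for one operand token, decided by its prefix.

    Hypothetical helper; the kind names are chosen for this example.
    """
    if token.startswith("#"):
        return ("constant", token[1:])       # e.g. #20
    if token.startswith("@"):
        return ("displacement", token[1:])   # e.g. @10 in 'ld r1 @10 r2'
    if token.startswith("+"):
        return ("index", token[1:])          # e.g. +r2 in 'ld r1 +r2 r3'
    return ("plain", token)                  # register name, label or symbol
```

For "ld r1 @10 r2" the second operand classifies as a displacement of 10, while r1 and r2 come back as plain tokens.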


The assembler does not check for every illegal combination of opcode, addressing mode, and operands.

Here is an example of a code section that sums the integers 1 to 10:
.code   0
        mov r1 #0
        mov r2 #1
:loop   le r3 r2 #10     ;  while i <= 10
        jf r3 exit
        add r1 r1 r2     ;    s = s + i
        add r2 r2 #1     ;    i++
        jmp loop
:exit   trap r0 #0
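
For comparison, the same loop can be written in Python, following the comments in the listing. The mapping in the comments assumes that le stores the result of the comparison in r3 and that jf jumps when r3 is false; the function name sum_to_ten is made up for this example.

```python
def sum_to_ten():
    """Python rendering of the loop above (assumed instruction meanings)."""
    s = 0              # mov r1 #0
    i = 1              # mov r2 #1
    while i <= 10:     # :loop  le r3 r2 #10 / jf r3 exit
        s = s + i      # add r1 r1 r2
        i = i + 1      # add r2 r2 #1
    return s           # :exit  (r1 holds s at this point)
```

Under these assumptions r1 holds 55 when the program reaches exit.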

Data section

The data section defines constant values.

.data 100
  11 22 33 44 0

S2 instruction format

S2 version 2 has three instruction formats. Version 3 simplifies this to a single format with flat encoding (no extended op). This limits the address range somewhat but makes the instruction encoding easier to understand.

    op:6 r1:5 r2:5 d:16

where d holds one of {r3, ads, disp}.
disp is sign-extended to 32 bits.
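
The format can be illustrated with a short Python sketch that packs and unpacks the four fields. The field widths come from the format line above; the bit positions (op in the most significant bits, d in the least significant) are an assumption, as are the function names encode, decode and sign_extend16.

```python
def encode(op, r1, r2, d):
    """Pack op:6 r1:5 r2:5 d:16 into one 32-bit word.

    Assumption: op occupies the top bits and d the bottom 16 bits.
    d may be negative when it holds a displacement.
    """
    if not 0 <= op < 64:
        raise ValueError("op does not fit in 6 bits")
    if not (0 <= r1 < 32 and 0 <= r2 < 32):
        raise ValueError("register number does not fit in 5 bits")
    if not -0x8000 <= d <= 0xFFFF:
        raise ValueError("d does not fit in 16 bits")
    return (op << 26) | (r1 << 21) | (r2 << 16) | (d & 0xFFFF)

def decode(word):
    """Split a 32-bit word back into (op, r1, r2, d); d is the raw 16-bit field."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF

def sign_extend16(d):
    """Interpret a raw 16-bit field as a signed value, as is done for disp."""
    return d - 0x10000 if d & 0x8000 else d
```

A displacement of -4 is stored as 0xFFFC in the d field and recovered by sign_extend16. The 16-bit d field is also what limits the address range mentioned above.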

The assembler works in one pass (it reads the source only once), in two phases:
  phase1
    input scanning, collect symbols
  phase2
    resolve forward references
    generate object code from the token list
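
The two phases can be sketched in Python as follows. This is a simplified illustration, not the actual assembler: it only resolves labels and does not generate object code, it assumes each instruction occupies one word, and the name assemble_labels is hypothetical.

```python
def assemble_labels(lines, origin=0):
    """Resolve label references using one scan of the source plus a token list."""
    # Phase 1: scan the input once, collecting symbols and a token list.
    symbols = {}
    token_list = []                  # one list of tokens per instruction
    address = origin
    for line in lines:
        tokens = line.split(";", 1)[0].split()
        if tokens and tokens[0].startswith(":"):
            symbols[tokens[0][1:]] = address   # label names the next instruction
            tokens = tokens[1:]
        if tokens:
            token_list.append(tokens)
            address += 1             # assumption: one word per instruction
    # Phase 2: work from the token list only. Every label is known by now,
    # so forward references such as 'jf r3 exit' can be replaced by addresses.
    resolved = [
        [tokens[0]] + [str(symbols.get(t, t)) for t in tokens[1:]]
        for tokens in token_list
    ]
    return symbols, resolved
```

Because phase 2 reads the stored tokens rather than the source file, a jump to a label defined later in the file needs no second pass over the input.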

Symbol table

A number of symbols are predefined, such as opcode mnemonics (add, ...) and register names (r0..r31).

The output file is in a format suitable for the simulator's loader:

ads, len,         ; code segment
op, a1, a2, a3,   ; each instruction
...
ads, len,         ; data segment
num,              ; each 32-bit data
num,
0                 ; end of object
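
A loader for this layout might look like the Python sketch below. Several details are assumptions, since the listing does not state them: that the file holds decimal numbers separated by blanks or commas, that len counts instructions in the code segment and words in the data segment, and that the ';' annotations are not part of the file. The name load_object is hypothetical.

```python
def load_object(text):
    """Parse object text into (code_start, instructions, data_start, data_words).

    Assumptions: numbers only, separated by blanks or commas; 'len' counts
    instructions (code segment) or words (data segment).
    """
    numbers = [int(tok) for tok in text.replace(",", " ").split()]
    pos = 0

    def take(n):
        nonlocal pos
        chunk = numbers[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("object file ends early")
        pos += n
        return chunk

    code_start, code_len = take(2)                                # ads, len
    instructions = [tuple(take(4)) for _ in range(code_len)]      # op, a1, a2, a3
    data_start, data_len = take(2)                                # ads, len
    data_words = take(data_len)                                   # num, num, ...
    if take(1) != [0]:
        raise ValueError("missing end-of-object marker")
    return code_start, instructions, data_start, data_words
```

A simulator could then copy the instructions to memory starting at code_start and the data words starting at data_start.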


5 Jan 2013