S-code virtual machine


The main loop is simply a decode-execute cycle using a "switch( )".  Let IP be the current instruction pointer.

Eval:
  while(runflag )
    opcode = CS[IP] & 255
    data = CS[IP] >> 8
    IP++
    switch(opcode)
      case Add: ...

The run-time data structure includes: a memory M[.], a stack-segment SS[.], a code segment is relocatable, a data segment is absolute.  The code and data segment are in M[.].

The memory map is as follows (from lo mem to hi mem):

system area
data segment
code segment

The system area contains some values used in communicating between system functions in the compiler and the virtual machine.  The stack segment is a separate data structure from the memory.  The stack segment contains a run-time data structure called "stack frame" used in a function call and the evaluation stack.  A pointer, FP, points to the current stack frame.  A pointer, SP, points to the evaluation stack. A stack grows from lo to hi address. The structure of a stack frame is as follows:

hi

         <-  SP
retads
FP'      <-  FP
lv_1
...
lv_n  

lo

The evaluation stack is "on-top" (higher address) of the stack frame.  To pass parameters from the current context to a function, the new stack frame is "overlapped" with the evaluation stack.  The new evaluation stack is then started at an address "after" the return address in the new stack frame. The current top-of-stack is at:

SS[SP]

To push a value to the evaluation stack requires:

SP++
SS[SP] = x

To pop a value from the evaluation stack is the reverse of push:

x = SS[SP]
SP--

To access a local variable in the current stack frame, a negative offset (the reference of a local variable) is used relative to FP.  For example the first local variable is at SS[FP-1], the second SS[FP-2] etc. It is not necessary to save SP as the "ret" instruction contains a proper offset to restore SP back to a previous context.  Similary, the "fun" instruction contains a proper offset to build a new stack frame.

Here is how a function call and return are implemented:

Let a be an offset to create a new frame, IP be the instruction pointer, ads be the address of the function.

Call:
1  SS[SP+a] = FP
2  FP = SP + a
3  SP = FP + 1
4  SS[SP] = IP + 1
5  IP = ads + 1

Line 1 saves the old FP at the new FP.  Line 2 moves FP to the new place.  Line 3 sets the new SP on top of the new frame.  Line 4 saves the return address. The current instruction pointer is at the caller, therefore the return address is IP + 1.  Line 5 jumps to the body of function.

The return instruction is divided into two cases: return with a value, return without any value.  To return with a value, the current top-of-stack value must be pushed to the previous evaluation stack. These two cases can be distinguished by checking whether at the time of return, SP comes back to its initial position or not (at the beginning of a new frame SP = FP + 1).

Let data be the offset in the return instruction.

Return-with-value:
1  IP = SS[FP+1]
2  a = TOS
3  SP = FP - data + 1
4  FP = SS[FP]
5  SS[SP] = a

Line 1 jumps to the return address. Line 2 saves the return value.  Line 3 restores SP.  Line 4 restores FP to the previous value hence delete the current stack frame and moves back to the previous one.  Line 5 pushes the return value to the current evaluation stack.

Return:
1  IP = SS[FP+1]
2  SP = FP - data
3  FP = SS[FP]

Return without a value is simply restoring the return address, the previous SP and FP.

[ to explain how to eval from Som language ]

last update 26 August 2007