Stack machine

What is a stack machine?
     Contrast to an ordinary processor of contemporary design which use registers a stack machine
uses stack.   Stack is a LIFO (last in first out) storage with two abstract operations : push, pop.
Push will put an item into stack at the top.  Pop retrieve an item at the top of stack.

Calculations using stack.
    Because a stack is LIFO, any operation must access data item from the top.  Stack doesn't need
'addressing' as it is implicit in the operators which use stack.  Any expression can be transforms
into a postfix order and stack can be use to evaluate that expression without the need for explicitly
locate any variable.  For example,
  B + C - D  ==>
        B C + D -  (post fix)
        push val B, push val C, add, push val D, sub.
  A = B  ==>
        A B =
        push ads A, push val B, store.
Where add, takes top two items from stack add them and push the result back to stack, similarly
sub operators.  Store takes one value and one address from stack and store value to address.

  Let's compare the above expression to the calculation using registers.
  B + C - D
    load r0, B
    load r1, C
    add r0, r1   ;;  r0+r1 -> r0
    load r2, D
    sub r0, r2

  A = B
    load r0, ads A
    load r1, val B
    store r1, (r0)   ;; r1 --> (r0)

    One can see that the main difference is that registers must be  allocated, for example, r0 is used
to store temporary result while in stack machine the temporary storage is implicit.

Calling subroutines
    The stack structure also plays an important role in the calling of subroutines (or function calls).
When a program transfers the control to another section of program, the current state of
computation is saved (composed of Program Counter, local variables, and other values).  The
place where the state of computation is saved is called 'activation record'.  When an execution of a
subroutine is completed the previous state of computation is restored (this action is called 'return
from subroutines').  Because many of today programming languages are 'structured' or 'block
oriented' the creation and deletion of activation records behave like LIFO. Stack is used to store
activation records.

  f(A)  call   g(B,C)
  Act Rec:       fp',sp',pc', A, ...
                 fp, sp, pc , B, C,...  <-- TOP

  where fp is frame pointer, sp is stack pointer,   pc is current program pointer, pc' is previous pc
  A is local variable of f.  B,C are local variables of g.   frame pointer link activation records
together.  Stack pointer points to current stack, use for storing temporary value of current calculation.

Parameters passing by stack

ret
old frame  <-  frame pointer
x1
x2         <- stack pointer

ret        :: frame 1
fp0 <--    <- fp1
x1     |
x2     |
ret    |   :: frame 2
fp1 ---    <- fp2
y1
y2         <- stack pointer

Stack of stack machines
    From the above observation, stack machines use stack to store temporary value during
calculation and also activation records during transfer of control to subroutines.  Whereas in
register machines register must be allocated explicitly to store temporary values and explicit LIFO
manipulation must be done (via some kind of pointer) to handle activation records.

Current languages that use stack
    There are a number of contemporary programming languages that use stack abstraction, for
example,  Forth, Postscript.  Also many languages use 'virtual machine' model to implement their
executable representations.  Pascal has P-system, Smalltalk uses stack for calculation, JAVA has
byte-code that use stack model.  My own work on concurrent real-time language R1, uses stack
machine as 'virtual machine'.

Example of stack ISA
  Let we ignore local variables (therefore reduce the complication of an activation record) and
illustrate an ISA that is based on stack machine.  We need load, store, arithmetic operators,
call, return and for flow of control we use conventional jump and branch.

notation:  TOP is the item on top of stack, NEXT is an item below
TOP (therefore we can talk about 2 operands on stack by TOP, NEXT)
M[ads] value of memory at ads.  pop a  is TOP --> a, pop2 pop two
items off stack.

  lit #a  ;;  push the immediate value a.
  load    ;;  pop a, push M[a].
  store   ;;  NEXT -> M[TOP], pop2.
  add     ;;  NEXT + TOP -> a, pop2, push a.
  cmp     ;;  if NEXT > TOP a = 1 else a = 0, pop2, push a.
  call    ;;  pop a, create new activation record, goto a.
  return  ;;  delete current activation record, go back pc'.

Please notice that except lit #a which has #a as argument, all other instructions have argument(s)
implicit in the stack.  The state of computation consisted of a stack pointer and a program counter.
If  we have two stacks one for computation and one for activation record (called control stack)  we
will need only to store the program counter (return address) in the activation record and there is
no need to do anything to computation stack on subroutine calls.  Calling a subroutine will need
just push the current program counter (return address) onto the control stack.  Returning is just
pop the control stack and restore the previous program counter.

Pure stack machine
    If we don't have local variables how can operands in the computation stack are accessed?  With
exception for the top two items all other items are difficult to get to.  We need to be able to reorder
and copy a number of items in the stack in order to use them.  The following instructions did that :
  dup  ;; duplicate TOP
  swap ;; swap TOP and NEXT
  rot  ;;  1,2,3 ->  3,1,2  get the third item to the top
  over ;; copy NEXT to TOP ,  1,2 ->  2,1,2

These are just some of the possible instructions.  An  example of their use:
  f(X,Y) = X*X + Y*Y
  X and Y are the top two items on computation stack when call f().
  dup mul     ;;  X*X
  swap        ;;  Y, X*X
  dup mul add ;;  Y*Y + X*X

Thinking about rearranging items on stack make it difficult to use pure stack instructions.  Having
variables avoids the reordering of items on stack because a variable can be accessed by using
its name.  However, current compilation techniques can handle the ordering of stack items
therefore it frees a programmer from this low level detail.

Example of program : bubble sort
This example shows how a high level language program can be translated into ISA of a stack
machine.

Given a[n] an array of integer, the bubble sort program sorts the items in a[] in ascending order.
Initially, i=0, j=0.

  while( i < n ) {
    while( j < n ) {
     if( a[j] > a[j+1] ) {
       t = a[j];
       a[j] = a[j+1];
       a[j+1] = t;
     }
     j = j+1;
    }
    i = i+1;
  }

Data segment (word)
1: t
2: i
3: j
4: n
5: a[0]
6: ...

Code segment (byte)
   68:rval 3 rval 4 lt
   75:jz 187
   78:rval 2 rval 4 lt
   88:lval 5 rval 2 index load
      lval 1 rval 2 lit 1 add index load gt
  109:jz 159
  112:lval 1 lval 5 rval 2 index load store
  124:lval 5 rval 2 index
      lval 5 rval 2 lit 1 add index load store
  144:lval 5 rval 2 lit 1 add index rval 1 store
  159:lval 2 rval 2 lit 1 add store
  170:jmp 78
  173:lval 3 rval 3 lit 1 add store
  184:jmp 68
  187:

where
  lval ads is lit #ads
  rval ads   is lit #ads load
  index  is add
  jz    is jump if TOP = 0
  jmp    is unconditional jump

Discussion
    Stack machines are arguable almost the simplest kind architecture.   Its LIFO structure is quite
suitable for block-oriented language.  The code size for a stack machine can be very compact
because most instructions have no operand field.  Stack architecture used to be very popular
method to implement high level language machine.  Most of modern register machines are faster
but there is some 'renewal' effort to improve stack architecture.   Notably, Sun's Picojava processor which aims to execute JAVA virtual machine byte-code.  Stack architecture may prove to be
suitable for the machine in the future.