Low Power Programming

The Internet of Things is a class of embedded systems connected through the internet.  Hence, to program IoT devices it is advisable to study how embedded systems are programmed.  Most IoT devices will be mobile, so low power consumption is a requirement.  In this lecture I present an approach to lowering the power consumption of embedded systems through programming.

An embedded system is connected to the physical world.  This connection can be implemented as a port through which a signal from the external world is read into the processor.  One simple way to do this is called "polling", where the processor keeps reading the port:

while(1)
   a = readport()
   process(a)

The external signal is read as often as the loop can execute: one read per trip around the loop.  Now let us consider how much energy is consumed by running this program for one second.  Assume that the processor consumes one unit of energy for each instruction it executes; call this unit e.  Let the processor be able to run one million instructions per second (1 MIPS).  This implies a clock speed of about 1 MHz if the processor executes one instruction per clock (processor cycle time 1 microsecond, us).  This processor is a far cry from the modern processor in a typical laptop (2 GHz processor clock, 2016 Intel or ARM core).  Running the above program for one second therefore consumes 1 Me.  If one trip around the loop takes 100 instructions for processing and loop overhead, the port is read every 100 us, in other words at a rate of 10 kHz.  If we need to process the external signal at that rate, the processor runs at full speed and consumes 1 Me.
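The arithmetic for the polling loop can be checked with a few lines of Python. This is only a back-of-envelope sketch; the variable names are mine, and the numbers are the assumptions stated above (1 MIPS, 100 instructions per trip around the loop, one unit e per instruction).

```python
# Back-of-envelope timing for the polling loop.
# All numbers are the lecture's assumptions, not measurements.
MIPS = 1_000_000          # instructions per second (1 MHz, one instruction per clock)
INSTR_PER_LOOP = 100      # instructions for one trip around the polling loop

loop_time_s = INSTR_PER_LOOP / MIPS       # time between two port reads, in seconds
read_rate_hz = MIPS // INSTR_PER_LOOP     # port reads per second
energy_per_second_e = MIPS * 1            # one unit e per executed instruction

print(loop_time_s * 1e6, "us between reads")
print(read_rate_hz, "reads per second")
print(energy_per_second_e, "e consumed in one second")
```

The energy figure does not depend on the loop length at all: a polling processor executes instructions all the time, so it always spends 1 Me per second.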

Instead of polling, there is another mechanism that lets the processor read the port periodically: the interrupt.  An interrupt is an external event that causes the processor to suspend its current program and jump to a special program called an "interrupt service routine" (ISR).  Once the interrupt service routine is complete, the processor jumps back and continues the suspended program.  We can issue this external interrupt periodically, usually with a peripheral called a "counter/timer device", and use the ISR to read and process the signal from the port.

//  interrupt service routine
interrupt()
   a = readport()
   process(a)

main()
  while(1)
      // do nothing

The main loop now does nothing: the processor runs an empty loop and waits for an interrupt to occur.  All the hard work is done in the ISR.  This style of programming is called "interrupting".  Let us see how much energy this program takes.  Let the interrupt occur every 100 us (rate 10 kHz), so that this program reads the port at the same rate as the polling program.  Because it performs the same amount of work as the polling program, it consumes a similar amount of energy, 1 Me.  Suppose we reduce the rate 10 times, to 1 kHz, or one interrupt every 1 ms.  The energy consumed does not go down, because the processor still executes the same number of instructions; more of them are simply spent in the empty main loop.  To reduce the energy, we need to avoid wasting processor cycles.

The processor has one special instruction, "sleep and wait for interrupt" (doze).  Once it is executed, the processor is put into a sleep state and consumes very little energy (assume zero).  When an interrupt occurs, the processor wakes up and jumps to the ISR; when the ISR completes, it continues from where it went to sleep.  Now we insert this instruction into the main loop:

//  interrupt service routine
interrupt()
   a = readport()
   process(a)

main()
  while(1)
      doze()    // sleep and wait for interrupt

How much energy does this new program consume?  Assume an interrupt every 1 ms (rate 1 kHz).  The ISR runs 1,000 times in one second and executes 100 instructions each time, so the total number of instructions executed is 100,000.  The main loop now runs until doze() and then stops, so it may take 2-3 instructions to go around the loop.  The main loop is repeated as many times as the interrupt occurs, so its instruction overhead is negligible.  The total energy consumed is therefore about 100 Ke.  This program consumes 10 times less energy than the polling program, because it does not waste any processor cycles.
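To compare the styles side by side, here is a small Python sketch of the energy model used in this lecture. It is not part of the simulator; the function names and the 3-instruction loop overhead are my assumptions, and sleep is taken to cost zero energy as above.

```python
# Energy model from the lecture: one unit e per executed instruction,
# zero energy while the processor sleeps in doze().
MIPS = 1_000_000       # instructions per second
ISR_INSTR = 100        # instructions to read and process one sample
LOOP_OVERHEAD = 3      # assumed instructions around doze() per wake-up

def energy_always_running(rate_hz):
    # Polling, or interrupts with an empty main loop: the processor
    # executes instructions all the time, whatever the sampling rate.
    return MIPS

def energy_with_doze(rate_hz):
    # Only the ISR and a few main-loop instructions run per interrupt.
    executed = rate_hz * (ISR_INSTR + LOOP_OVERHEAD)
    return min(executed, MIPS)   # cannot exceed the processor's capacity

for rate in (1_000, 100, 10):
    print(rate, "Hz:", energy_always_running(rate), "e vs",
          energy_with_doze(rate), "e")
```

With doze() the energy scales with the sampling rate, so lowering the rate from 1 kHz to 100 Hz saves another factor of ten, while the always-running styles stay at 1 Me.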

Practice session

An embedded system with interrupts is presented here:

S21 with interrupt and IoT board specification

[Figure: embedded system]

The main module contains the S2.1 processor with memory.  The board has four interrupts (int0, int1, int2, int3) and four input ports (10, 11, 12, 13).  I made a set of tools for practicing this kind of programming.  The tools consist of a compiler, an assembler and a simulator of the S2.1 processor.  The language I use can be found here  (Rz language).  The interrupt program looks like this:

// simple interrupt
// a simple loop run for 10000 inst.
// when interrupt occurs, it prints out cnt


cnt        // global counter

int0()
  print(cnt," ")
  cnt = cnt + 1

main()
  cnt = 0
  while( 1 )   // wait for int
    doze()     // sleep and wait for interrupt

This program has an empty main loop that waits for an interrupt.  The interrupt service routine prints a counter and then increments it.  The simulation runs for 1 second, or 1,000,000 time units at a processor speed of 1 MHz.  The sequence of commands to run the experiment is: compile the program (simple-int.txt), assemble the assembly file, run the simulator.  The output of the compiler is an assembly language file  (simple-int-s.txt).  This file is assembled into an executable object file (simple-int-s.obj) suitable for the processor simulator.

C:\iot-rz\test>rz simple-int.txt > simple-int-s.txt
C:\iot-rz\test>asm simple-int-s.txt
C:\iot-rz\test>sim simple-int-s.obj

load program, last address 37
int 201
<0> int 402
<1> int 603
<2> int 804
. . .
<94> int 19296
<95> int 19497
<96> int 19698
<97> int 19899
<98> stop, clock 20000, execute 1897 instructions

C:\iot-rz\test>

Currently, the simulator is set to run for 10,000 time units.  If you comment out the "sleep and wait for interrupt" call, the program will run its full cycle.  Observe the number of instructions executed.   The length of the simulation can be changed through the constant  MAXSIM  in the simulator header file  "s21.h".  You will need to recompile the simulator.

Multiple Interrupt

There are four interrupts: int0, int1, int2, int3.  Only one ISR is allowed to run at a time, and it runs to completion.  Nested interrupts are not allowed.  (This assumption greatly simplifies the design of the system.)  When two interrupts occur simultaneously, only the one with higher priority is acknowledged; all other interrupts have to wait until the first ISR returns.  The priority is int0 > int1 > int2 > int3.  An interrupt can be enabled or disabled; this mask tells the CPU whether to accept or ignore that interrupt.

Interrupts can be enabled and disabled by
ei(n)   // n  0,1,2,3       equivalent to mov r1 #n, trap r1 #16
di(n)   // n  0,1,2,3       equivalent to mov r1 #n, trap r1 #15


int0 is bound to timer0 (explained below)
int1 is bound to timer1 (explained below)
int2 is bound to a random event; its interval is uniformly distributed in the range 100..500.
int3 is not yet used.
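The priority rule can be pictured with a short Python sketch. This is not the simulator's code; `next_interrupt` and the two sets are hypothetical names I use only to illustrate the rule int0 > int1 > int2 > int3 with an enable mask.

```python
def next_interrupt(pending, enabled):
    """Return the highest-priority interrupt that is pending and enabled.

    Lower number means higher priority (int0 > int1 > int2 > int3).
    Returns None when nothing can be served.
    """
    for n in range(4):
        if n in pending and n in enabled:
            return n
    return None

# int1 and int2 arrive at the same time, all interrupts enabled.
pending = {1, 2}
enabled = {0, 1, 2, 3}

first = next_interrupt(pending, enabled)    # int1 is acknowledged first
pending.discard(first)                      # its ISR runs to completion
second = next_interrupt(pending, enabled)   # int2 had to wait until now
print(first, second)
```

Because there is no nesting, the second interrupt is served only after the first ISR has returned, even though both arrived together.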

Timer

There are two timers.  When a timer expires, an interrupt occurs.  Timer0 is bound to int0.  Timer1 is bound to int1.  The interrupt interval can be set by
settimer0(k)        // equivalent to  mov r1 #k, trap r1 #13
settimer1(k)        // equivalent to  mov r1 #k, trap r1 #14


Port

There are four ports available in the simulator:
port 10  is an analog input port.  In the simulator it is wired to a sine wave with period 1000 and amplitude 50.
port 11  is a "digital" input port.  It carries a square wave (hence "digital") with period 1000 and amplitude 5.
port 12  is a 31-bit positive random number.
port 13  is tied to the global clock.
These ports can be read by
x = readport(m)   // m 10,11,12,13

Example

This example shows how to read the sine wave port and print the value, using int0.

int0()
  x = readport(10)
  print(x)

main()
  while(1)
     doze()      // sleep and wait for interrupt


In this example, timer0 keeps its default interval of 100.  The sine wave table has a period of 1000.  The default simulation runs for 10,000 cycles.  Hence, there will be 100 interrupts, and you will see ten sine waves.
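The counts above can be reproduced outside the simulator. The following Python sketch assumes that port 10 behaves like an ideal sine of period 1000 and amplitude 50, rounded to an integer and starting at phase zero; the real wave table may differ in rounding and phase, and `sine_port` is my own name.

```python
import math

SIM_CYCLES = 10_000    # default simulation length
TIMER0 = 100           # default timer0 interval
PERIOD = 1_000         # period of the sine wave table
AMPLITUDE = 50

def sine_port(t):
    # Assumed model of port 10; not taken from the simulator source.
    return round(AMPLITUDE * math.sin(2 * math.pi * (t % PERIOD) / PERIOD))

# One sample per timer0 interrupt.
sample_times = range(TIMER0, SIM_CYCLES + 1, TIMER0)
samples = [sine_port(t) for t in sample_times]

print(len(samples), "interrupts")
print(SIM_CYCLES // PERIOD, "sine waves,", PERIOD // TIMER0, "samples per wave")
print(samples[:10])
```

With only ten samples per wave the printed values look coarse; a smaller timer0 interval gives a smoother wave at the cost of more interrupts, and therefore more energy.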

The next example shows how to change the timer0 interval and use the random number port.  It also shows a trick for doing modulo arithmetic.  Port 12 (random number) generates a positive 31-bit integer using a very good random number generator, the Mersenne Twister.

int0()
  x = readport(12)
  y = x - ((x / 256) * 256)  // modulo 256
  print(y)

main()
  settimer0(200)            // set timer0 to 200
  while(1)
     doze()   
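The modulo trick in int0() works because integer division truncates: (x / 256) * 256 is the largest multiple of 256 not exceeding x, so subtracting it leaves the remainder. A quick Python check, using `//` for integer division (the helper name is mine):

```python
def mod256_by_division(x):
    # Same arithmetic as the ISR: x - ((x / 256) * 256) with integer division.
    return x - (x // 256) * 256

# For non-negative x the trick agrees with the remainder operator,
# and with masking the low 8 bits because 256 is a power of two.
for x in (0, 255, 256, 1000, 2**31 - 1):
    assert mod256_by_division(x) == x % 256 == x & 255

print(mod256_by_division(1000))
```

Port 12 only produces positive numbers, so the question of how truncating division treats negative operands does not arise here.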

Wave table

For the analog (and also the digital) port, the simulator creates a "wave" table with a fixed period  (1,000).  You can think of the wave as occurring in "real" time, that is, one sampling point per instruction cycle.  There is a multiplier that changes this period (default 1); it lengthens the period of the wave.  The multiplier (and also the amplitude) can be changed in the simulator header file "io.h".  The "digital" wave table can be generated to serve as the input data for your application (you need to modify "gendig()" in "io.c").  You need to recompile the simulator after changing it.  The analog port carries a sine wave of period 1000 and amplitude +-50.  The default digital wave is a square wave with levels (0,5).
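One way to picture the wave table and its multiplier is the Python sketch below. It is a guess at the idea, not the code in "io.c": I assume the square wave is high for the first half of the table and that the multiplier simply holds each table entry for that many time units.

```python
TABLE_LEN = 1000      # fixed length of the wave table
HIGH, LOW = 5, 0      # levels of the default digital wave

# Assumption: high for the first half of the table, low for the second half.
square_table = [HIGH if i < TABLE_LEN // 2 else LOW for i in range(TABLE_LEN)]

def digital_port(t, multiplier=1):
    # Assumption: each table entry is held for `multiplier` time units,
    # which lengthens the period to TABLE_LEN * multiplier.
    return square_table[(t // multiplier) % TABLE_LEN]

print(digital_port(499), digital_port(500))                               # edge at t = 500
print(digital_port(999, multiplier=2), digital_port(1000, multiplier=2))  # edge at t = 1000
```

Replacing `square_table` with your own list of values corresponds to generating a custom digital wave as input data for an application.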

Example of low power programming

How to use interrupt service routine to run state-based programs

When we want to run a program under an ISR, we need to design it so that it runs and stops, and later comes back to continue where it left off.  This means it needs to use "state", that is, a persistent variable that keeps a value to be used in the next "run-and-stop".  The following examples arise from students' projects done in this class.

Example 1

Turn on a water pump for 10 seconds then turn it off.

Normal program

turn-on-pump
t = 0
while( t < 10 )
  delay(1 sec)
  t = t + 1
turn-off-pump

Interrupt driven

t       // global

int()
  t = t + 1
  if( t == 10 )
     turn-off-pump
     disableint()

main()
  settimer(1 sec)
  turn-on-pump
  t = 0
  while(1)
     doze()

The interrupt interval is set to one second, so the ISR is invoked once every second.  When the ISR has counted to 10, it turns off the water pump and disables any further interrupts (or, if you want it to repeat, you can reset t = 0).
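The behaviour of the interrupt-driven pump program can be simulated on a desktop machine. The Python sketch below is only a model: the class and its field names are hypothetical, and the loop stands in for the one-second timer interrupt.

```python
class PumpController:
    """Model of the interrupt-driven pump program; all state is persistent."""

    def __init__(self, on_ticks=10):
        self.on_ticks = on_ticks
        self.t = 0                 # the global counter
        self.pump_on = True        # turn-on-pump in main()
        self.int_enabled = True

    def isr(self):
        # Run-and-stop: one short step per timer interrupt, no loop inside.
        self.t += 1
        if self.t == self.on_ticks:
            self.pump_on = False       # turn-off-pump
            self.int_enabled = False   # disableint()

ctrl = PumpController()
on_seconds = 0
for second in range(15):           # simulate 15 seconds of wall-clock time
    if ctrl.pump_on:
        on_seconds += 1            # the pump ran during this second
    if ctrl.int_enabled:
        ctrl.isr()                 # timer interrupt at the end of the second

print(on_seconds, ctrl.t, ctrl.pump_on)
```

The counter t is the "state" here: it survives between interrupts, which is what lets the ISR resume the count each time it is invoked.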

Example 2

Frequency counter.  We measure the frequency of a signal (a sine wave) by observing its zero crossings.  Say the signal changes from + to -; we record the time of this first crossing (t0).  Then it changes from - to +, and we record the time of the second crossing (t1).  The two crossings are half a period apart, so 2 * (t1 - t0) is the period of the signal.  The frequency is 1/period.

Normal program

i = 0

while( i < 2 )
  x = readwave()
  if x > 0 then
     while(1)
        x = readwave()
        if x < 0 then       // zero cross from + to -
           t0 = readclock()
           break
  else if x < 0 then
     while(1)
        x = readwave()
        if x > 0 then
           t1 = readclock()
           break
  i = i + 1          // we now have two zero cross

if t1 > t0 then
  dt = t1 - t0
else
  dt = t0 - t1
f = 1 / (2*dt)       // compute the frequency

Interrupt driven

We should not use "loop" (while 1) in an ISR.  So we break down the zero cross detection into states.  This is the state table.

state input  next-state  action
 
 1    +      2    
 2    +      2          it is + + ... stay on 2
 2    -      3          zero cross + to -
 3    -      3          it is - - ... stay on 3
 3    +      4          zero cross - to +

 1    -      5          start - 
 5    -      5          it is - - ... stay on 5
 5    +      2          goto start of +


We record the zero-crossing time on the transitions 2 to 3 and 3 to 4.  Here is the program:

state          // global
t0, t1         // global

int()
  x = readwave()
  if state == 1
     if( x > 0 ) then state = 2
     else state = 5      
  else if state == 2
     if( x > 0 ) then state = 2
     else
         state = 3
         t0 = readclock()    // record time of zero cross + to -
  else if state == 3
     if( x < 0 ) then state = 3
     else
         state = 4
         t1 = readclock()    // record time of zero cross - to +
  else if state == 5
     if( x < 0 ) then state = 5
     else state = 2

main()
  state = 1
  while(1)
    doze()
    if state == 4 then
       dt = t1 - t0
       f = 1 / (2*dt)         // compute the frequency
       state = 1              // reset for next cycle

The interrupt interval acts as the sampling rate of the input signal.  The ISR is just a state machine that takes one step per interrupt.  The beauty of this program is that no matter how the signal starts (+ or -), it is recognised correctly, and by the time the machine reaches the terminal state (state 4) both crossing times have been recorded.  The main program just observes this state, computes the frequency, then resets state = 1 to start the next cycle.
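To convince yourself that the state machine works for either starting polarity, it can be run against a simulated sine wave. The Python sketch below follows the same transitions as the ISR; the sampling interval of 10 time units, the `phase` parameter and the function names are my assumptions for the test, not part of the simulator.

```python
import math

PERIOD = 1000       # true period of the test signal, in time units
AMPLITUDE = 50
SAMPLE = 10         # assumed interrupt interval, in time units

def readwave(t, phase):
    return AMPLITUDE * math.sin(2 * math.pi * (t + phase) / PERIOD)

def measure_period(phase):
    state, t0, t1 = 1, 0, 0
    t = 0
    while state != 4:                 # each pass models one interrupt
        t += SAMPLE
        x = readwave(t, phase)
        if state == 1:
            state = 2 if x > 0 else 5
        elif state == 2:
            if not x > 0:             # zero cross + to -
                state, t0 = 3, t
        elif state == 3:
            if not x < 0:             # zero cross - to +
                state, t1 = 4, t
        elif state == 5:
            if not x < 0:             # negative start is over, go to 2
                state = 2
    return 2 * (t1 - t0)

print(measure_period(3))      # signal starts positive
print(measure_period(503))    # signal starts negative
```

The measured period is only as fine as the sampling interval allows, so a shorter interrupt interval buys accuracy at the cost of energy. Note also that if the target language only has integer arithmetic, 1 / (2*dt) truncates to zero; dividing the clock rate by the period (for example 1,000,000 / 1000 = 1000 Hz at 1 MHz) avoids this.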

Homework

Play with the interrupt interval using settimer0(d).  It determines how often the interrupt occurs.  The number d is the interrupt interval in time units.

RZ 3.6  with IoT board

sim21  trap functions

0  stop
1  print int
2  print char
3  print string (array of char, terminate by 0)
4  input returns address of string (default at 15000)
13  settimer0   (iot board)
14  settimer1   (iot board)
15  disable int
16  enable int
17  sleep and wait for int
18  readport (iot board)
19  malloc 

Rz 3.6 has the following reserved words for using the IoT board

asm(s)             // insert assembly string into asm source
print()            // print int and constant string "..."
printc(c)          // print char
prints(s)          // print string
settimer0(t)  
settimer1(t)
di(n)              //  disable int n
ei(n)              //  enable int n
doze()             //  sleep wait int
x = readport(k)    // read iot-board ports
x = input()        // input returns string
x = malloc(16)     // allocate 16 words from heap

S21 has special instructions to help write complex interrupt handlers (such as task switching in operating systems)

pushm r1        ;  push multiple registers  R[0]..R[15] to stack, r1 is stack pointer
popm r1         ;  pop multiple registers  R[0]..R[15] from stack, r1 is stack pointer
xch r1           ;  exchange return address of interrupt (RetAds register) with r1

Code Release (source + executable)

Rz compiler, S21 assembler, IoT board simulator  

iot-rz-6.zip  (with read-input-string)
iot-rz-13.zip  (with pragma #noframe)

last update 14 Feb 2017 (my valentine)