Low power design

(Summary of S. Devadas and S. Malik, "A survey of optimization techniques targeting low power VLSI circuits", in Proc. of the 32nd ACM/IEEE Design Automation Conference, San Francisco, California, United States, 1995, pp. 242-247)  (p242-devadas.pdf)

At the electronic level (transistors and the like), the power consumption of a circuit can be calculated as follows:

P = 1/2 C V_dd^2 f N + Q_sc V_dd f N + I_leak V_dd

P        total power
V_dd     supply voltage
f         frequency

first term -- the power required to charge and discharge circuit nodes.  C is the node capacitance and N is the switching activity, i.e. the number of gate output transitions per clock cycle.

second term -- power dissipation during output transitions due to current flowing directly from the supply to ground (short-circuit current).  Q_sc is the quantity of charge carried by the short-circuit current per transition.

third term -- power dissipation due to the leakage current I_leak, e.g. reverse-bias current in diodes.

The first term (switching activity) accounts for roughly 90% of the power dissipation.
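
To get a feel for how the three terms compare, the formula can be evaluated directly. The sketch below is only an illustration: the function name power_terms is hypothetical, and every numeric value (capacitance, voltage, frequency, activity, short-circuit charge, leakage current) is a made-up assumption, not data from the paper.

```python
def power_terms(c, v_dd, f, n, q_sc, i_leak):
    """Return the (switching, short-circuit, leakage) terms of P in watts."""
    switching = 0.5 * c * v_dd ** 2 * f * n      # 1/2 C V_dd^2 f N
    short_circuit = q_sc * v_dd * f * n          # Q_sc V_dd f N
    leakage = i_leak * v_dd                      # I_leak V_dd
    return switching, short_circuit, leakage


# Assumed, illustrative values only.
sw, sc, lk = power_terms(c=1e-9, v_dd=3.3, f=50e6, n=0.2,
                         q_sc=1e-11, i_leak=1e-6)
total = sw + sc + lk
print(f"switching     {sw:.6f} W ({100 * sw / total:.1f}% of total)")
print(f"short-circuit {sc:.6f} W")
print(f"leakage       {lk:.6f} W")
```

With these assumed numbers the switching term dominates, which is why most of the techniques in these notes target C, V_dd, f or N.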

We will look at the techniques to reduce power consumption at 4 levels:

  1. Circuit level
  2. Logic gate level
  3. Architecture level
  4. System and software level

Circuit level

Logic gates and transistors in combinational circuits.

Ordering of gate inputs affects both power and delay.  Late-arriving signals should be placed closer to the output to minimize gate propagation delay.  The average power dissipated depends on the transition probabilities of the gate inputs and on the internal node capacitances.

Transistor sizing: larger transistors reduce delay but increase power dissipation.  Gates that are off the critical path can therefore be slowed down (made smaller) to save power without affecting the critical path delay.
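
How much a gate can be slowed down is usually expressed as its slack. The following is a minimal sketch of a slack computation on a small gate network, assuming fixed per-gate delays and a dictionary listed in topological order. The function gate_slacks, the gate names and the delay values are all hypothetical.

```python
def gate_slacks(gates):
    """gates maps name -> (delay, [fanin names]) in topological order.

    Returns (critical_delay, {name: slack}).
    """
    # Forward pass: latest arrival time at each gate output.
    arrival = {}
    for name, (delay, fanins) in gates.items():
        arrival[name] = delay + max((arrival[f] for f in fanins), default=0)
    critical = max(arrival.values())

    # Backward pass: latest time each output may settle without
    # lengthening the critical delay.
    required = {name: critical for name in gates}
    for name in reversed(list(gates)):
        delay, fanins = gates[name]
        for f in fanins:
            required[f] = min(required[f], required[name] - delay)

    slack = {name: required[name] - arrival[name] for name in gates}
    return critical, slack


# Hypothetical network: g1 -> g3 -> g4 is the long path, g2 feeds g4 directly.
gates = {
    "g1": (2, []),
    "g2": (1, []),
    "g3": (2, ["g1"]),
    "g4": (1, ["g3", "g2"]),
}
critical, slack = gate_slacks(gates)
print(critical, slack)
```

Here only g2 has positive slack, so it is the only candidate for smaller (slower, lower power) transistors.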

Logic level

Combinational circuits (logic synthesis).  Two phases:
1 logic equations are manipulated to reduce area, delay, power.
2 equations are mapped to a particular technology library, optimising for area, delay, power.

Phase 1

exploiting don't cares
The power dissipation of a gate depends on the probability of the gate evaluating to a 1 or a 0.  Don't-care conditions can be assigned so as to shift this probability and reduce switching.
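
A small calculation shows why the output probability matters. The sketch below assumes equiprobable, independent inputs and statistically independent consecutive cycles, in which case an output that is 1 with probability p changes value with probability 2p(1-p). The function names and the example function (3 inputs, 1 required minterm, 2 don't-care minterms) are hypothetical.

```python
from fractions import Fraction


def output_probability(on_set_size, num_inputs):
    """Probability that the output is 1, for equiprobable independent inputs."""
    return Fraction(on_set_size, 2 ** num_inputs)


def switching_activity(p_one):
    """Probability that the output differs between two independent cycles."""
    return 2 * p_one * (1 - p_one)


# Option A: both don't-care minterms assigned to 0 -> on-set has 1 minterm.
act_dc_as_0 = switching_activity(output_probability(1, 3))
# Option B: both don't-care minterms assigned to 1 -> on-set has 3 minterms.
act_dc_as_1 = switching_activity(output_probability(3, 3))
print(act_dc_as_0, act_dc_as_1)
```

Under these assumptions, pushing the output probability away from 1/2 lowers the expected switching, so option A is preferable for power.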

path balancing
Spurious transitions account for 10 to 40% of typical switching activity in combinational circuits.  Spurious switching activity can be reduced by adjusting (balancing) the delays of paths.
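
A spurious transition can be reproduced with a tiny discrete-time simulation. The sketch below assumes a zero-delay XOR gate whose two inputs both rise from 0 to 1, possibly at different arrival times. The function name and the arrival times are hypothetical.

```python
def xor_output_transitions(arrival_a, arrival_b, horizon=10):
    """Count output transitions of an XOR whose inputs both rise 0 -> 1."""
    transitions = 0
    prev = 0  # both inputs start at 0, so the output starts at 0
    for t in range(horizon):
        a = 1 if t >= arrival_a else 0
        b = 1 if t >= arrival_b else 0
        out = a ^ b
        if out != prev:
            transitions += 1
        prev = out
    return transitions


unbalanced = xor_output_transitions(arrival_a=1, arrival_b=3)
balanced = xor_output_transitions(arrival_a=3, arrival_b=3)
print(unbalanced, balanced)
```

The final output value is 0 in both cases, but with unequal path delays the output glitches up and back down (two wasted transitions), while with balanced delays it never moves.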

factorization
ac + ad + bc + bd = (a+b) (c+d)
common subexpression elimination
kernel extraction (choose the kernels that maximally reduce switching activity)
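
The factorization above can be checked exhaustively, and the literal count gives a rough idea of the reduction in logic. This is only a sanity-check sketch; the function names are hypothetical, and literal count is used here as a crude stand-in for area, not as a power model.

```python
from itertools import product


def sum_of_products(a, b, c, d):
    # ac + ad + bc + bd : 8 literals
    return (a and c) or (a and d) or (b and c) or (b and d)


def factored(a, b, c, d):
    # (a + b)(c + d) : 4 literals
    return (a or b) and (c or d)


# Compare the two forms on all 16 input combinations.
equivalent = all(
    sum_of_products(*bits) == factored(*bits)
    for bits in product([False, True], repeat=4)
)
literals_before, literals_after = 8, 4
print(equivalent, literals_before, literals_after)
```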

Phase 2

graph covering (select library components: gates with different transistor sizes)

Sequential
state transition level
logic gate/flip-flop level

encoding
state encoding for minimal area (and switching activity)
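To reduce switching activity on a datapath, add an extra line that tells the receiver whether the value is sent true or complemented.  For example, going from 0000 to 1011 would toggle three lines, so 0100 (the complement) is sent instead.  Another option is arithmetic in another base (residue encoding).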
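
The true/complement scheme can be sketched in a few lines. The code below assumes a 4-bit bus plus one extra invert line, and it complements the word whenever more than half of the data lines would otherwise toggle. The function names are hypothetical.

```python
def hamming(x, y):
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


def encode(prev_lines, value, width):
    """Return (lines_to_drive, invert_flag) for the next bus word."""
    mask = (1 << width) - 1
    if 2 * hamming(prev_lines, value) > width:
        return (~value) & mask, 1   # send the complement
    return value, 0                 # send the true value


def decode(lines, invert_flag, width):
    """Recover the original value at the receiver."""
    mask = (1 << width) - 1
    return (~lines) & mask if invert_flag else lines


# The example from the notes: the bus holds 0000, the next value is 1011.
lines, flag = encode(0b0000, 0b1011, width=4)
plain_toggles = hamming(0b0000, 0b1011)
encoded_toggles = hamming(0b0000, lines) + flag   # invert line goes 0 -> 1
print(format(lines, "04b"), flag, plain_toggles, encoded_toggles)
```

In this example two lines toggle (one data line plus the invert line) instead of three, and the receiver still recovers 1011.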

retiming
Reposition flip-flops to minimize the required clock period.
Switching activity at flip-flop outputs can be significantly less than the activity at the flip-flop inputs (spurious transitions at the inputs are filtered out by the clock).  Retiming can exploit this fact.

gated clocks
If the inactivity of particular units can be determined, power can be saved by gating the clocks of those units,
e.g. powering down the ALU.

precompute
Disable the part of the circuit that does not contribute to the computation result.

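Example: a comparator computing C > D.
Let c and d be the MSBs of C and D.  c xnor d can be used to disable the logic for the other bits:
c = 1, d = 0  the result is 1 (C > D) regardless of the other bits
c = 0, d = 1  the result is 0 (C < D) regardless of the other bits
Only when c = d (xnor = 1) do the other bits have to be compared.  The saving depends on the probability that xnor = 0.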
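
The comparator example can be modelled in software to check both correctness and how often the remaining bits are left idle. This is a behavioural sketch only: the function name is hypothetical, the 4-bit width is an assumption, and uniformly distributed inputs are assumed for the statistics.

```python
def compare_with_precompute(c_word, d_word, width):
    """Return (c_word > d_word, low_bits_used) using MSB precomputation."""
    msb = width - 1
    c = (c_word >> msb) & 1
    d = (d_word >> msb) & 1
    if c != d:                      # xnor = 0: the MSBs decide the result
        return c == 1, False        # the other bits are never examined
    low_mask = (1 << msb) - 1       # xnor = 1: compare the remaining bits
    return (c_word & low_mask) > (d_word & low_mask), True


WIDTH = 4
pairs = [(c, d) for c in range(2 ** WIDTH) for d in range(2 ** WIDTH)]
all_correct = all(
    compare_with_precompute(c, d, WIDTH)[0] == (c > d) for c, d in pairs
)
disabled = sum(1 for c, d in pairs if not compare_with_precompute(c, d, WIDTH)[1])
print(all_correct, disabled / len(pairs))
```

For uniformly random inputs the MSBs differ half of the time, so in this model the logic for the lower bits can be disabled for half of the comparisons.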

Architecture level

Analysis of the power consumption of a design is used to minimize power in a synthesis system.
Power models are needed for execution units, control units, memory and interconnect.
Each module is characterised by an estimate of the capacitance that would switch plus an estimate of its activity.

behavioural synthesis
mapping a high-level spec. (data-flow/control-flow graph) into a register-transfer level design

Reduce the number of control steps: a slower clock can then be used for the same throughput, which allows a lower supply voltage and hence a quadratic reduction in power consumption.
Reducing resources reduces the capacitance that switches.
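
The effect of a slower clock combined with a lower supply voltage can be estimated from the switching term 1/2 C V_dd^2 f N alone. The sketch below assumes that this term dominates and that C and N stay unchanged. The function name and the voltage ratio of 0.6 are made-up assumptions, since how far the voltage can actually be lowered depends on the technology.

```python
def relative_switching_power(v_ratio, f_ratio):
    """Switching power relative to the original, with C and N unchanged."""
    return v_ratio ** 2 * f_ratio


# Clock halved, supply voltage unchanged: linear saving only.
clock_only = relative_switching_power(v_ratio=1.0, f_ratio=0.5)
# Clock halved and supply lowered to 60% (assumed value): quadratic in voltage.
clock_and_voltage = relative_switching_power(v_ratio=0.6, f_ratio=0.5)
print(clock_only, round(clock_and_voltage, 2))
```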

memory
1  memory accesses consume power (especially if off-chip)
2  the larger the memory, the greater the capacitance
Reduce memory accesses, e.g. by loop reordering.
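
The effect of a loop transformation on memory traffic can be illustrated by counting accesses. Note that the sketch uses loop fusion, chosen here only because it is easy to count; it is one kind of loop restructuring and not necessarily the transformation the paper has in mind. The CountingArray class and both functions are hypothetical.

```python
class CountingArray:
    """A list wrapper that counts every element read and write."""

    def __init__(self, data):
        self.data = list(data)
        self.accesses = 0

    def __getitem__(self, i):
        self.accesses += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.accesses += 1
        self.data[i] = value


def separate_loops(src):
    a = CountingArray(src)
    b = CountingArray([0] * len(src))    # intermediate array held in memory
    c = CountingArray([0] * len(src))
    for i in range(len(src)):
        b[i] = a[i] * 2
    for i in range(len(src)):
        c[i] = b[i] + 1
    return c.data, a.accesses + b.accesses + c.accesses


def fused_loop(src):
    a = CountingArray(src)
    c = CountingArray([0] * len(src))
    for i in range(len(src)):
        t = a[i] * 2                     # intermediate stays in a local value
        c[i] = t + 1
    return c.data, a.accesses + c.accesses


src = list(range(8))
result_sep, accesses_sep = separate_loops(src)
result_fused, accesses_fused = fused_loop(src)
print(accesses_sep, accesses_fused)
```

Both versions compute the same result, but the fused loop needs half the memory accesses and no intermediate array, which addresses both points in the list above.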

System and software level

Major functionality is in the form of instructions as opposed to gates, but accurate power analysis tools exist only at gate level.
With such tools it is impractical to analyse power over large programs.
Alternative: measure the current drawn by the CPU during execution of a program
and develop an instruction-level power model for the specific CPU.
Software power optimisation:
faster code almost always implies lower energy.
register allocation affects the power consumed (register operands are much cheaper than memory operands).
order of instructions (affects internal switching in the CPU).
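
An instruction-level power model can be as simple as a table lookup. The sketch below assumes a deliberately simplified model: a base cost per instruction plus a fixed overhead whenever two adjacent instructions differ, standing in for internal switching in the CPU. The cost values are made-up units, not measurements of any real CPU, and the reordering is assumed to be legal (no data dependencies between the instructions).

```python
# Hypothetical base energy cost per instruction, in arbitrary units.
BASE_COST = {"mov": 9, "mul": 30}
# Hypothetical overhead when two adjacent instructions are different.
SWITCH_OVERHEAD = 4


def program_energy(instructions):
    """Estimate energy as base costs plus inter-instruction overheads."""
    base = sum(BASE_COST[op] for op in instructions)
    overhead = sum(
        SWITCH_OVERHEAD
        for prev, cur in zip(instructions, instructions[1:])
        if prev != cur
    )
    return base + overhead


interleaved = ["mov", "mul", "mov", "mul"]
grouped = ["mov", "mov", "mul", "mul"]
energy_interleaved = program_energy(interleaved)
energy_grouped = program_energy(grouped)
print(energy_interleaved, energy_grouped)
```

The same four instructions cost less in the grouped order under this model, which is the kind of effect instruction ordering is meant to exploit.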

Further references

A good source of literature on low power design is the International Symposium on Low Power Electronics and
Design (ISLPED), ACM Press; it is an annual event.