Control Unit

Example control unit of S1

Hardwired S1

Figure State diagram of S1 control unit

Some improvements can be made to the above design.  To increase the speed, the number of states for each instruction must be reduced.  To reduce the complexity of the circuit, states should be shared wherever possible.

Reduce the number of states
<store>
MAR = IR:ADS
MDR = R[IR:R0]
M[MAR] = MDR ; MWRITE
<storer>
MAR = R[IR:R2]
MDR = R[IR:R1]
M[MAR] = MDR; MWRITE
The above states cannot be merged because both MAR and MDR are on the same internal bus and therefore cannot be loaded at the same time.  If two internal buses are available, then these states can be merged into one (the register bank already has two read ports).
<store>
MAR = IR:ADS; MDR = R[IR:R0]; MWRITE
<storer>
MAR = R[IR:R2]; MDR = R[IR:R1]; MWRITE

Share state
<load>
R[IR:R0] = MDR
<loadr>
R[IR:R2] = MDR
These states can be merged if R0 == R2.  We can do that by changing the opcode format to use fixed-field encoding, moving R2 to the same field as R0.
<add>
R[IR:R2] = T
<inc>
R[IR:R1] = T
These states can be merged if R2 == R1.  We do that by changing the meaning of, say, the "add" instruction from R1 + R2 -> R2 to R1 + R2 -> R1.

Timing of S1 hardwired control unit
load             6
store            6
loadr            6
storer           6
jump             5
jump taken       5
jump notaken     4
move             5
add              5
inc              5
cmp              4
call             9
ret              8

Microprogram control unit for S1

We use a single-format microword.  The fields are as follows:

Dest, Src : specify destination and source for internal bus.
SelR : selecting registers in register file.
Mctl :  memory control for read/write.
ALU : specify function of ALU and latch the result to T register.
Misc : other control signals such as PC + 1.
Cond : condition to test for a jump to another microword.
Goto : next address.

Table S1 microprogram
 
Loc  Label   Dest  Src     SelR    ALU    Mctl  Misc  Cond    Goto    Comment
0    ifetch  MAR   PC
1    w0                                   RD          MRDY    w0
2            IR    MDR                          PC+1  Decode
3    load    MAR   IR:ADS
4    w1                                   RD          MRDY    w1
5            R     MDR     IR:R0                      U       ifetch
6    store   MAR   IR:ADS
7            MDR   R       IR:R0
8    w2                                   WR          MRDY    w2
9                                                     U       ifetch
10   loadr   MAR   R       IR:R1
11   w3                                   RD          MRDY    w3
12           R     MDR     IR:R2                      U       ifetch
13   storer  MAR   R       IR:R2
14           MDR   R       IR:R1
15   w4                                   WR          MRDY    w4
16                                                    U       ifetch
17   mov                   IR:R12  PASS1
18           R     T       IR:R2                      U       ifetch
19   add                   IR:R12  ADD
20           R     T       IR:R1                      U       ifetch
21   cmp                   IR:R12  SUB                U       ifetch  set CC
22   inc                   IR:R12  ADD1
23           R     T       IR:R1                      U       ifetch
24   jmp                                              testCC  ifetch  cc false
25           PC    IR:ADS                             U       ifetch  jump
26   jal     R     PC      IR:R0
27           PC    IR:ADS                             U       ifetch
28   jr      PC    R       IR:R1                      U       ifetch

The memory read/write steps have a "wait for memory ready" state.  Because cache memory is used, one can assume 0 clocks of waiting for memory ready when the cache hits, and a miss penalty of more than 10 clocks.

For example, instruction fetch starts with
            0: MAR = PC
Dest and Src of the internal bus are MAR and PC.  The control unit then waits for memory to fill in MDR.
            1: MDR = M[MAR] ; MREAD
After the memory cycle has completed,
            2: IR = MDR ; PC = PC + 1
then the microprogram branches to the sequence for each instruction, depending on IR:OP and IR:XOP (we will elaborate on this instruction decoding mechanism later).  Suppose the instruction is "load"; the microprogram goes to location 3 (load) and the following sequence occurs :
            3: MAR = IR:ADS
then, after waiting for memory,
            4: MDR = M[MAR]  ; MREAD
            5: R[IR:R0] = MDR
The register is selected by IR:R0, and Dest and Src of the internal bus are R and MDR. After completion, the microprogram branches back to instruction fetch (specified by the next address field).  For an ALU instruction, for example "add", the microprogram goes to location 19 after the instruction fetch and the following sequence occurs :
            19: T = ADD(R[IR:R1], R[IR:R2])
The registers selected by IR:R1 and IR:R2 are read and sent to the ALU, and the ALU function ADD is activated.  The result from the ALU is latched into the T register. Then the result is written back to the register selected by IR:R1 and the microprogram branches back to the instruction fetch.
            20: R[IR:R1] = T
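
The sequences above can be traced with a small register-transfer sketch.  This is only an illustration: the dictionary holding the machine state, the eight-entry register file, the already-decoded IR fields and the sample memory contents are all assumptions made for the example, not part of the S1 definition.

```python
# Register-transfer sketch of the microsteps for ifetch, load and add.
# Assumptions for illustration: instructions are stored pre-decoded as
# dicts of IR fields, and the register file has 8 entries.

def ifetch(s):
    s["MAR"] = s["PC"]                  # 0: MAR = PC
    s["MDR"] = s["M"][s["MAR"]]         # 1: MDR = M[MAR] ; MREAD
    s["IR"] = s["MDR"]                  # 2: IR = MDR ; PC = PC + 1
    s["PC"] += 1

def load(s):
    ir = s["IR"]
    s["MAR"] = ir["ADS"]                # 3: MAR = IR:ADS
    s["MDR"] = s["M"][s["MAR"]]         # 4: MDR = M[MAR] ; MREAD
    s["R"][ir["R0"]] = s["MDR"]         # 5: R[IR:R0] = MDR

def add(s):
    ir = s["IR"]
    s["T"] = s["R"][ir["R1"]] + s["R"][ir["R2"]]   # 19: T = ADD(R[IR:R1], R[IR:R2])
    s["R"][ir["R1"]] = s["T"]                      # 20: R[IR:R1] = T

SEQUENCE = {"load": load, "add": add}   # stands in for the Decode branch

state = {
    "PC": 0, "MAR": 0, "MDR": 0, "IR": None, "T": 0,
    "R": [0] * 8,
    "M": {
        0: {"OP": "load", "R0": 1, "ADS": 100},
        1: {"OP": "load", "R0": 2, "ADS": 101},
        2: {"OP": "add", "R1": 1, "R2": 2},
        100: 7,
        101: 5,
    },
}

for _ in range(3):                      # run the three sample instructions
    ifetch(state)
    SEQUENCE[state["IR"]["OP"]](state)
```

After the three instructions, R1 holds the sum of the two loaded values and PC points past the last instruction.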

In total the microprogram is 29 words.  Each microword is in fact composed of the control bits that control the signals in the datapath.  We will assign the bits to each field of the microword as follows :
bit 0..4         Dest : 5 bits for write to R, PC, IR, MAR, MDR.
bit 5..10       Src : 6 bits for read from R, PC, IR, MAR, MDR, T.
bit 11..14     SelR : 4 bits for selecting IR:R0, IR:R1, IR:R2, IR:R1,R2 (written IR:R12 in the table).
bit 15..18     ALU : 4 bits for ALU function : PASS1, ADD, SUB, ADD 1.
bit 19..20     Mctl : 2 bits for Mread, Mwrite.
bit 21           Misc : 1 bit for PC + 1.
bit 22..25     Cond : 4 bits for jump control : Uncond, Mrdy, testCC, Decode.
bit 26..30     Goto : 5 bits,  micro store has 29 addresses therefore 5 bits to address each of them.

So the unencoded microword for S1 is 31 bits long.  The instruction decoding, which branches to the microprogram sequence for each instruction, can be achieved by using IR:OP concatenated with IR:XOP (3 bits and 4 bits) to point to a jump table which contains the location of the microword in the microprogram.
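
A minimal sketch of this jump table follows.  The microprogram locations are taken from the table above, but the opcode numbers, the use of OP = 7 as the escape to XOP, and the filling of all 16 XOP slots for the long-format instructions (on the assumption that their XOP bits carry other data) are hypothetical choices made only for the example.

```python
# Jump-table decode sketch: index = IR:OP (3 bits) concatenated with IR:XOP (4 bits).
OP_BITS, XOP_BITS = 3, 4

# Entry points of each instruction in the microprogram (from the table).
LOC = {"load": 3, "store": 6, "loadr": 10, "storer": 13, "mov": 17,
       "add": 19, "cmp": 21, "inc": 22, "jmp": 24, "jal": 26, "jr": 28}

# Hypothetical opcode assignment, not the real S1 encoding.
LONG_OPS = {0: "load", 1: "store", 2: "jmp", 3: "jal"}
EXT_OP = 7
XOPS = {0: "add", 1: "cmp", 2: "inc", 3: "loadr", 4: "storer", 5: "mov", 6: "jr"}

def decode_index(op, xop):
    """Concatenate OP and XOP into a 7-bit jump-table index."""
    return (op << XOP_BITS) | xop

def build_jump_table():
    table = [0] * (1 << (OP_BITS + XOP_BITS))   # unused entries go to ifetch (0)
    for op, name in LONG_OPS.items():
        # Assumed: XOP bits are not an opcode here, so every slot maps
        # to the same microprogram location.
        for xop in range(1 << XOP_BITS):
            table[decode_index(op, xop)] = LOC[name]
    for xop, name in XOPS.items():
        table[decode_index(EXT_OP, xop)] = LOC[name]
    return table

JUMP_TABLE = build_jump_table()
```

With this layout the Decode condition in microword 2 reduces to one table lookup, JUMP_TABLE[decode_index(op, xop)], at the cost of a 128-entry table.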

To reduce the width of the microword, each field can be "encoded" as follows :
Dest :     5 signals, 3 bits.
Src :        6 signals, 3 bits.
SelR :     4 signals, 3 bits (including NONE)
ALU :    4 signals, 3 bits.
Mctl :     2 bits
Misc :     1 bit.
Cond :    4 signals, 3 bits
Goto :     only 6 distinct locations to jump to : ifetch, w0, w1, w2, w3, w4 hence 3 bits.
In total, the encoded or vertical microword for S1 is 21 bits long.
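
The two widths can be checked by adding up the field widths listed above.  The sketch below does only that arithmetic; the function names are made up for the example.

```python
# Field widths in bits, in microword order, as listed in the text.
UNENCODED = {"Dest": 5, "Src": 6, "SelR": 4, "ALU": 4,
             "Mctl": 2, "Misc": 1, "Cond": 4, "Goto": 5}
ENCODED = {"Dest": 3, "Src": 3, "SelR": 3, "ALU": 3,
           "Mctl": 2, "Misc": 1, "Cond": 3, "Goto": 3}

MICROSTORE_WORDS = 29

def width(fields):
    """Total microword width in bits."""
    return sum(fields.values())

def bit_ranges(fields):
    """Assign consecutive bit positions (low, high) to each field, starting at bit 0."""
    ranges, low = {}, 0
    for name, bits in fields.items():
        ranges[name] = (low, low + bits - 1)
        low += bits
    return ranges

unencoded_width = width(UNENCODED)                 # 31 bits
encoded_width = width(ENCODED)                     # 21 bits
store_bits_unencoded = MICROSTORE_WORDS * unencoded_width
store_bits_encoded = MICROSTORE_WORDS * encoded_width
```

Encoding saves 10 bits per microword; the price is the decoders needed to turn each encoded field back into individual control signals.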

Figure Scheme for decoding opcode in ifetch

Figure Comparing unencoded and encoded microword for S1

Timing for microprogrammed S1
load              6
store             7
loadr             6
storer            7
jump uncond       5
jump taken        5
jump nottaken     4
move              5
add               5
inc               5
cmp               4
jal               5
jr                4
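
These clock counts can be reproduced by walking the Cond column of the microprogram table, counting one clock per microword and assuming zero wait clocks at each MRDY step (a cache hit).  The sketch below is one way to do this; the list encoding of the table and the function names are assumptions for the example.

```python
# Cond column of the 29 microwords, indexed by location ("" = fall through).
COND = ["", "MRDY", "Decode",        # 0-2   ifetch
        "", "MRDY", "U",             # 3-5   load
        "", "", "MRDY", "U",         # 6-9   store
        "", "MRDY", "U",             # 10-12 loadr
        "", "", "MRDY", "U",         # 13-16 storer
        "", "U",                     # 17-18 mov
        "", "U",                     # 19-20 add
        "U",                         # 21    cmp
        "", "U",                     # 22-23 inc
        "testCC", "U",               # 24-25 jmp
        "", "U",                     # 26-27 jal
        "U"]                         # 28    jr

START = {"load": 3, "store": 6, "loadr": 10, "storer": 13, "mov": 17,
         "add": 19, "cmp": 21, "inc": 22, "jmp": 24, "jal": 26, "jr": 28}

def run(loc, cc=True, wait=0):
    """Count clocks from microword loc until control leaves the sequence."""
    clocks = 0
    while True:
        cond = COND[loc]
        clocks += 1
        if cond == "MRDY":
            clocks += wait               # extra clocks spent looping on wN
        if cond in ("U", "Decode"):
            return clocks                # branch to ifetch or to the decoded entry
        if cond == "testCC" and not cc:
            return clocks                # condition false: back to ifetch
        loc += 1

def clocks(name, cc=True, wait=0):
    """Clocks for one instruction: instruction fetch plus its own sequence."""
    return run(0, wait=wait) + run(START[name], cc, wait)

timing = {name: clocks(name) for name in START}
jump_not_taken = clocks("jmp", cc=False)
```

The wait parameter models the miss penalty: each memory step in the sequence, including the one in ifetch, adds that many clocks.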