ISA
Instruction Set Architecture of several processors

IAS

  1000 storage locations of 40 bits
  number: sign and 39-bit value
  instruction format: two instructions in one word
    op:8 address:12 op:8 address:12   (40 bits)

  micro-architecture
    ALU, MQ, AC, MBR
    control unit, IBR, PC, IR, MAR
    main memory, i/o
    intermediate result in AC:MQ (80 bits)

  Instruction set (21 instructions)

  data transfer
    LOAD MQ            AC = MQ
    LOAD MQ,M(X)       MQ = M(X) (memory)
    STOR M(X)          M(X) (memory) = AC
    LOAD M(X)          AC = M(X)
    LOAD -M(X)         AC = -M(X)
    LOAD |M(X)|        AC = |M(X)|
    LOAD -|M(X)|       AC = -|M(X)|

  branch
    JUMP M(X,0:19)     goto left half of M(X)
    JUMP M(X,20:39)    goto right half of M(X)

  branch condition
    JUMP+M(X,0:19)     if AC is nonnegative, goto left half of M(X)
    JUMP+M(X,20:39)    if AC is nonnegative, goto right half of M(X)

  arithmetic
    ADD M(X)           AC = AC + M(X)
    ADD |M(X)|
    SUB M(X)
    SUB |M(X)|
    MUL M(X)           AC:MQ = M(X) x MQ
    DIV M(X)           MQ = AC / M(X), rem in AC
    LSH                shift left one bit
    RSH                shift right one bit

  address modify
    STOR M(X,8:19)     left address field of M(X) = rightmost 12 bits of AC
    STOR M(X,28:39)    right address field of M(X) = rightmost 12 bits of AC

  micro-steps

  fetch
    if next instruction is not in IBR then
      MAR = PC
      MBR = M(MAR)
      if left instruction required then
        IBR = MBR(20:39)
        IR = MBR(0:7)
        MAR = MBR(8:19)
      else
        IR = MBR(20:27)
        MAR = MBR(28:39)
        PC = PC + 1
    else
      IR = IBR(0:7)
      MAR = IBR(8:19)
      PC = PC + 1

  execute
    LOAD M(X)          MBR = M(MAR)
                       AC = MBR
    JUMP M(X)          PC = MAR
    JUMP+M(X)          if AC >= 0 then PC = MAR
    ADD M(X)           MBR = M(MAR)
                       AC = AC + MBR
    ...

  Multiplication takes 39 steps, one for each bit except the sign bit.

PDP 8

  fixed 12-bit instruction
  mem ref: 7 bits plus 1 bit modifier
  mem pages of 2^7 = 128 words (page 0 or current)
  one accumulator

    op:3 i:1 z:1 disp:7         mem ref
    op:3 device:6 opcode:3      i/o instruction
    op:12                       reg ref

    i    indirect address
    z    page 0

  instruction set (8 instructions)
    AND    ac = ac and opr
    TAD    two's complement add (13-bit)
    ISZ    increment and skip if zero
    DCA    deposit ac in mem and clear ac
    JMS    jump to subroutine
    JMP    jump
    IOT    i/o transfer
    OPR    microcoded op.
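
The PDP 8 memory-reference format above (op:3 i:1 z:1 disp:7) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names and the `memory` mapping are hypothetical, the polarity of the z bit (0 selects page 0, 1 selects the current page) is assumed, `pc` is taken to be the address of the instruction itself, and auto-indexing is ignored.

```python
# Mnemonics in opcode order 0..7, in the order the instruction set is listed.
MNEMONICS = ("AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR")

PAGE_MASK = 0o7600   # upper 5 bits of a 12-bit address select the page
DISP_MASK = 0o0177   # lower 7 bits: offset within a 128-word page


def decode_mem_ref(word):
    """Split a 12-bit word into the fields op:3 i:1 z:1 disp:7."""
    word &= 0o7777
    return {
        "op": (word >> 9) & 0o7,
        "i": (word >> 8) & 1,
        "z": (word >> 7) & 1,
        "disp": word & DISP_MASK,
    }


def effective_address(word, pc, memory):
    """Effective address of a memory-reference instruction (op 0..5).

    memory is any mapping from address to 12-bit word (hypothetical helper
    argument, used only when the indirect bit is set).
    """
    f = decode_mem_ref(word)
    # Assumed polarity: z = 0 selects page 0, z = 1 the current page.
    if f["z"] == 0:
        ea = f["disp"]
    else:
        ea = (pc & PAGE_MASK) | f["disp"]
    # i = 1: the word found at ea holds the real operand address.
    if f["i"]:
        ea = memory[ea] & 0o7777
    return ea


if __name__ == "__main__":
    word = 0o1234                     # op = 1, i = 0, z = 1, disp = 0o34
    print(MNEMONICS[decode_mem_ref(word)["op"]])
    print(oct(effective_address(word, 0o200, {})))
```

With 7 displacement bits an instruction reaches only 128 words directly, which is why the indirect bit is needed to address the rest of memory.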
  microinstruction OPR (can be combined)
  extended code (prefix 7)
  (L is one bit prefixed to AC)

    CLA    clear ac
    CLL    clear link
    CMA    complement ac
    CML    complement link
    RAR    rotate L:ac right
    RAL    rotate L:ac left
    RTR    rotate L:ac right twice
    RTL    rotate L:ac left twice
    BSW    byte swap
    IAC    increment L:ac

    SMA    skip on minus ac
    SZA    skip on zero ac
    SNL    skip on nonzero link
    SKP    skip unconditional
    SPA    skip on ac >= 0
    SNA    skip on ac != 0
    SZL    skip on L = 0
    CLA    clear ac
    OSR    or switch with ac
    HLT    halt

    MQA    multiplier quotient into ac
    MQL    multiplier quotient load

PDP 10

  fixed 36-bit instruction
  16 general purpose registers
    op:9 reg:4 i:1 rx:4 ads:18

PDP 11

  16-bit instruction
  8 general purpose registers (..., sp, pc)
  13 instruction formats
    op:4 src:6 des:6
    op:7 r:3 src:6
    op:8 offset:8
    op:8 fp:2 des:6
    op:10 des:6
    op:12 cc:4
    op:13 r:3
    op:16
    op:4 src:6 des:6, ads:16
    op:7 r:3 src:6, ads:16
    op:8 fp:2 src:6, ads:16
    op:10 des:6, ads:16
    op:4 src:6 des:6, ads1:16, ads2:16

  src/des: 3-bit address mode, 3-bit reg num.

VAX

  A 1-2 byte opcode and 0-6 operand specifiers give instructions that are
  1-37 bytes long.

IBM S/360

  instruction format
    op:8 r1:4 r2:4                            RR format
    op:8 r1:4 x2:4 b2:4 d2:12                 RX format
    op:8 r1:4 r3:4 b2:4 d2:12                 RS format
    op:8 i2:8 b1:4 d1:12                      SI format
    op:8 l1:4 l2:4 b1:4 d1:12 b2:4 d2:12      SS format

IBM S/390

    L      load mem to reg
    LH     load half word
    LR     load register-register
    LER    load float-reg to float-reg
    LE     load mem to float-reg
    LDR    load float-reg to float-reg (long)
    LD     load mem to float-reg (long)
    ST     store reg to mem
    STH    store halfword
    STC    store character
    STE    store float-reg to mem
    STD    store float-reg to mem (long)

    add, sub, mul, div
    AND, OR, XOR, NOT, EQUAL
    TR R1,R2,L     translate

    BR X           branch
    BRP X          branch to X if positive
    BRN X          negative
    BRZ X          zero
    BRO X          overflow
    BRE R1,R2,X    branch to X if R1 = R2
    ISZ R1         increment and skip next inst. if zero
    call ?
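
The name:width notation used for the formats in these notes is regular enough to drive a small field extractor. The Python sketch below is an illustration only: the function name `unpack` is hypothetical, and it assumes that fields are listed most significant first and that the widths add up to the instruction length.

```python
def unpack(fmt, word):
    """Split `word` into the fields of `fmt`, e.g. "op:8 r1:4 r2:4".

    Assumes the first field in `fmt` occupies the most significant bits.
    Field names must be unique within one format string.
    """
    fields = [(name, int(width))
              for name, width in (f.split(":") for f in fmt.split())]
    shift = sum(width for _, width in fields)   # total instruction length
    out = {}
    for name, width in fields:
        shift -= width                          # position of this field
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out


if __name__ == "__main__":
    # S/360 RX format, 32 bits
    print(unpack("op:8 r1:4 x2:4 b2:4 d2:12", 0x5A10C008))
    # PDP 11 two-operand format, 16 bits
    print(unpack("op:4 src:6 des:6", 0o010102))
```

The same call works for any of the single-word formats listed in these notes, as long as each field name appears once in the format string.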

Pentium

  instruction format (sizes in bytes)
    prefixes:     0/1 inst. prefix, 0/1 seg. override,
                  0/1 operand size ovr, 0/1 address size ovr
    instruction:  0-4 inst. prefix, 1/2 opcode, 0/1 mod-r/m, 0/1 sib,
                  0-4 disp, 0-4 imm

    mod-r/m = mod:2 reg/op:3 r/m:3
    sib     = scale:2 index:3 base:3

  instruction prefix
    LOCK, REP, REPE, REPZ, REPNE, REPNZ (use CX)

  addressing mode
    immediate
    register
    displacement               EA = (SR) + A
    base                       EA = (SR) + (B)
    base with displacement     EA = (SR) + (B) + A
    scaled index with disp     EA = (SR) + (I) x S + A
    base with idx and disp     EA = (SR) + (B) + (I) + A
    base scaled idx disp       EA = (SR) + (I) x S + (B) + A
    relative                   EA = (PC) + A

    EA     effective address
    (X)    content of X
    SR     segment reg
    PC     program counter
    A      address field
    B      base reg
    I      index reg
    S      scaling factor

  data movement
    MOV            reg-reg, reg-mem
    PUSH           push to stack
    PUSHA          push all registers
    MOVSX          mov with sign extension
    LEA            load effective address
    XLAT           table lookup
    IN, OUT

  arithmetic
    ADD SUB MUL IDIV

  logical
    AND
    BTS            bit test and set
    BSF            bit scan forward, no. of first 1-bit
    SHL/SHR
    SAL/SAR
    ROL/ROR
    SETcc          set a byte to 0/1 on cond. flags

  control
    JMP
    CALL
    JE/JZ
    LOOPE/LOOPZ    dec ECX and test for jump
    INT/INTO       interrupt (overflow)

  string
    MOVS           move byte (index by ESI, EDI)
    LODS           load string

  HLL support
    ENTER          create a stack frame
    LEAVE          reverse the action of ENTER
    BOUND          check array bound

  flags
    STC            set carry flag
    LAHF           load A reg from flags

  segment reg
    LDS            load pointer to D segment reg.

    HLT
    LOCK           hold shared memory
    ESC            processor extension escape, for coprocessor
    WAIT           wait until BUSY# negated (pin)

  protection
    SGDT           store global descriptor table
    LSL            load segment limit
    VERR/VERW      verify segment for read/write

  cache management
    INVD           flushes internal cache
    WBINVD         flushes internal cache after write
    INVLPG         invalidate a TLB entry

PowerPC

  load/store
  instruction format
    32-bit instruction (with extension)
    op:6 imm:24 A:1 L:1                  branch
    op:6 opt:5 cr:5 disp:14 A:1 L:1      branch
    op:6 opt:5 cr:5 indir:15 L:1         branch
    op:6 r1:5 rb:5 disp:16               ld/st
    op:6 r1:5 rb:5 rx:5 s:10 u:1         ld/st
    op:6 r1:5 r2:5 disp:16               ld/st
    op:6 r1:5 r2:5 r3:5 xop:11           alu
    op:6 r1:5 r2:5 im:16                 alu

    L      link, ea placed in link reg
    A      absolute/relative-PC
    opt    conditional branch: count zero, true, false
    cr     test bit in condition reg

  addressing mode
    load/store
      indirect        EA = (B) + D
      ind. indexed    EA = (B) + (I)
    branch
      absolute        EA = IM
      relative        EA = (PC) + IM
      indirect        EA = (L/CR)

    EA      effective address
    (X)     content of X
    B       base reg
    I       index reg
    L/CR    link/count reg
    IM      immediate value
    PC      program counter

  branch
    b         branch
    bl        branch and link
    bc        branch condition
    sc        system call
    trap      compare two operands and invoke system trap

  load/store
    lwzu      load word and zero extend, update source reg
    ld        load doubleword
    lmw       load multiple word (through R31)
    lswx      load string to reg

  arithmetic
    add
    subf
    mullw     multiply low-order 32-bit (result 64-bit)
    divd

  logical
    cmp       compare and set cond. reg.
    crand     cond. reg. and
    and
    cntlzd    count no. of consecutive leading 0 bits
    rldic     rotate left doubleword and AND with mask
    sld       shift left

  float
    lfs       load 32-bit float-reg from mem
    fadd      float add
    fmadd     float multiply and add
    fcmpu     float compare

  cache management
    dcbf      data cache block flush
    icbi      instruction cache block invalidate

MIPS R4000

  32 64-bit registers
  128 Kbytes cache (64K inst., 64K data)

  instruction format
    op:6 rs:5 rt:5 imm:16                immediate
    op:6 ads:26                          jump
    op:6 rs:5 rt:5 rd:5 shift:5 xop:6    reg

    rs       source reg
    rt       source/destination reg
    imm      immediate
    ads      jump address
    rd       destination reg
    shift    amount of shift

  instruction set

  load/store
    LB        load byte
    LBU       load byte unsigned
    LH        load halfword
    LHU       load halfword unsigned
    LW        load word
    LWL       load word left
    LWR       load word right
    SB        store byte
    SH
    SW
    SWL
    SWR

  arithmetic
    ADDI      add imm
    ADDIU     add imm unsigned
    SLTI      set less than imm
    SLTIU     set less than imm unsigned
    ANDI      and imm
    ORI
    XORI
    LUI       load upper imm

    ADD       add
    ADDU      add unsigned
    SUB
    SUBU
    SLT       set less than
    SLTU
    AND
    OR
    XOR
    NOR

    SLL       shift left logical
    SRL
    SRA       shift right arithmetic
    SLLV      shift left logical variable
    SRLV
    SRAV

    MULT      multiply
    MULTU     multiply unsigned
    DIV
    DIVU
    MFHI      move from HI
    MTHI      move to HI
    MFLO      move from LO
    MTLO      move to LO

    J         jump
    JAL       jump and link
    JR        jump to reg
    JALR
    BEQ       branch on equal
    BNE
    BLEZ
    BGTZ
    BGEZ
    BLTZAL    branch on less than zero and link
    BGEZAL

  special
    SYSCALL
    BREAK

SPARC

  register windows of 24 registers each (2-32 windows)
  r0-r7 are global registers (shared by all procedures)
  each procedure sees r0-r31: r24-r31 are "ins", r16-r23 are "locals",
  r8-r15 are "outs".
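
The window layout just described can be made concrete with a small Python sketch that maps a logical register number to a slot in the physical register file. The layout is an assumption chosen for illustration, not a statement about any particular SPARC chip: the function name is hypothetical, the windowed registers are modelled as a circular file of 16 x nwindows entries placed after the 8 globals, and SAVE is assumed to decrement the current window pointer (cwp) so that the caller's "outs" become the callee's "ins".

```python
def physical_register(r, cwp, nwindows):
    """Map logical register r (0..31) in window cwp to a physical index.

    Assumed layout: indices 0..7 hold the globals; the windowed registers
    follow in a circular file of 16 * nwindows entries. Window w keeps its
    outs and locals in its own 16 entries and borrows its ins from the
    outs of window w + 1.
    """
    if not 0 <= r <= 31:
        raise ValueError("logical register must be in 0..31")
    if not 0 <= cwp < nwindows:
        raise ValueError("cwp must be in 0..nwindows-1")
    if r < 8:
        return r                                # r0-r7: globals
    windowed = 16 * nwindows                    # size of the circular file
    # r8-r15 (outs) -> offsets 0..7, r16-r23 (locals) -> 8..15,
    # r24-r31 (ins) -> 16..23, which overlaps the next window's outs.
    return 8 + (16 * cwp + (r - 8)) % windowed


if __name__ == "__main__":
    caller, callee = 3, 2                       # SAVE assumed to decrement cwp
    print(physical_register(8, caller, 8))      # caller's first "out"
    print(physical_register(24, callee, 8))     # callee's first "in"
```

Both calls in the usage example return the same index, which is the point of the overlap: arguments are passed without copying. Each window still exposes 24 registers, but only 16 physical registers are added per window.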

  instruction format
    op:2 disp:30                                call
    op:2 a:1 cond:4 op2:3 disp:22
    op:2 des:5 op2:3 imm:22
    op:2 des:5 op3:6 src1:5 0:1 u:8 src2:5
    op:2 des:5 op3:6 src1:5 1:1 im:13

  addressing mode
    EA = (R_S1) + S2
    EA = (R_S1) + (R_S2)

  synthesis mode
    EA = A        <=  R_0 + S2       direct
    EA = R        <=  R_S1, R_S2     register
    EA = (R)      <=  R_S1 + 0       load/store (reg indir)
    EA = (R)+A    <=  R_S1 + S2      load/store

  instruction set

  load/store
    LDSB       load signed byte
    LDSH       load signed halfword
    LDUB
    LDUH
    LD         load word
    LDD        load doubleword
    STB
    STH
    ST
    STD

  arithmetic
    SLL        shift left logical
    SRL
    SRA        shift right arithmetic
    AND        and
    ANDCC      and, set cc
    ANDN       and not
    ANDNCC     and not, set cc
    OR
    ORCC
    ORN        or not
    ORNCC
    XOR
    XORCC
    XNOR       exclusive nor
    XNORCC
    ADD        add
    ADDCC      add, set cc
    ADDX       add with carry
    ADDXCC
    SUB
    SUBCC
    SUBX
    SUBXCC
    MULSCC     multiply step, set cc

  branch
    BCC        branch on condition
    FBCC       br on float cond.
    CBCC       br on coprocessor
    CALL       call
    JMPL       jump and link
    TCC        trap on condition
    SAVE       advance reg window
    RESTORE    move reg window backward
    RETT       return from trap

  misc
    SETHI      set high 22 bits
    UNIMP      unimplemented (trap)
    RD         read a special reg
    WR         write a special reg
    IFLUSH     inst. cache flush

Itanium

  128 64-bit registers
  128 82-bit float/graphic reg.
  64 1-bit predicate reg.

  instruction format
    41-bit instruction
      op:4 mod:10 r3:7 r2:7 r1:7 pr:6
    instruction bundle (128 bits)
      inst2:41 inst1:41 inst0:41 template:5
    template maps instX to functional units

ARM

  32-bit data path
  16 32-bit registers
  all instructions are conditional

  instruction format
    cond:4 op:8 rn:4 rd:4 rot:4 imm:8
    cond:4 op:8 rn:4 rd:4 shfim:5 shf:2 0:1 rm:4
    cond:4 op:8 rn:4 rd:4 imm:12
    cond:4 op:4 ads:24

  conditional
    EQ, NE            equal, not equal
    CS/HS, CC/LO      carry set / higher or same, carry clear / lower
    MI, PL            minus, plus
    VS, VC            overflow, no overflow
    HI, LS            higher, lower or same
    GE, LT, GT, LE    signed greater than ...
    AL, NV            always, never

  addressing mode
    shifter operand, 3 formats:
      1) immediate           rot:4 imm:8   imm rotated by an even number
                                           of bits (0..30)
      2) register
      3) shifted register    shf:2 rm:4    reg operand is shifted:
                                           ASR, LSL, LSR, ROR, RRX
         ex: add r9, r5, r5, lsl #3

    immediate offset       12-bit offset
    register offset        rn +/- rm
    scaled reg ofs         rn +/- rm shf #shfim
    imm pre-indexed        rn +/- 12-bit offset
    reg pre-indexed        rn +/- rm
    scaled reg pre-idx     rn +/- rm shf #shfim
    imm post-idx           rn +/- 12-bit offset
    reg post-idx           rn +/- rm
    scaled reg post-idx    rn +/- rm shf #shfim

  instruction set

  load/store
    LDR      load word
    LDRB     load byte
    LDRBT    load byte with user mode
    LDRT     load word with user mode
    LDRH     load unsigned halfword
    LDRSB    load signed byte
    LDRSH    load signed halfword
    LDM      load multiple (reg)
    STR      store word
    STRB     store byte
    STRBT    store byte with user mode
    STRT     store word with user mode
    STRH     store halfword
    STM      store multiple (reg)

    B        branch
    BL       branch and link
    BX       branch and exchange

    MOV      move (shift)
    MVN      move not
    ADD      add
    ADC      add with carry
    SUB
    SBC
    RSB      reverse subtract
    RSC
    AND
    EOR      exclusive or
    ORR
    BIC      bit clear
    CMP      compare
    CMN      compare negated
    TST      test
    TEQ      test equivalence

    MLA      multiply accumulate
    MUL      multiply
    SMLAL    signed mul acc long
    SMULL
    UMLAL    unsigned mul acc long
    UMULL

Thumb

  32-bit data path
  8 32-bit registers
  16-bit instructions
  Thumb state is entered by the ARM BX instruction. Thumb allows compact code.

  instruction format
    op:5 im:5 rm:3 rd:3
    op:7 rm:3 rn:3 rd:3
    op:7 im:3 rn:3 rd:3
    op:5 r:3 im:8
    op:10 r1:3 r2:3
    op:4 cond:4 ads:8
    op:5 ads:11

  addressing mode
    immediate offset
    register ofs
    pc-relative
    sp-relative

  instruction set

  load/store
    LDR     load word
    LDRB    load unsigned byte
    LDRH    load unsigned halfword
    LDM     load multiple
    STR     store word
    STRB    store unsigned byte
    STRH    store unsigned halfword
    STM     store multiple
    POP     pop multiple
    PUSH    push multiple

    B       branch (cond)
    BL      branch and link
    BX      branch and exchange

    MOV     move
    MVN     move not
    ADD     add
    ADC     add with carry
    SUB
    SBC
    NEG     negate
    AND
    EOR
    ORR
    BIC     bit clear
    CMP     compare
    CMN     compare negated
    TST     test
    MUL     multiply
    LSL     logical shift left
    LSR     logical shift right
    ASR     arithmetic shift right
    ROR     rotate right

References

  Flynn, M., Johnson, J. and Wakefield, S., "On instruction sets and their
  formats", IEEE Trans. on Computers, March 1985.

  Mirapuri, S., Woodacre, M. and Vasseghi, N., "MIPS R4000 processor",
  IEEE Micro, April 1992.

  Colwell, R., Hitchcock, C., Jensen, E., Sprunt, H. and Kollar, C.,
  "Computers, Complexity, and Controversy", Computer, September 1985.

  Colwell, R., Hitchcock, C., Jensen, E. and Sprunt, H., "More controversy
  about: Computers, Complexity, and Controversy", Computer, December 1985.
  (RISC vs CISC)

  Jaggar, D. (ed.), Advanced RISC Machines Architectural Reference Manual,
  Prentice Hall, 1996.

  Hill, M., Jouppi, N. and Sohi, G., Readings in Computer Architecture,
  Morgan Kaufmann Pub., 2000.

  Jones, D., PDP-8 Frequently Asked Questions,
  http://www.faqs.org/faqs/dec-faq/pdp8/, 2001.

last update 27 September 2003
P. Chongstitvatana