experiment with multiplication function // using shift and double add // d is a * 2^n, accumulate is s // s partial sum, d double to mul3 a b | aa bb s c d = // if a == 0 | b == 0 // 0 break // if a < 0 aa = 0 - a else aa = a // if b < 0 bb = 0 - b else bb = b // if aa < bb // c = a // a = b // b = c // swap larger to a s = 0 d = a if b < 0 c = 0-b else c = b // c must be positive while c > 0 if c & 1 s = s + d d = d + d c = c >> 1 if b < 0 s = 0-s // conserve sign s print mul3 (0-300) 20 microprogram activated <-6000> bye St 5 >> 5 + 7 - 1 & 5 < 2 <= 6 Ret 2 Retv 1 Halt 1 Get 36 Put 15 Sys 1 Lit 27 Jmp 1 Jf 13 Call 3 total 131 instruction 1763 cycles read CS 277 times DS read 103 write 92 total 195 using larger swap is slower <-6000> bye St 5 >> 5 + 7 - 2 & 5 | 1 = 2 < 5 <= 6 Ret 2 Retv 1 Halt 1 Get 44 Put 17 Sys 1 Lit 32 Jmp 2 Jf 17 Call 3 total 158 instruction 2119 cycles read CS 334 times DS read 124 write 107 total 231 // constants used to mask bit //cc = 32768 << 16 // 0x10000000 //cd = cc ^ (0-1) // b1 = 1 // booth algorithm // using r to build up result instead of // shifting a into q. this eliminate cc and cd to mul m q | a a0 q0 qm i r b1 = b1 = 1 r = 0 a = 0 qm = 0 for i 1 16 // for 16-bit smc // printc 33 print q nl q0 = q & 1 if q0 & (!qm) a = a - m if (!q0) & qm a = a + m a0 = a & 1 a = a >> 1 qm = q0 q = q >> 1 // if a0 q = q | cc else q = q & cd if a0 r = r | b1 b1 = b1 << 1 // q r print mul (-300) 20 Booth's algorithm is not really faster in practice (due to complexity in writing low level bit manipulation) microprogram activated <-6000> bye St 4 ! 32 >> 32 << 16 ++ 16 + 2 - 3 & 64 | 6 <= 17 Ret 2 Retv 1 Halt 1 Get 248 Put 128 Sys 1 Lit 49 Jmp 1 Jt 17 Jf 48 Call 3 total 691 instruction 8680 cycles read CS 1310 times DS read 549 write 438 total 987 mul3 131 inst 1764 clk mul 691 inst 8680 clk of course, the total number of instruction does not varied much with the arguments in booth's algorithm as it always shifts a,q 16 bits. the number of iteration in mul3() is ln b which the largest is also 16 times (16-bit). the stat shown is for multiplicant 20 which is 5 bits. 1 June 2004 P. Chongstitvatana