S3.0 is a multicore processor with the S2.1 instruction set. Each core
has its own "program" memory, but all cores share the "data" memory. So,
when fetching instructions, there is no memory access conflict (as each
core has its own program memory). Because data memory is shared,
access conflicts can occur there. However, initially, we assume no data
memory access conflict.
All cores have their own hard/soft interrupts. The interrupt vector of
interrupt K (K = 0..3) on core N (N = 0..maxcore) is at M[1000 + 4*N + K].
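The vector address formula can be checked with a few lines of Python. This is only an illustrative sketch: the names VECTOR_BASE, VECTORS_PER_CORE and vector_address are made up here and are not part of the S3.0 tools.

```python
# Illustrative helper, not part of the S3.0 toolchain.
VECTOR_BASE = 1000        # first interrupt vector in data memory
VECTORS_PER_CORE = 4      # K = 0..3

def vector_address(core, k):
    """Return the address of interrupt vector k of the given core."""
    if core < 0:
        raise ValueError("core number must be non-negative")
    if not 0 <= k < VECTORS_PER_CORE:
        raise ValueError("interrupt number must be 0..3")
    return VECTOR_BASE + VECTORS_PER_CORE * core + k
```

For example, interrupt 2 of core 1 is at M[1000 + 4*1 + 2] = M[1006], so each core owns a block of four consecutive vectors.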
When programming a multicore processor, we write the assembly code for all
cores in one file, so it is similar to writing a single-core program. A
special instruction, "cid r1" (core id), returns the core number in the
given register. This instruction is used to tell the cores apart so that
we can assign different cores to execute different parts of the code.
Here is an example:
:main
cid r3 ; core number in r3
eq r2 r3 #0
jt r2 program1
jmp program2
:program1 ; this is run by core 0
....
:program2 ; this is run by core 1
To make the cores work collaboratively we need semaphores (as on a
single core) and the new all-core "sync" instruction, which synchronises
every core. Two further instructions, waitx/sigx, synchronise a pair of
cores (they imitate a semaphore).
How do we block the current process? A processor needs to stop itself
and then be woken up by another process. When a processor stops, it can
only be woken up by an external signal. A new instruction, "waitx",
puts the core into a wait state. It is woken up when another core
sends it an "intercore signal" with the instruction "sigx #c", where c is
the number of the core to wake.
Here is an example of how to synchronise two cores. Count1 runs on core 0 and Count2 runs on core 1. Each iteration of the Count1 loop waits for an external signal. The signal comes from the "sigx #0" in Count2. So, the loop of Count1 is synchronised with the loop of Count2 (each prints its number alternately). We put some delay in Count2 to make sure that Count1 reaches "waitx" before Count2 executes "sigx #0".
.code 0
cid r3
eq r2 r3 #0
jt r2 count1
jmp count2
:count1 ; count 1..5
mov r1 #0
st r1 cnt1
:loop1
waitx ; ****
ld r5 cnt1
add r5 r5 #1
st r5 cnt1
trap r5 #print ; increment cnt1 and print
eq r6 r5 #5
jf r6 loop1
trap r0 #stop
:count2 ; count 11..20
nop
nop
nop
nop ; delay
mov r1 #10
st r1 cnt2
:loop2
sigx #0 ; ****
ld r5 cnt2
add r5 r5 #1
st r5 cnt2
mov r2 #43 ; +
trap r2 #printc
trap r5 #print ; increment cnt2 and print
eq r6 r5 #20
jf r6 loop2
trap r0 #stop
.end
If you run the above program, the output looks like this:
C:\s30\test>sim30 -2 count-sigx.obj
2 cores, load program, last address 30
>g
1 +11 2 +12 3 +13 4 +14 5 +15 core 0 stop, clock 54
+16 +17 +18 +19 +20 core 1 stop, clock 101
>
What happens if "sigx" is sent to a core that is not in "waitx"? The signal is simply ignored. We have to be careful not to send "sigx" to a core that has already stopped, as that core will resume and execute unknown instructions.
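The behaviour of waitx/sigx described above can be pictured with a small Python model. It is a sketch under assumptions, not the simulator's code: the class Cores and its methods are hypothetical, and it assumes that "waitx" and "stop" both clear a per-core run flag and that "sigx" simply sets it again, which is consistent with the text but is not spelled out there.

```python
class Cores:
    """Toy model of core run states (hypothetical, not taken from sim30)."""

    def __init__(self, n):
        self.runflag = [1] * n   # 1 = running, 0 = halted (waiting or stopped)

    def waitx(self, k):
        self.runflag[k] = 0      # core k puts itself into the wait state

    def stop(self, k):
        self.runflag[k] = 0      # assumption: a stopped core looks like a waiting one

    def sigx(self, c):
        self.runflag[c] = 1      # wake core c; no effect if it is already running


def lost_signal_demo():
    """Signal arrives before the target waits, so the signal is lost."""
    cores = Cores(2)
    cores.sigx(0)    # core 0 is still running: nothing changes, nothing is remembered
    cores.waitx(0)   # core 0 now waits and needs another sigx #0 to continue
    return cores.runflag
```

In this model lost_signal_demo() leaves core 0 halted, which is why Count2 in the example above starts with a short delay. The same model shows the second hazard: calling stop(0) and then sigx(0) sets the flag of core 0 back to 1, so a stopped core would run on past its stop.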
To perform all-core synchronisation, each core executes the "sync" instruction, which puts it into a wait state. Once all cores have executed "sync", every core is woken up and continues. If only some cores need to be synchronised, it is better to use "waitx/sigx". Here is an example:
p1()                 p2()
  i = 0                i = 10
  while i < 5          while i < 20
    print i              print i
    i++                  i++
  sync                 sync
  stop                 stop
p1 runs a loop that prints 0..4 and p2 runs a loop that prints 10..19. p1 reaches the end sooner than p2, but it waits there at "sync". When p2 catches up and executes "sync", both cores proceed to stop.
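The same pattern can be tried on an ordinary computer with two threads standing in for the two cores. This is only an analogy: Python's threading.Barrier plays the role of "sync", and the names run_p1_p2, printed and seen_after_sync are invented for this sketch.

```python
import threading

def run_p1_p2():
    """Run p1 and p2 as threads; a two-party barrier stands in for 'sync'."""
    barrier = threading.Barrier(2)
    printed = {"p1": [], "p2": []}     # what each "core" has printed
    seen_after_sync = {}               # progress of both loops as seen after sync

    def worker(name, start, end):
        for i in range(start, end):
            printed[name].append(i)    # stands in for "print i"
        barrier.wait()                 # "sync": wait until the other side arrives
        seen_after_sync[name] = (len(printed["p1"]), len(printed["p2"]))

    threads = [
        threading.Thread(target=worker, args=("p1", 0, 5)),
        threading.Thread(target=worker, args=("p2", 10, 20)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed, seen_after_sync
```

Whichever thread finishes its loop first blocks at the barrier, so after "sync" each side sees that both loops are complete (5 values from p1 and 10 values from p2).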
The implementation of sync uses "runflag[core]" to control the start/stop of cores. Here is the pseudocode (NC is the number of cores; the current core is k):
sync()
  runflag[k] = 0                  // stop this core
  for(a=0, i=0; i<NC; i++)        // check all cores
    if(runflag[i] == 0) a++
  if(a == NC)                     // all cores have stopped
    for(i=0; i<NC; i++)
      runflag[i] = 1              // start all cores
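The pseudocode translates almost line for line into runnable Python, which makes it easy to trace by hand. This is a sketch of the logic only; passing runflag as a parameter is a choice made here for testing and is not how the simulator is written.

```python
def sync(runflag, k):
    """Model of 'sync' executed by core k; runflag has one entry per core."""
    runflag[k] = 0                                   # stop this core
    halted = sum(1 for f in runflag if f == 0)       # check all cores
    if halted == len(runflag):                       # all cores have stopped
        for i in range(len(runflag)):
            runflag[i] = 1                           # start all cores
```

With three cores, calling sync for core 0 and then core 1 leaves runflag as [0, 0, 1]; the call for core 2 finds every flag at 0 and restarts all three.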
The simulator has the following commands:
a - show all cores
g - go
t - single step
b ads - set breakpoint
c n - focus core n
s [rn,mn,pc] v - set
d ads n - dump
r - show register
q - quit
h - help
C:\s30\test>as30 count.txt
C:\s30\test>sim30 -2 count.obj
2 cores, load program, last address 24
>c 0
>g
1 +11 2 +12 3 4 +13 5 +core 0 stop, clock 36
14
>c 1
>g
+15 +16 +17 +18 +19 +20 core 1 stop, clock 87
>
C:\Users\prabhas\Dropbox\s30\test>sim30 -2 count.obj
2 cores, load program, last address 24
>a
mode show all cores
>t
T 1
core 0: PC 0 cid r3
r1:0 r2:0 r3:0 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
core 1: PC 0 cid r3
r1:0 r2:0 r3:1 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
>t
...
>t
T 3
core 0: PC 2 jt r2 4
r1:0 r2:1 r3:0 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
core 1: PC 2 jt r2 4
r1:0 r2:0 r3:1 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
>t
T 4
core 0: PC 4 mov r1 #0
r1:0 r2:1 r3:0 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
core 1: PC 3 jmp 13
r1:0 r2:0 r3:1 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
>t
...
There are new instructions to support collaboration between cores. All of them must be real instructions; they cannot be written as a sequence of other instructions because they control the start/stop of cores.
cid r1
waitx
sigx #c
sync
ei, di, pushm, popm, cid, waitx, sigx, sync.
Finally, interrupts work on all cores.