S3.0 is a multicore processor with the S2.1 instruction set. Each core
has its own "program" memory, but all cores share the "data" memory. So,
when fetching instructions, there is no memory access conflict (as each
core has its own program memory). Because data memory is shared,
access conflicts can occur there. However, initially, we assume no data
memory access conflict.
All cores have their own hard/soft interrupt. The interrupt vector of core
N is at M[1000+N].
When programming a multicore processor, we write the assembly code for all
cores in one file, so it is similar to writing a single-core program. A
special instruction, "cid r1" (core id), returns the core number. This
instruction is used to tell the cores apart so that we can assign different
cores to execute different parts of the code. Here is an example:
:main
cid r3 ;; core number in r3
eq r2 r3 #0
jt r2 program1
jmp program2
:program1 ;; this is run by core 0
....
:program2 ;; this is run by core 1
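The dispatch above can be sketched in Python to make the control flow explicit. This is only an illustration: the function name dispatch is hypothetical, and the registers are modelled as plain variables.

```python
def dispatch(core_id):
    """Model of the :main sequence; returns the label the core jumps to."""
    r3 = core_id                    # cid r3
    r2 = 1 if r3 == 0 else 0        # eq r2 r3 #0
    if r2:                          # jt r2 program1
        return "program1"
    return "program2"               # jmp program2


print(dispatch(0))
print(dispatch(1))
```

Note that with this test every core other than core 0 falls through to program2, so on a machine with more than two cores a further comparison on r3 would be needed to separate them.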
To make the cores work together we need semaphores (as on a single
core) and the new all-core "sync" instruction.
How do we block the current process? A processor needs to stop itself,
but then how does it "resume"? When a processor stops, it can only be
woken up by an external signal, such as an interrupt.
A new instruction, "wait for interrupt" (wfi), behaves as if the core
interrupts itself and then goes into a sleep state:
R[31] = next PC, stop execution
When an interrupt occurs, the core behaves as if it returns from an
interrupt and continues execution:
PC = R[31], continue execution
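The two register transfers can be sketched as a small Python model. It is a toy under stated assumptions: the class name Core is hypothetical, "next PC" is taken to be PC + 1 (one word per instruction), and an interrupt arriving while the core is awake is simply not modelled.

```python
class Core:
    """Toy model of one core: only PC, R[31] and a sleep flag."""

    def __init__(self):
        self.pc = 0
        self.r31 = 0
        self.sleeping = False

    def wfi(self):
        # R[31] = next PC, stop execution (assumes next PC is PC + 1)
        self.r31 = self.pc + 1
        self.sleeping = True

    def interrupt(self):
        # PC = R[31], continue execution (only a sleeping core is modelled)
        if self.sleeping:
            self.pc = self.r31
            self.sleeping = False


core = Core()
core.pc = 7          # suppose wfi sits at address 7 (arbitrary)
core.wfi()           # core sleeps, R[31] holds 8
core.interrupt()     # core wakes and resumes at address 8
```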
With this instruction we can implement a semaphore as follows:
wait(sem)
  M[sem]--                        // decrement count
  if( M[sem] < 0 )
    enqueue current process
    block current process (wfi)
Blocking the current process is achieved by "wfi".
signal(sem)
  if( M[sem] < 1 ) M[sem]++       // increment count
  if( M[sem] <= 0 )
    p = dequeue()
    send interrupt to p (intx p)
Sending an interrupt to p is done by a new instruction, "intx r1", where
p = R[r1] is the core number 0..NC-1 (NC is the number of cores).
What happens if "intx r1" is sent to a core that is not in "wfi"?
For a consistent meaning, core p should behave as if it is
interrupted. However, to simplify, we can just ignore the request if
core p is not in "wfi". We have to be careful not to send "intx" to a
core that is already stopped (not in "wfi"), as it would continue by
executing an unknown instruction.
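To see how wait, signal, "wfi" and "intx" fit together, here is a minimal Python sketch of the semaphore pseudo code. The class SemModel and its fields are hypothetical names for illustration; the queue is assumed to be first-in first-out, and "intx" uses the simplified rule of ignoring a core that is not in "wfi".

```python
from collections import deque


class SemModel:
    """Toy model of one semaphore plus the per-core "in wfi" state."""

    def __init__(self, count, ncores):
        self.count = count              # models M[sem]
        self.queue = deque()            # blocked core numbers, FIFO (assumed)
        self.in_wfi = [False] * ncores  # True while a core sleeps in wfi

    def wait(self, core):
        self.count -= 1                 # M[sem]--
        if self.count < 0:
            self.queue.append(core)     # enqueue current process
            self.in_wfi[core] = True    # block current process (wfi)

    def intx(self, p):
        # Simplified rule from the text: ignore p if it is not in wfi.
        if self.in_wfi[p]:
            self.in_wfi[p] = False

    def signal(self):
        if self.count < 1:
            self.count += 1             # M[sem]++
        if self.count <= 0:
            p = self.queue.popleft()    # p = dequeue()
            self.intx(p)                # send interrupt to p


sem = SemModel(count=1, ncores=2)
sem.wait(0)      # core 0 takes the semaphore, count becomes 0
sem.wait(1)      # core 1 blocks in wfi, count becomes -1
sem.signal()     # core 0 releases, core 1 is woken, count becomes 0
sem.signal()     # core 1 releases, count returns to 1
```

In this sketch the "M[sem] < 1" guard means signal never raises the count above 1 once it is there, so an extra signal with no waiter has no effect.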
We assume all-process synchronisation. Each process executes the "sync" instruction and puts itself to sleep. When all cores have executed "sync", every core is woken up to continue. If only some processes need to be synchronised, the mechanism becomes more complicated: we need to know which processes, and how many of them, want to be synchronised. A core that is idle can execute "sync" so that it participates properly with the other active cores when they "sync". We can also synchronise a pair of cores using a semaphore. Here is an example:
p1()               p2()
i = 0              i = 10
while i < 5        while i < 20
  print i            print i
  i++                i++
sync               sync
stop               stop
p1 runs a loop to print 0..4; p2 runs a loop to print 10..19. p1 reaches the end sooner than p2, but it waits at "sync". When p2 catches up and executes "sync", both cores proceed to stop.
The implementation of sync uses "runflag[core]" to control the start/stop of cores. Here is the pseudo code (NC is the number of cores, the current core is k):
sync()
  runflag[k] = 0                  // stop this core
  for(a=0, i=0; i<NC; i++)        // check all cores
    if(runflag[i] == 0) a++
  if(a == NC)                     // all cores have stopped
    for(i=0; i<NC; i++)
      runflag[i] = 1              // start all cores
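The runflag scheme can be tried out with a small Python simulation of the p1/p2 example. This is a sketch, not the simulator's code: the functions program and run are hypothetical, each core is a generator that yields one operation per step, and the cores are stepped in lockstep, one operation per core per tick.

```python
def program(start, end):
    """One core's program: print start..end-1, then sync, then stop."""
    for i in range(start, end):
        yield ("print", i)
    yield ("sync", None)
    yield ("stop", None)


def run(programs):
    nc = len(programs)
    runflag = [1] * nc          # 1 = running, 0 = waiting in sync
    stopped = [False] * nc
    log = []                    # (core, operation, argument) in execution order

    def sync(k):
        runflag[k] = 0                                     # stop this core
        a = sum(1 for i in range(nc) if runflag[i] == 0)   # check all cores
        if a == nc:                                        # all cores have stopped
            for i in range(nc):
                runflag[i] = 1                             # start all cores

    while not all(stopped):
        progressed = False
        for k in range(nc):
            if stopped[k] or runflag[k] == 0:
                continue        # core k is finished or asleep in sync
            progressed = True
            op, arg = next(programs[k])
            if op == "sync":
                sync(k)
            elif op == "stop":
                stopped[k] = True
            log.append((k, op, arg))
        if not progressed:
            break               # every remaining core is asleep: give up
    return log


log = run([program(0, 5), program(10, 20)])
for entry in log:
    print(entry)
```

In the resulting log, core 0 executes "sync" early and then does nothing until core 1 executes its own "sync"; only after that do both cores reach "stop". The progress check is a safety net of the sketch: if one core stopped without ever executing "sync", the others would sleep forever.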
The simulator has the following commands:
a - show all cores
g - go
t - single step
b ads - set breakpoint
c n - focus core n
s [rn,mn,pc] v - set
d ads n - dump
r - show register
q - quit
h - help
C:\s30\test>sim30 count.obj
load program, last address 23
>c 0
>g
<1> [11] <2> [12] <3> [13] <4> [14] <5> [15]
stop, execute 40 instructions
>c 1
>g
[16] [17] [18] stop, execute 63 instructions
>
C:\s30\test>sim30 count.obj
load program, last address 23
>a
>t
core 0: PC 0 trap 5
r1:0 r2:0 r3:0 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
core 1: PC 0 trap 5
r1:0 r2:0 r3:0 r4:0 r5:0 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:1 r28:0 r29:0
>t
. . .
>t
core 0: PC 10 eq r6 r5 #5
r1:1 r2:1 r3:0 r4:0 r5:1 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:0 r28:0 r29:0
core 1: PC 18 mv r1 r5
r1:11 r2:0 r3:0 r4:0 r5:11 r6:0 r7:0 r8:0 r9:0
r20:0 r21:0 r22:0 r23:0 r24:0 r25:0 r26:0 r27:1 r28:0 r29:0
>
There are new instructions to support collaboration between cores. All of them must be real instructions; they cannot be written as a sequence of other instructions because they control the start/stop of cores.
cid r1
wfi
intx r1
sync
ei, di, pushm, popm, cid, wfi, intx, sync.
Some instructions have been renamed: pushm is savr, popm is resr.
Some instructions are no longer needed: savt, rest (as R[31] can be moved explicitly with mv).
Some OS support functions are still trap functions: newsem, wait, signal.
Finally, interrupts work for all cores.