2110714 Digital Systems

1st semester Aug 2020
classroom online:  Thursday  13:00-16:00

Aim

The class is divided into three parts. The first part discusses the fundamentals of modern processor performance. The second part discusses OS concepts of concurrency. The third part discusses Graphics Processing Units.

Announcement

  . . .

Assessment

20%   homework
50%   assigned projects
30%   take-home final exam

Lecture

part 1

Performance  Chapter 1
Memory Hierarchy  Chapter 2
    Cache basics (CS, Washington U)   cache-memory
    Virtual memory
Instruction level parallelism  Chapter 3   
    Pipeline
Data parallelism  Chapter 4
    lecture on data parallelism

part 2
          S2 processor instruction and assembly language (pptx)
          S2 processor and instruction set (web)      
          assembly language programming
          interrupt programming (IoT board simulator)        
          basic OS concepts
          Multitask-OS (MOS) (in assembly)  mos2.txt  (in Rz) mos-rz.txt
          lecture from Matt Welsh, Harvard,   semaphores.pdf
          semaphore:  demo from William Stallings, U. of Queensland
          implementing semaphores  reader-writer example  mos-rz2.txt

part 3

          Development of Graphics Processing Unit
Programming of Graphics Processing Unit    NPU webpage
General Purpose GPU applications
programming with CUDA (by Tongjai Yampaka)
link to video: Introduction to programming CUDA

Next week

3 Sep 2020  Discuss cache design.
10 Sep 2020  Discuss cache optimization, virtual memory.  Talk about Hot Chips trends. Demo Logicwork.
17 Sep 2020  Discuss virtual memory, how the TLB works. Simple logic design (with Logicwork)
12 Nov 2020  Study how to communicate between processes, implement semaphores

. . .

Projects

1)    Gate-level implementation of a very simple processor  4B 
2)    Experiment with cache design

Cache design
     Perform experiments on cache design with the given trace.  cache-simulator (C program)
      Large trace:   
             This is a trace of the Som v. 4.1 compiler compiling itself. See this page for details about the compiler:
              http://www.cp.eng.chula.ac.th/~piak/project/som/index.htm
              The trace file is trx.zip (5 Mbytes). Unzipped, it is ~100 Mbytes.
              Each line contains an address and an access type:  xxx {r,w}  where r is read and w is write.
              Data is at 0..371192.
              Instructions are at 400000..410873.
              The total number of instructions executed is 4520058.
              The total number of data accesses is 1044647.
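
As a starting point for the experiment, the trace format described above can be fed to a very small cache model. The following is only a sketch in Python, not the provided cache-simulator: it assumes each trace line is a decimal address followed by r or w, models a direct-mapped cache, and treats reads and writes alike. The names parse_line and simulate and the parameters num_lines and block_size are hypothetical and chosen for illustration.

```python
# Minimal direct-mapped cache model driven by an address trace.
# Assumed line format: "<address> <r|w>" with a decimal address.
# Adjust parse_line if the real trace file differs.

def parse_line(line):
    """Return (address, op) for one trace line, or None for a blank line."""
    parts = line.split()
    if not parts:
        return None
    return int(parts[0]), parts[1]


def simulate(trace_lines, num_lines=256, block_size=4):
    """Count hits and misses for a direct-mapped cache.

    num_lines and block_size (in address units) are illustrative values.
    """
    tags = [None] * num_lines          # one tag per cache line; None = invalid
    hits = misses = 0
    for line in trace_lines:
        rec = parse_line(line)
        if rec is None:
            continue
        addr, _op = rec                # reads and writes are treated alike here
        block = addr // block_size     # block number containing the address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # remaining high-order part of the block number
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # allocate the block on every miss
    return hits, misses


if __name__ == "__main__":
    sample = ["0 r", "1 r", "4 w", "0 r", "1024 r", "0 r"]
    hits, misses = simulate(sample)
    print(f"hits={hits} misses={misses} miss rate={misses / (hits + misses):.3f}")
```

To run it on the large trace, pass an open file object instead of the sample list, then vary num_lines and block_size and compare the miss rates. Associativity and a write policy are left out on purpose to keep the sketch short.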

3)  Use Logicwork to implement a clock.  The input is a 50 Hz signal.  The output is a 7-segment display. The output has 4 digits: 2 digits for hour, 2 digits for second. One blinking display for 1 sec.  Please implement the counter as a synchronous design (all flip-flops share the same clock) because this reduces timing errors. 
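
Before drawing gates, it can help to check the counting logic with a behavioural model. The sketch below is written in Python rather than Logicwork and rests on assumptions: a divide-by-50 prescaler, seconds and minutes counting 0..59, hours counting 0..23, and a blink bit that toggles once per second. The class name SyncClock and its fields are hypothetical. The point it illustrates is the synchronous style: every register sees the same 50 Hz clock edge, and carries are used as enables rather than as clocks for the next stage.

```python
# Behavioural model of a synchronous clock counter chain.
# Stage sizes and the blink behaviour are assumptions for illustration.

class SyncClock:
    def __init__(self):
        self.tick = 0      # prescaler, counts 0..49 (50 Hz -> 1 Hz)
        self.sec = 0       # 0..59
        self.minute = 0    # 0..59
        self.hour = 0      # 0..23
        self.blink = 0     # toggles once per second

    def clock_edge(self):
        """Advance all registers for one 50 Hz clock edge."""
        # Enables are computed from the current state before any register
        # changes, mirroring flip-flops that all sample on the same edge.
        en_sec = self.tick == 49
        en_min = en_sec and self.sec == 59
        en_hour = en_min and self.minute == 59

        self.tick = 0 if en_sec else self.tick + 1
        if en_sec:
            self.sec = 0 if self.sec == 59 else self.sec + 1
            self.blink ^= 1
        if en_min:
            self.minute = 0 if self.minute == 59 else self.minute + 1
        if en_hour:
            self.hour = 0 if self.hour == 23 else self.hour + 1

    def digits(self):
        """Return (hour tens, hour units, second tens, second units)."""
        return (self.hour // 10, self.hour % 10, self.sec // 10, self.sec % 10)


if __name__ == "__main__":
    clk = SyncClock()
    for _ in range(50 * 75):           # simulate 75 seconds of 50 Hz edges
        clk.clock_edge()
    print(clk.hour, clk.minute, clk.sec, clk.digits())
```

In a gate-level design each enable corresponds to an AND of the terminal-count outputs of the lower stages, which is what keeps all digits changing on the same clock edge.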

. . .

Tools

Logicwork 5   User-manual
Tools for interrupt programming    iot-rz-19.zip  (for Windows)   xtest.zip  (for Mac, executable only)
Tools for assembly language programming    s21-3.zip
NPU assembler and simulator tools  npsim4-1.zip
Cache design:  cache-simulator (C program), address trace trx.zip 

Homework

27 Aug 2020   Explore performance estimation of Microsoft Surface Duo
3 Sep 2020      Hands-on with 4B simulator
9 Sep 2020      Build a full adder in Logicwork. Experiment with cache simulation
5 Nov 2020     Use iot-rz-19  with MOS version mos-rz.txt, create three free-running processes.

...

Additional information

Reference textbook:  Computer Architecture: A Quantitative Approach, 5th Edition, by John L. Hennessy and David A. Patterson, Morgan Kaufmann, 2011. For the latest edition of this book, see the link to Amazon.

Transistor-level simulation of an antique CPU.  The 6502 was used in the iconic Apple II machine.

    http://visual6502.org/JSSim/index.html

Latest trends in processor technology from Intel  (Hot Chips 2020)

last update 5 Nov 2020