2190250
Computer Architecture and Organization
2nd semester Jan-May 2021
classroom: Tuesday 9:00-12:00, online
Official syllabus: COMP ARCH ORG
Computer evolution and performance; computer structure, function, and
interconnection; memory hierarchy; cache memory; virtual memory; storage;
input/output; operating system support; process; interrupt; system call;
instruction set; processor structure and function; pipelining;
super-scalar processors; multi-core computers.
Aim
Today, digital technology permeates every corner of our society. Three
big trends are shaping our future. The first is the coming of
Artificial Intelligence. The second is the rise of
Automation. The last is the revolution in technology for transferring
money. These advancements all rely on the power of computing. This class
gives an overview of modern computer systems.
Topics
Computer
Technology
Instructions
Arithmetic
Processor
Memory
Graphics Processing Unit
Announcement
19 Jan 2021 start of the class
1 March mock exam, one and a half hours
8 March midterm exam, one and a half hours (online)
4 May final exam, take-home, due at midnight
Assessment
40% in-class work: a simple question at the break and a quiz at the end of the lecture
30% midterm (1 hour), around the first week of March
30% final (2 hours)
Lecture
Computer Technology
Technology
Performance
Power
Basics
Instructions
S2 instructions
Arithmetic
Floating-point arithmetic
Logic design
Processor
Memory (no powerpoint)
- memory technology
- cache memory, direct-mapped cache
- cache performance
Multiprocessor (no powerpoint)
Graphics Processing Unit
Programming
Videos
Introduction: Technology https://youtu.be/_S8QKZsT3wk
Performance and Power https://youtu.be/tZJQ9XLl86Y
Instruction part 1: RISC-V https://youtu.be/k9Lkiz8GS88
Instruction part 2: data, binary number, branch, instruction encoding https://youtu.be/Z7PeBG3hv6w
S2 instruction and use of assembly tools https://youtu.be/bsqhvtazRBk
Arithmetic: add, subtract, multiply, divide integers https://youtu.be/8AT7Sty0r7E
Assembly language programming: expression, array, branch and loop https://youtu.be/KXxD-0rKz4I
Floating-point arithmetic https://youtu.be/KBpgZHG-rn4
Relationship between high level language and assembly https://youtu.be/GZova062JHg
Processor design, basic logic https://youtu.be/zYQx4JJ8Jyg
Processor design, simple datapath of RISC-V https://youtu.be/iGoyB_0qXYE
Demonstrate logic simulation tool https://youtu.be/vdQOVpneLP8
Processor design, control unit https://youtu.be/X17kvp8s2P0
How to use S2 assembler and simulator on Mac (2 min video) https://youtu.be/u23t2QqS3_M
Challenge of performance design of modern processor: Recap https://youtu.be/r_LjZ1vD8kU
Pipeline part 1: pipeline organization and hazards https://youtu.be/myITS4tAjY8
Pipeline part 2:
pipeline control a) buffering between stages https://youtu.be/P87zCotSF1A
pipeline control b) control signals https://youtu.be/fUPwkXzkIPA
pipeline control c) (optional) register forwarding https://youtu.be/1A4POVBJhpo
Memory design part 1: cache memory https://youtu.be/uKioRhmliVM
Memory design part 2: memory technology, direct-mapped cache https://youtu.be/DvJ4CWOYbDk
Memory design part 3: cache performance (optional) https://youtu.be/2Q5ASgBwFZc
Multiprocessor and GPU https://youtu.be/25fJx-w_eOs
Intro to programming with CUDA (by Tongjai Yampaka) https://www.youtube.com/watch?v=FL240cnJIRM
Tools
Tools for assembly language programming
S2 assembler and simulator s21-3.zip
Rz compiler rz36-3.zip
Logic design tool: LogicWorks 5 tutorial
S2 assembler and simulator for Mac (written in Python) s2-for-mac.zip, how to use
Homework
26 Jan
1.1) Write a simple assembly program (RISC-V) to add 3 numbers; all numbers are
in registers.
1.2) Write a simple assembly program (RISC-V) to add 2 numbers stored in
memory locations 100 and 101.
1.3) Convert the above two programs into S2 assembly programs. Try to assemble
and run them (with the S2 tools).
1 Feb
2.1) Work out by hand the step-by-step multiplication algorithm in the
lecture slides 'arithmetic', page 4 (fig 3.5, 3.6 in the textbook).
2.2) Write an S2 program to add 2 to every number in an array of ten
numbers. Try to run it in the simulator.
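For homework 2.1, a hand trace can be checked against a small program. The following is a minimal Python sketch of generic shift-and-add multiplication of unsigned integers. It is an illustration under assumptions and may not match the exact variant drawn in the textbook figures. The function name `shift_add_multiply`, the 4-bit default width, and the trace layout are hypothetical choices made for this example.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Multiply two unsigned `bits`-wide integers by shift-and-add.

    Returns (product, trace). Each trace entry records the state after one
    iteration: (step, multiplicand, multiplier, product).
    This is a generic sketch, not the textbook's exact hardware.
    """
    limit = 1 << bits
    if not (0 <= multiplicand < limit and 0 <= multiplier < limit):
        raise ValueError("operands must fit in the given bit width")

    product = 0
    trace = []
    for step in range(1, bits + 1):
        # If the lowest multiplier bit is 1, add the multiplicand.
        if multiplier & 1:
            product += multiplicand
        # Shift the multiplicand left and the multiplier right.
        multiplicand <<= 1
        multiplier >>= 1
        trace.append((step, multiplicand, multiplier, product))
    return product, trace


if __name__ == "__main__":
    result, steps = shift_add_multiply(2, 3)
    for row in steps:
        print(row)
    print("product =", result)
```

Printing the trace row by row gives one line per iteration, which can be compared with the table worked out by hand.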
16 Feb
Use LogicWorks to build a full-adder
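Before wiring the gates, the expected behavior can be checked against a truth table. This Python sketch assumes the common decomposition of a full-adder into two XOR gates, two AND gates, and one OR gate. The function name `full_adder` is hypothetical, and the circuit built in the logic tool may use a different but equivalent gate arrangement.

```python
from itertools import product


def full_adder(a, b, carry_in):
    """One-bit full adder on inputs that are each 0 or 1.

    Returns (sum_bit, carry_out).
    """
    partial = a ^ b                             # first XOR gate
    sum_bit = partial ^ carry_in                # second XOR gate
    carry_out = (a & b) | (partial & carry_in)  # two ANDs feeding one OR
    return sum_bit, carry_out


if __name__ == "__main__":
    print("a b cin | sum cout")
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        print(f"{a} {b}  {cin}  |  {s}   {cout}")
```

The eight rows printed should match the outputs observed when the same inputs are applied to the simulated circuit.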
23 Feb
Work out the control sequence to execute "add immediate" and "jump and link"
(pages 64 and 65 of the textbook). Which control lines are asserted, and when?
30 March
Think about the efficiency of the five-stage pipeline. Ignore the
effects of hazards. Are these four statements correct?
1) Allowing branches and ALU instructions to take fewer stages than the
five required by the load instruction will increase pipeline performance
under all circumstances.
2) Trying to allow some instructions to take fewer cycles does not
help, since the throughput is determined by the clock cycle; the number of
pipe stages per instruction affects latency, not throughput.
3) You cannot make ALU instructions take fewer cycles because of the
write-back of the result, but branches can take fewer cycles, so there is
some opportunity for improvement.
4) Instead of trying to make instructions take fewer cycles, we should
explore making the pipeline longer, so that instructions take more cycles,
but the cycles are shorter. This could improve performance.
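A simple timing model may help when weighing these statements. The sketch below assumes an ideal pipeline with no hazards and a clock cycle set by the slowest stage. The stage delays (in picoseconds) are illustrative values chosen for this example, and the function names are hypothetical. It does not settle which statements are correct; it only shows how latency and throughput are computed.

```python
def unpipelined_time(n_instructions, stage_delays):
    """Total time when each instruction runs all stages before the next starts."""
    return n_instructions * sum(stage_delays)


def pipelined_time(n_instructions, stage_delays):
    """Total time on an ideal pipeline with no hazards.

    The clock cycle equals the slowest stage. The first instruction needs
    one cycle per stage; each later instruction completes one cycle after
    the previous one.
    """
    cycle = max(stage_delays)
    n_stages = len(stage_delays)
    return (n_stages + n_instructions - 1) * cycle


if __name__ == "__main__":
    # Assumed delays for five stages, in picoseconds (illustrative only).
    delays = [200, 100, 200, 200, 100]
    n = 1000
    print("latency of one instruction, unpipelined:", unpipelined_time(1, delays))
    print("latency of one instruction, pipelined:  ", pipelined_time(1, delays))
    print("time for", n, "instructions, unpipelined:", unpipelined_time(n, delays))
    print("time for", n, "instructions, pipelined:  ", pipelined_time(n, delays))
```

Changing the list of delays, for example by splitting a slow stage into two shorter ones, shows how the number of stages and the cycle time trade off in this model.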
Additional information
Reference textbook:
Computer Organization and Design: The Hardware/Software Interface (RISC-V Edition),
D. Patterson, J. Hennessy, Morgan Kaufmann, 2018. link to Amazon
Transistor-level simulation of an antique CPU. The 6502 was used in the iconic
Apple II machine.
http://visual6502.org/JSSim/index.html
last update 5 August 2021