Projects for Advanced topic in digital systems 2001

Due date Friday 18th January 2002

Aim

The aim of the project is to reinforce students' understanding of the issues involved in designing and evaluating an architecture.  The level of abstraction that we adopt is the ISA and instruction-level parallelism.  Students are asked to design a performance enhancement feature for a base-line architecture and to measure the effectiveness of their design.  Students must "analyse" the benchmark program results, "design" the enhancement and "evaluate" the design quantitatively.  Due to the time limitation, the level of detail in the design is up to the individual student.  Don't be too ambitious; avoid tackling an overly complex concept.  A "working" design will give plenty of data for performance analysis.  A "non-working" design will not yield any quantitative result.

Problem definition

The objective is to design or modify the S2 machine, run benchmark programs on its simulator and report its performance (CPI).  Basically, what you have to do is "design" a machine, i.e. its instruction set and/or its behaviour (microsteps), modify or write a simulator, and run some benchmark programs chosen from my Stanford integer benchmark suite; alternatively, you can write your own benchmark program to demonstrate your design.  If you cannot make the simulation work, you can submit your design and simulate it by hand.
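
CPI is the total number of clocks divided by the number of instructions executed.  As a minimal sketch of the counters a simulator needs for this, assuming a hypothetical hook `record` that the execute loop calls once per instruction (the opcodes and clock counts below are made up for illustration and are not the real S2 values):

```python
from collections import Counter


class Stats:
    """Counters a simulator could update once per executed instruction."""

    def __init__(self):
        self.instructions = 0
        self.cycles = 0
        self.by_opcode = Counter()

    def record(self, opcode, clocks):
        # Hypothetical hook: call this from the simulator's execute loop.
        self.instructions += 1
        self.cycles += clocks
        self.by_opcode[opcode] += 1

    def cpi(self):
        if self.instructions == 0:
            raise ValueError("no instructions executed")
        return self.cycles / self.instructions


# Made-up (opcode, clocks) pairs standing in for a real simulation run.
stats = Stats()
for opcode, clocks in [("ld", 4), ("add", 3), ("st", 4), ("jmp", 2)]:
    stats.record(opcode, clocks)
print(f"{stats.instructions} instructions, {stats.cycles} clocks, CPI = {stats.cpi():.2f}")
```

The same counters, collected once on the base-line machine and once on the modified one, give the two CPI figures to compare.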

The project must be handed in not later than the deadline.  No late submission will be accepted.

Performance enhancement to S2

1  ISA related

Analyse instruction usage and recommend improvements to the instruction set design, such as combining test-and-branch, introducing new addressing modes, inventing new instructions, etc.
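
One possible way to analyse instruction usage is to count how often each opcode executes and how often one opcode immediately follows another.  A frequently executed pair, such as a compare followed by a conditional jump, is a candidate for a combined instruction.  The sketch below assumes the simulator can dump a list of executed opcodes; the mnemonics are hypothetical and not taken from the S2 instruction set.

```python
from collections import Counter


def instruction_mix(trace):
    """Return each opcode's share of the executed instructions."""
    counts = Counter(trace)
    total = len(trace)
    return {op: n / total for op, n in counts.items()}


def adjacent_pairs(trace):
    """Count how often each opcode is immediately followed by another."""
    return Counter(zip(trace, trace[1:]))


# Made-up dynamic trace with hypothetical mnemonics.
trace = ["ld", "cmp", "jt", "add", "cmp", "jt", "st"]
mix = instruction_mix(trace)
pairs = adjacent_pairs(trace)
print(pairs.most_common(1))
```

Note that adjacency in the executed trace is only a hint: the two instructions must also be adjacent in the program text, with no jump target between them, before they can be fused.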

2  Microprogramming related

Forgo fetching instructions from memory: compile application programs directly into microcode.  You will have to redesign the microprogram format.

Design the microprogram format to make it smaller, for example with a 2-level microprogram, and measure the effect on performance.
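
To get a feel for the possible saving, the sketch below compares the storage of a one-level microstore with a two-level one.  It assumes a simple scheme in which every distinct control word is stored once in a second-level store and the first level holds only an index into it; next-address fields are ignored.  The control words and the 32-bit width are made up for illustration.

```python
def one_level_bits(control_words, width):
    """Every microword stores the full control word."""
    return len(control_words) * width


def two_level_bits(control_words, width):
    """First level stores an index; second level stores each distinct word once."""
    unique = len(set(control_words))
    # Bits needed to index `unique` entries (at least 1 bit).
    index_bits = max(1, (unique - 1).bit_length())
    return len(control_words) * index_bits + unique * width


# Made-up microprogram: 8 microwords, only 3 distinct control words.
words = [0xA0, 0xB1, 0xA0, 0xC2, 0xB1, 0xA0, 0xA0, 0xC2]
print(one_level_bits(words, 32), two_level_bits(words, 32))
```

The extra lookup in the second level may cost time, which is why the effect on performance has to be measured as well as the size.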

Design the microprogram format to make it faster, for example by doing more than one thing in a cycle.  This may require new capabilities, such as register files that can perform a read and a write in one cycle.  Measure the effect on performance and on the size of the microstore.

3  Pipeline related

Assume that a memory access takes more than one clock.  Design the pipeline to hide this memory access latency.  Measure the effect on performance.
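
Before building the pipeline, a first-order estimate can show how much there is to gain.  The sketch below assumes that only data accesses (loads and stores) are slow, that each one stalls the pipeline for the extra clocks, and that the design hides some fraction of those stalls.  The function name, its parameters and the example numbers are illustrative assumptions, not measurements.

```python
def cpi_with_memory_stalls(base_cpi, mem_fraction, mem_latency, hidden=0.0):
    """Estimate CPI when each memory access takes `mem_latency` clocks.

    mem_fraction: share of executed instructions that access data memory.
    hidden: share of the stall clocks that the pipeline manages to hide.
    """
    if mem_latency < 1 or not 0.0 <= hidden <= 1.0:
        raise ValueError("need mem_latency >= 1 and 0 <= hidden <= 1")
    extra_clocks = mem_latency - 1
    return base_cpi + mem_fraction * extra_clocks * (1.0 - hidden)


# Made-up numbers: 25% loads/stores, 3-clock memory.
print(cpi_with_memory_stalls(1.0, 0.25, 3))              # nothing hidden
print(cpi_with_memory_stalls(1.0, 0.25, 3, hidden=0.5))  # half of the stalls hidden
```

The simulator should replace this estimate with counted stalls, since the real figure depends on where the loads and stores fall in the benchmark.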

Design a different pipeline (not 5 stages) for the S2 ISA.

4  Superscalar related

Add more ALUs (or functional units) to the processor.  Control the execution using a scoreboard or Tomasulo-style scheduling.  Analyse the performance.

Build a VLIW version of S2, perhaps with 4 execution units.  Hand-compile a benchmark and measure its performance.
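
Hand compilation for a VLIW machine amounts to packing independent operations into the same long instruction.  The sketch below is a deliberately simple, conservative packer: it keeps the original order, starts a new bundle when the current one is full, and also starts one when an operation reads or writes a register written earlier in the same bundle.  It ignores memory dependences and branches, and the register names and instruction encoding are hypothetical.

```python
def pack_bundles(instrs, width=4):
    """Greedy in-order packing.  Each instruction is (dest, sources)."""
    bundles = []
    current = []
    written = set()  # registers written by the current bundle
    for dest, sources in instrs:
        raw = any(s in written for s in sources)  # read-after-write
        waw = dest in written                     # write-after-write
        if current and (len(current) == width or raw or waw):
            bundles.append(current)
            current, written = [], set()
        current.append((dest, sources))
        if dest is not None:
            written.add(dest)
    if current:
        bundles.append(current)
    return bundles


# Made-up fragment: two loads, an add, another load, a store (dest is None).
program = [
    ("r1", ("r10",)),
    ("r2", ("r11",)),
    ("r3", ("r1", "r2")),
    ("r4", ("r12",)),
    (None, ("r3",)),
]
bundles = pack_bundles(program)
print(len(program), "operations in", len(bundles), "bundles")
```

If one bundle issues per clock, the number of bundles is a rough clock count for the fragment.  A real hand schedule would also reorder operations to fill the empty slots.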

Project assignment

How to do the project

1) You have to design an instruction set with enough instructions to execute some benchmark programs (no I/O).

2) You have to design the "micro architecture", i.e. the internal structure of a CPU, and write its "micro steps".

3) A set of benchmark programs (Stanford integer benchmark) written in C and in RZ is provided.  An RZ compiler and interpreter (runtime VM) are provided so that you can compile and test your benchmark programs.  The compiler also generates an assembly file for S2, which can be assembled by AS2, an assembler for S2, and run in the S2 simulator.  You have to instrument the simulator to collect quantitative data such as the number of instructions executed, the frequency of use of each instruction, or the number of stalls in the pipeline.  This is the base-line architecture that you will compare your design with.

You can modify this output assembly language to include your design (new instructions etc.) and/or make changes to the simulator (in case you change the microarchitecture, pipelining or superscalar features).  If you have trouble modifying the simulator to run the whole benchmark program, you don't have to run the whole program.  You can choose some portions of the programs to measure your design.  The most important portion that determines the runtime of a program is its innermost loop.  The benchmark should be taken from several parts of the high-level program and should be at least 10-20 lines of assembly code in total.  Choose the benchmark that will illustrate the capability of your design.

4) To validate (check that the design is correct) and evaluate (measure how fast your CPU is) the design, you can either:

  1. Write (or modify) a simulator and run the benchmark programs to count the number of clocks required to complete the tasks (you will earn an extra bonus for doing the simulation), or
  2. Estimate the number of clocks by hand.  You need to make sure that you count the right thing.

5) You must hand in a report containing the following sections:
  1. Motivation behind your design (why you did it that way).
  2. Instruction set details: opcode, opcode format, number of clocks required by each instruction.
  3. Microarchitecture and its microsteps.
  4. Your benchmark program (in your assembly language) and why you chose this particular part of the program.  Programs should be well commented so that I can read and understand what they do.
  5. How you validated and evaluated your design.
  6. Performance of your design (Cycles Per Instruction).
  7. Conclusion: what you learned from this project.
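
For the hand estimate in step 4, "counting the right thing" mostly means weighting each piece of code by how many times it executes.  As a minimal sketch, assume the benchmark is split into straight-line blocks, each with a known execution count, and that the clocks per instruction come from your own instruction set table.  The mnemonics, clock values and counts below are made up for illustration.

```python
def estimate_clocks(blocks, clocks_per_op):
    """blocks: list of (times_executed, [opcodes]), one entry per basic block."""
    total_clocks = 0
    total_instructions = 0
    for times, ops in blocks:
        total_clocks += times * sum(clocks_per_op[op] for op in ops)
        total_instructions += times * len(ops)
    return total_clocks, total_instructions


# Hypothetical clock table and a loop body that runs 100 times.
clocks_per_op = {"ld": 4, "add": 3, "cmp": 3, "jt": 2, "st": 4}
blocks = [
    (1, ["ld", "ld"]),            # setup, executed once
    (100, ["add", "cmp", "jt"]),  # innermost loop body
    (1, ["st"]),                  # store the result
]
clocks, instructions = estimate_clocks(blocks, clocks_per_op)
print(f"{clocks} clocks / {instructions} instructions = CPI {clocks / instructions:.2f}")
```

The loop body dominates both totals, which is why the choice of benchmark portion matters.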

Tools for the project

S2 simulator
S2 assembler
RZ compiler and interpreter

How I evaluate your project

I will look for the "quality" of your work.

If you have any questions regarding the project, please contact me promptly.