EEC 272: High-Performance Computer Architecture

 

Prof. Vojin G. Oklobdzija

 

Grading:

 

Midterm:    30%

Project:       20%

Quizzes:     40%

Extras:        10%

 

Catalog Description:

 

This course discusses architectural issues in achieving high performance in a single general-purpose processor through concurrent execution of instructions, along with the associated problems and fundamental limitations. It also covers specialized architectures such as VLIW and vector processors.

 

 

Expanded Course Description:

 

1. Introduction to Superscalar Concepts:

 

                1.1 Beyond Pipelining, CISC and RISC

                1.2 Instruction Issue and Machine Parallelism

                1.3 Fundamental Limitations

                1.4 Related Concepts: VLIW and Vectors

                1.5 Unrelated Parallel Schemes

 

2. Developing an Execution Model:

 

                2.1 Simulation Technique

                2.2 Benchmarking Performance

                2.3 Basic Observations on Hardware Design

                2.4 The Design of the Standard Processor

                2.5 Procedural Dependencies

               

3. Instruction Fetching and Decoding:

 

                3.1 Branches and Instruction-Fetch Inefficiencies

                3.2 Improving Fetch Efficiency

                3.3 Implementing Hardware Branch-Prediction

                3.4 Implementing a Four-Instruction Decoder

                3.5 Implementing Branches

                3.6 Reducing the Penalty of Procedural Dependencies

 

4. Advanced Pipelining:

 

                4.1 Pipeline Design

                4.2 Pipeline Scheduling - Reservation Tables

                4.3 Pipeline Hazards and Conflict Resolution

                4.4 Multi-Level Pipelines

 

5. The Role of Exception Recovery:

 

                5.1 Buffering State Information for Restart

                5.2 Restart Implementation and Effect on Performance

                5.3 Processor Restart

 

6. Register Dataflow:

 

                6.1 Dependency Mechanisms

                6.2 Result Buses and Arbitration

                6.3 Result Forwarding

                6.4 Supplying Instruction Operands

 

7. Out-of-Order Issue:

 

                7.1 Reservation Stations

                7.2 Implementing a Central Instruction Window

                7.3 Out-of-Order Issue

 

 

8. Memory Dataflow:

 

                8.1 Ordering of Loads and Stores

                8.2 Addressing and Dependencies

                8.3 What Is More Load/Store Parallelism Worth?

                8.4 Multiprocessing Considerations

                8.5 Accessing External Data

 

9. Complexity and Controversy:

 

                9.1 Design Complexity

                9.2 Major Hardware Features

                9.3 Hardware Simplifications

                9.4 Is the Complexity Worth It?

 

10. Evaluating Alternatives: A Perspective on Superscalar Processors

 

                10.1 The Case for Software Solutions

                10.2 The Case for Hardware Solutions

                10.3 VLIW Machines

                10.4 Vector Processors (Cray)

                10.5 Massively Parallel Machines

                 

 

Textbooks:

1.        Mike Johnson, "Superscalar Microprocessor Design", Prentice Hall, 1991.

2.        J. L. Hennessy and D. A. Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers.

3.        V. Oklobdzija, Set of selected papers and notes.