EEC 272: High-Performance Computer Architecture


Prof. Vojin G. Oklobdzija




Midterm:    30%

Project:       20%

Quizzes:     40%

Extras:        10%


Catalog Description:


This course discusses the architectural issues in achieving high performance in a single general-purpose processor through concurrent execution of instructions, along with the associated problems and fundamental limitations. It also covers specialized architectures such as VLIW and vector machines.



Expanded Course Description:


1. Introduction to Superscalar Concepts:


                1.1 Beyond Pipelining, CISC and RISC

                1.2 Instruction Issue and Machine Parallelism

                1.3 Fundamental Limitations

                1.4 Related Concepts: VLIW and Vectors

                1.5 Unrelated Parallel Schemes


2. Developing an Execution Model:


                2.1 Simulation Technique

                2.2 Benchmarking Performance

                2.3 Basic Observations on Hardware Design

                2.4 The Design of the Standard Processor

                2.5 Procedural Dependencies


3. Instruction Fetching and Decoding:


                3.1 Branches and Instruction-Fetch Inefficiencies

                3.2 Improving Fetch Efficiency

                3.3 Implementing Hardware Branch-Prediction

                3.4 Implementing a Four-Instruction Decoder

                3.5 Implementing Branches

                3.6 Reducing the Penalty of Procedural Dependencies


4. Advanced Pipelining:


                4.1 Pipeline Design

                4.2 Pipeline Scheduling: Reservation Tables

                4.3 Pipeline Hazards and Conflict Resolution

                4.4 Multi-Level Pipelines


5. The Role of Exception Recovery:


                5.1 Buffering State Information for Restart

                5.2 Restart Implementation and Effect on Performance

                5.3 Processor Restart


6. Register Dataflow:


                6.1 Dependency Mechanisms

                6.2 Result Buses and Arbitration

                6.3 Result Forwarding

                6.4 Supplying Instruction Operands


7. Out-of-Order Issue:


                7.1 Reservation Stations

                7.2 Implementing a Central Instruction Window

                7.3 Out-of-Order Issue



8. Memory Dataflow:


                8.1 Ordering of Loads and Stores

                8.2 Addressing and Dependencies

                8.3 What Is More Load/Store Parallelism Worth?

                8.4 Multiprocessing Considerations

                8.5 Accessing External Data


9. Complexity and Controversy:


                9.1 Design Complexity

                9.2 Major Hardware Features

                9.3 Hardware Simplifications

                9.4 Is the Complexity Worth It?


10. Evaluating Alternatives: A Perspective on Superscalar Processors


                10.1 The Case for Software Solutions

                10.2 The Case for Hardware Solutions

                10.3 VLIW Machines

                10.4 Vector Processors (Cray)

                10.5 Massively Parallel Machines




1.        Mike Johnson, "Superscalar Microprocessor Design", Prentice Hall, 1991.

2.        John L. Hennessy and David A. Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers.

3.        V. Oklobdzija, Set of selected papers and notes.