EEC272: High-Performance Computer Architecture: Super-Scalar Processor Design


Prof. Vojin G. Oklobdzija
Electrical and Computer Engineering Department
University of California


Content of the Course:

This course consists of lectures on topics and issues in the architecture and design of complex, high-performance computer systems. These issues range from efficient instruction set design, pipelining, advanced pipelining, and superscalar implementation to the exploitation of instruction-level parallelism, memory hierarchy, and multiprocessing. They are the foundation of modern computer architecture and design.

The lectures will follow the textbook closely, chapter by chapter. The schedule of lectures and the material covered each week are given below. In addition, I plan to invite a few guest lectures by prominent computer designers or architects. The book will be supplemented by selected readings of some of the fundamental papers in computer architecture and design; the list of those papers is given below. I also recommend the excellent alternate book by Mike Johnson.

Students who want to carry out an independent project that could lead to their MSc thesis or a journal/conference paper are encouraged to do so. Such a project is not required and will be graded on a different scale.

This course is intended for graduate students in electrical and computer engineering, as well as for practicing engineers. It aims to provide a useful reference to the accumulated experience needed to understand and successfully design today's complex computer systems.

Topics covered and schedule:


Schedule of the Lectures

Homework Assignments

(Homework is due on Tuesday of the following week)

Homework Solutions

Week 1: April 1, 3

Reading: Chpt. 1: Processor Design:

-The Evolution of Microprocessors

-Instruction Set Processor Design

-Principles of Processor Performance

-Instruction-Level Parallel Processing

Homework 1: (due 8th)


Week 2: April 8-10 

Reading: Chpt. 2: Pipelined Processors:

-Pipeline Design Fundamentals

-Pipelined Processor Design

-Deeply Pipelined Processors

-Encoding an Instruction Set

Homework 2: (due 15th)

2.5: 1-14



Week 3: April 15-17

Reading: Chpt. 3: Superscalar Organization:

-Limitation of Scalar Pipelines

-From Scalar to Superscalar Pipelines

-Superscalar Pipeline Overview

Homework 3: (due 22nd)

3.5: 1-8, 13-17



Week 4: April 22-24

Reading: Chpt. 4: Superscalar Techniques:

-Instruction Flow Techniques

-Register Data Flow Techniques

-Memory Data Flow Techniques

Homework 4: (due 29th)

4.1-6; 4.7-10



Week 5: April 29, May 1

Reading: Chpt. 7: Survey of Superscalar Processors:

-Development of Superscalar Processors

-A Classification of Recent Designs

Homework 5: (due 6th)




Week 6: May 6-8

Reading: Chpt. 7: Survey of Superscalar Processors:

Processor Descriptions:

  1. Compaq / DEC Alpha
  2. Hewlett-Packard PA-RISC

Homework 6: (due 13th)




Week 7: May 13-15

Reading: Chpt. 6,7: Survey of Superscalar Processors:

Processor Descriptions:

  1. Intel i960
  2. Intel IA32
  3. MIPS
  4. Motorola 88220
  5. IBM Power

Homework 7: (due 20th)




Week 8: May 20-22

Reading: Chpt. 5,7: Survey of Superscalar Processors:

Processor Descriptions:

  1. IBM Power
  2. PowerPC: The PowerPC 620
  3. SPARC Version 8 & 9

Homework 8: (due 27th)




Week 9: May 27-29

Reading: Chpt. 8: Executing Multiple Threads:

-Synchronizing Shared-Memory Threads

-Introduction to Multi-Processor Systems

-Explicitly Multithreaded Processors

-Implicitly Multithreaded Processors

-Executing the Same Thread

Homework 9: (due 3rd)




Week 10: June 3-5

Reading: Chpt. 9: Advanced Register Data Flow Techniques:

-Value Locality and Redundant Execution

-Exploiting Value Locality without Speculation

-Exploiting Value Locality with Speculation






List of recommended readings:

[1] C.J. Bashe et al., "The Architecture of IBM's Early Computers", IBM Journal of Research and Development, 25:5 (Sep 1981), p.363-375.

[2] G.A. Blaauw and F.P. Brooks, "The Structure of System/360", IBM Systems Journal, 3:2 (1964), p.119-135.

[3] A. Padegs, "System/360 and Beyond", IBM Journal of Research and Development, 25:5 (Sep 1981), p.377-390.

[4] James E. Thornton, "Parallel Operation in the Control Data 6600", AFIPS Proceedings FJCC part 2, vol 26 (1964), p.33-40.

[5] R.M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units", IBM Journal, vol 11 (Jan 1967), p.25-33.

[6] D.W. Anderson et al., "The IBM System/360 Model 91: Machine Philosophy and Instruction Handling", IBM Journal, vol 11 (Jan 1967), p.8-24.

[7] Johnny K.F. Lee and A.J. Smith, "Branch Prediction Strategies and Branch Target Buffer Design", Computer, 17:1 (1984), p.6-22.

[8] Wen-mei W. Hwu et al., "Comparing Software and Hardware Schemes for Reducing the Cost of Branches".

[9] A.J. Smith, "Cache Memory Design: An Evolving Art", IEEE Spectrum, (Dec 1987), p.40-44.

[10] A.J. Smith, "CPU Cache Memories", (Apr 1984), updated version of ACM Surveys, 14:3 (Sep 1982), p.473-530.

[12] J.S. Liptay, "Structural Aspects of the System/360 Model 85 - II The Cache", IBM Systems Journal, 7:1 (1968), p.5-21.

[13] Peter J. Denning, "Virtual Memory", Computing Surveys, 2:3 (Sep 1970),

[14] Albert Chang and Mark F. Mergen, "801 Storage: Architecture and Programming", ACM Transactions on Computer Systems, 6:1 (Feb 1988), p.28-30.

[15] C.V. Ramamoorthy, "Pipeline Architecture", Computing Surveys, 9:1 (Mar 1977), p.61-101.