EEC272

EEC272: High-Performance Computer Architecture: Super-Scalar Processor Design

This course consists of lectures on topics and issues in the architecture and design of complex, high-performance computer systems. These issues range from efficient instruction set design, pipelining, advanced pipelining and superscalar implementation to the exploitation of instruction-level parallelism, memory hierarchy and multiprocessing. Together they form the foundation of modern computer architecture and design.

The lectures will follow the textbook closely, chapter by chapter. The schedule of lectures and the topics covered each week are given below. In addition, I plan to invite a few guest lectures by prominent computer designers or architects. The textbook will be supplemented by selected readings of some of the fundamental papers in computer architecture and design. The list of those papers is given below. I also recommend the excellent alternate book by Mike Johnson.

Students who want to carry out an independent project that would lead to their MSc thesis or to a journal/conference paper are encouraged to do so. The project is not required, and it will be graded on a different scale.

This course is intended for graduate students in electrical and computer engineering, as well as for practicing engineers. It aims to provide a useful reference to the accumulated experience necessary for the successful design and understanding of today's complex computer systems.

Schedule of the Lectures

Homework Assignments

(Homework is due on Tuesday of the following week)

Homework Solutions

Week 1: April 1, 3

Reading, Chpt. 1: Processor Design:

-The Evolution of Microprocessors

-Instruction Set Processor Design

-Principles of Processor Performance

-Instruction-Level Parallel Processing

Homework 1: (due 8th)

1-13.

Week 2: April 8-10

Reading, Chpt. 2: Pipelined Processors:

-Pipeline Design Fundamentals

-Pipelined Processor Design

-Deeply Pipelined Processors

-Encoding an Instruction Set

Homework 2: (due 15th)

2.5: 1-14

Week 3: April 15-17

Reading: Chpt. 3: Superscalar Organization:

-Limitation of Scalar Pipelines

-From Scalar to Superscalar Pipelines

-Superscalar Pipeline Overview

Homework 3: (due 22nd)

3.5: 1-8, 13-17

Week 4: April 22-24

Reading: Chpt. 4: Superscalar Techniques:

-Instruction Flow Techniques

-Register Data Flow Techniques

-Memory Data Flow Techniques

Homework 4: (due 29th)

4.1-6; 4.7-10

Week 5: April 29, May 1

Reading: Chpt. 7: Survey of Superscalar Processors:

-Development of Superscalar Processors

-A Classification of Recent Designs

Homework 5: (due 6th)

Week 6: May 6-8

Reading: Chpt. 7: Survey of Superscalar Processors:

Processor Descriptions:

Compaq / DEC Alpha
Hewlett-Packard PA-RISC

Homework 6: (due 13th)

Week 7: May 13-15

Reading: Chpt. 6,7: Survey of Superscalar Processors:

Processor Descriptions:

Intel i960
Intel IA32
MIPS
Motorola 88220
IBM Power

Homework 7: (due 20th)

Week 8: May 20-22

Reading: Chpt. 5,7: Survey of Superscalar Processors:

Processor Descriptions:

IBM Power
PowerPC: The PowerPC 620
SPARC Version 8 & 9

Homework 8: (due 27th)

Week 9: May 27-29

Reading: Chpt. 8: Executing Multiple Threads:

-Synchronizing Shared-Memory Threads

-Introduction to Multi-Processor Systems

-Explicitly Multithreaded Processors

-Implicitly Multithreaded Processors

-Executing the Same Thread

Homework 9: (due 3rd)

Week 10: June 3-5

Reading: Chpt. 9: Advanced Register Data Flow Techniques:

-Value Locality and Redundant Execution

-Exploiting Value Locality without Speculation

-Exploiting Value Locality with Speculation

[1] C.J. Bashe et al., "The Architecture of IBM's Early Computers", IBM Journal of Research and Development, 25:5 (Sep 1981), p.363-375.

[2] G.A. Blaauw and F.P. Brooks, "The Structure of System/360", IBM Systems Journal, 3:2 (1964), p.119-135.

[3] A. Padegs, "System/360 and Beyond", IBM Journal of Research and Development, 25:5 (Sep 1981), p.377-390.

[4] James E. Thornton, "Parallel Operation in the Control Data 6600", AFIPS Proceedings FJCC part 2, vol 26 (1964), p.33-40.

[5] R.M. Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units", IBM Journal, vol 11 (Jan 1967), p.25-33.

[6] D.W. Anderson et al., "The IBM System/360 Model 91: Machine Philosophy and Instruction Handling", IBM Journal, vol 11 (Jan 1967), p.8-24.

[7] Johnny K.F. Lee and A.J. Smith, "Branch Prediction Strategies and Branch Target Buffer Design", Computer, 17:1 (1984), p.6-22.

[8] Wen-mei W. Hwu et al., "Comparing Software and Hardware Schemes for Reducing the Cost of Branches".

[9] A.J. Smith, "Cache Memory Design: An Evolving Art", IEEE Spectrum, (Dec 1987), p.40-44.

[10] A.J. Smith, "CPU Cache Memories", (Apr 1984), updated version of ACM Surveys, 14:3 (Sep 1982), p.473-530.

[12] J.S. Liptay, "Structural Aspects of the System/360 Model 85 - II The Cache", IBM Systems Journal, 7:1 (1968), p.5-21.

[13] Peter J. Denning, "Virtual Memory", Computing Surveys, 2:3 (Sep 1970), p.153-189.

[14] Albert Chang and Mark F. Mergen, "801 Storage: Architecture and Programming", ACM Transactions on Computer Systems, 6:1 (Feb 1988), p.28-30.

[15] C.V. Ramamoorthy, "Pipeline Architecture", Computing Surveys, 9:1 (Mar 1977), p.61-101.