### EEC170 Computer Architecture

### **Lecture 2: Instruction Set Architectures**

October 10, 2005 Soheil Ghiasi Electrical and Computer Engineering University of California, Davis











## **Types of Machine Organization**

- Memory-to-Memory Machines
- Accumulator
- Stack
- General Purpose Register Machines

# Memory to Memory Machine

- Every instruction contains a full memory address for each operand
- **\***Assumptions
  - \* two operands per operation
  - second operand is also the destination
  - \* memory address 16 bits (2 bytes)
  - \* operand size 32 bits (4 bytes)
  - \* instruction code 8 bits (1 byte)
  - \* we want to evaluate A  $\leftarrow$  (B+C+D+E)/X

# **Memory to Memory Machines**

 $A \leftarrow (B+C+D+E)/X$ 

**\***Hypothetical assembly language code

| <b>∻</b> move B, A; | A ← B                |                                               |
|---------------------|----------------------|-----------------------------------------------|
| ∻add C, A;          | A ← A + C            | ( <b>B</b> + <b>C</b> )                       |
| ∻add D, A;          | A ← A + D            | ( <b>B</b> + <b>C</b> + <b>D</b> )            |
| ∻add E, A;          | <b>A ← A + E</b>     | ( <b>B</b> + <b>C</b> + <b>D</b> + <b>E</b> ) |
| * div X. A:         | $A \leftarrow A / X$ |                                               |

Anadd or divide instruction results in the transfer of 17 bytes between memory and CPU

- 5 bytes for instruction (opcode + 2 memory addresses)
- \* 4 bytes each to fetch 1st and 2nd operands (8 bytes total)
- 4 bytes to store result

\*Moverrequiresel3; Total memory/trafficem4717++CBU=681 bytes move?



# **Accumulator Machines**

- **\***CPU storage consists of a single accumulator
- Accumulator is an (implicit) operand for most instructions
- Memory is source of 2nd operand for 2 operand instructions
- \*Cheap (in terms of CPU hardware) and simple
- \*Memory traffic can be reduced significantly more if more local storage is available

# Accumulator Machine

\*Consider a machine with 1 cell of CPU storage: the accumulator

\* Assumptions

two operands per operation

- \* 1st operand is in the accumulator (implicit)
- 2nd operand is in memory
- accumulator is also the destination of the operation except for store
- \* memory address 16 bits (2 bytes)
- operand size 32 bits (4 bytes)
- \* instruction code 8 bits (1 byte)
- \* we want to evaluate A  $\leftarrow$  (B + C + D + E) / X

# **Accumulator Machine**

- Assembly language code hypothetical
  - ♦ load B; acc ← B
  - ♦ add C; acc ← acc + C (B+C)
  - \* add D; acc  $\leftarrow$  acc + D (B+C+D)
  - \* add E; acc  $\leftarrow$  acc + E (B+C+D+E)
  - ♦ div X; acc ← acc / X
  - ♦ store A; A ← acc
- Each of the shower instructions results in the tyansfer of 7 bytes between memory and CPU
  - \* 3 bytes for instruction 4 bytes to fetch or store 2nd operand
  - Total memory traffic 6x7 = 42 bytes
- \* Moral: local CPU storage reduces memory traffic and also effects the instruction set design



# Stack Machines

\*Binary arithmetic and logic operations

- \* operands: top 2 items on stack
- operands are removed from stack
- \* result is placed on top of stack

\* Unary arithmetic and logic operations

- operand: top item on the stack
- operand is replaced by result of operation

#### \* Data move operations

- \* push: place memory data on top of stack
- \* pop: move top of stack to memory

| Evaluate exp                                                                                           | ression $A \leftarrow (B + C + D)$ | $(\mathbf{F} + \mathbf{E}) / \mathbf{X}$ |  |  |
|--------------------------------------------------------------------------------------------------------|------------------------------------|------------------------------------------|--|--|
| push B;                                                                                                | top of stack "tos":                | В                                        |  |  |
| push C;                                                                                                | tos:                               | B, C                                     |  |  |
| ∻add;                                                                                                  | tos:                               | B+C                                      |  |  |
| push D;                                                                                                | tos:                               | B+C, D                                   |  |  |
| ∻add;                                                                                                  | tos:                               | B+C+D                                    |  |  |
| • push E                                                                                               | tos:                               | B+C+D, E                                 |  |  |
| ∻add;                                                                                                  | tos:                               | B+C+D+E                                  |  |  |
| v push X;                                                                                              | tos:                               | B+C+D+E, X                               |  |  |
| ∻div;                                                                                                  | tos:                               | B+C+D+E/X                                |  |  |
| * pop a;                                                                                               | $A \leftarrow (B+C+D+E)/X$         |                                          |  |  |
| * Alwardd/divide biosteructions fresults include an side of d7ybytes 10et ween<br>include 10 iarid CPU |                                    |                                          |  |  |
| <ul><li>1 bytes f</li></ul>                                                                            | or instruction opcode              |                                          |  |  |
| <ul> <li>0 bytes for data transfer</li> </ul>                                                          |                                    |                                          |  |  |

### **General Purpose Register Machine**

\*With stack machines only the top two elements of the stack are directly available to instructions. In general purpose register machines the CPU storage is organized as a set of registers which are equally available to the instructions

\*Frequently used operands are placed in registers (under program control)

- Reduces instruction size
- Reduces memory traffic



| Violau D, KI,                                                                | KI D                       |  |  |  |
|------------------------------------------------------------------------------|----------------------------|--|--|--|
| load C, R2;                                                                  | R2 ← C                     |  |  |  |
| load D, R3;                                                                  | R3 ← D                     |  |  |  |
| load E, R4;                                                                  | R4 ← E                     |  |  |  |
| load X, R5;                                                                  | R5 ← X                     |  |  |  |
| ♦ add R1, R2;                                                                | R1 ← B+C                   |  |  |  |
| ♦ add R1, R3;                                                                | $R1 \leftarrow B+C+D$      |  |  |  |
| ♦ add R1, R4;                                                                | $R1 \leftarrow B+C+D+E$    |  |  |  |
| ♦ div R1, R5 R1 ← (B+C+D+E)/X                                                |                            |  |  |  |
| store R1, A;                                                                 | A $\leftarrow$ (B+C+D+E)/X |  |  |  |
| * How many bytes for load/store? 7 ½ bytes                                   |                            |  |  |  |
| * How many bytes for add/div? 2 bytes                                        |                            |  |  |  |
| <b>*</b> Total mem. Xfer = 6*7 <sup>1</sup> / <sub>2</sub> + 4 *2 = 53 bytes |                            |  |  |  |
|                                                                              |                            |  |  |  |

# **Classifying GPR Machine**

- \*GPR machines are subclassifed based on whether or not memory operands can be used by typical ALU instructions
- Register-memory machines: machines where some ALU instructions can specify at least one memory operand and one register operand
- \*Load-store machines: register-register machines with the restriction that the only instructions that can access memory are the load and the store instructions
  - \* load transfers data from memory to a register
  - store transfers data from a register to memory



# **Recap: Basic ISA Classes**

| Memory to Memory:     | :         |                                     |
|-----------------------|-----------|-------------------------------------|
| ♦ 2 address           | add A B   | $mem[A] \leftarrow mem[A] + mem[B]$ |
| 3 address             | add A B C | $mem[A] \leftarrow mem[B] + mem[C]$ |
| Accumulator:          |           |                                     |
| 1 address             | add A     | $acc \leftarrow acc + mem[A]$       |
|                       | addx A    | $acc \leftarrow acc + mem[A + x]$   |
| ♦ Stack:              |           |                                     |
| ♦ 0 address           | add       | $tos \leftarrow tos + next$         |
| ♦ General Purpose Reg | gister:   |                                     |
| 2 address             | add A B   | $reg[A] \leftarrow reg[A] + reg[B]$ |
| 3 address             | add A B C | $reg[A] \leftarrow reg[B] + reg[C]$ |
|                       |           |                                     |

#### Comparison:

Bytes per instruction? Number of Instructions? Cycles per instruction?



## Conclusion

Instruction Set Architecture is the key abstraction between hardware designer and software developers

- Machine Organizations
  - Memory-to-Memory Machines
  - Accumulator
  - \*Stack
  - \*General Purpose Register Machines