### Simple Processor Design: Multiple Cycle Implementation

Chapter 5.5, EEC170 FQ 2005. Courtesy of Prof. Kent Wilken.

### **Multicycle Implementation**

- Instructions that use more functional units (e.g., Load) take more cycles
- Instructions that use fewer units (e.g., Jump) take fewer cycles
- CPI x CCT may be lower than for the single cycle design

### **Multicycle Implementation**

- New latches hold intermediate results between clock cycles
  - IR and MDR get output of memory
  - A and B get output of Register File
  - ALUout gets output of ALU

[Figure: multicycle datapath; surviving labels: Address, Instruction Register, Read Reg 1, Read Reg 2]

### **Multicycle Implementation: LW**

- Instruction fetch

[Figure: datapath during instruction fetch; surviving labels: PC, Address, Instruction, Registers, Read Reg 1, Read Reg 2, Write Reg]

### **Multicycle Clock Cycle Time**

- CCT determined by slowest functional unit:
  - Register file: 50ps
  - ALU and adders: 100ps
  - Memory: 200ps

### **Multicycle CPI**

- Cycles for each instruction class are:
  - Load: 5
  - Store: 4
  - ALU Op: 4
  - Branch: 3
  - Jump: 3
- SPECint2000 instruction mix:
  - Load: 25%
  - Store: 10%
  - ALU Op: 52%
  - Branch: 11%
  - Jump: 2%

### **Multicycle CPI Computation**

$$
\begin{aligned}
\text{CPI}_{\text{total}} &= \sum_i \text{CPI}_i \times f_i \\
&= 5 \times 0.25 + 4 \times 0.10 + 4 \times 0.52 + 3 \times 0.11 + 3 \times 0.02 \\
&= 1.25 + 0.4 + 2.08 + 0.33 + 0.06 \\
&= 4.12
\end{aligned}
$$

### **Multicycle Performance**

- Good news:
  - CCT = 200ps, 3x lower than single cycle design
  - Only one adder, one memory unit
- Bad news:
  - CPI is higher by about 4x than single cycle design
  - Control unit is much more complex (see next lecture)
- Average instruction execution time (IET) = CPI x CCT = 4.12 x 200ps = 824ps
- Worse performance than single cycle at 600ps!

### **State Assignment**

A number is assigned to each state. Many assignments are possible, 16!/6!
for this machine (10 states drawn from the 16 codes of a 4-bit state register). The assignment is usually made to minimize hardware cost. Here the assignment is made to improve clarity.

### **Truth Table for Control Lines**

| Outputs (by input value S[3-0]) | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 |
|-------------|------|------|------|------|------|------|------|------|------|------|
| PCWrite     | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| PCWriteCond | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| IorD        | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| MemRead     | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| MemWrite    | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| IRWrite     | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| MemtoReg    | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| PCSource1   | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| PCSource0   | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| ALUOp1      | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| ALUOp0      | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| ALUSrcB1    | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ALUSrcB0    | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ALUSrcA     | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
| RegWrite    | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| RegDst      | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |

### **Exceptions**

- An exception is an event that occurs within the processor and requires intervention from the OS
- An exception causes execution to change from the current program to the OS exception handler
- Examples of exceptions include:
  - Address bounds violation
  - Arithmetic overflow
  - Divide by zero
  - Illegal opcode
  - Call to OS from user program

### **Exception Handling**

- An exception is much like a procedure call
- Requires address of where exception occurred
  - So OS can return to program after exception handling, if appropriate
  - So exception handler can identify instruction causing exception for proper exception processing
- Requires a parameter indicating what caused the exception
- Address and parameter cannot be written to normal registers, otherwise
current program state would be destroyed
- Special exception registers are required

### **Datapath Support for Exceptions**

- All exceptions jump to a fixed, hard-wired address within OS: exception handler entry point
- Address of where exception occurred is stored in exception program counter (EPC)
  - Read by exception handler using special instruction
- Exception parameter is stored in cause register
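
The datapath behavior listed above can be sketched in a few lines of Python. This is only an illustrative model, not the lecture's hardware: the class `Cpu`, the method `raise_exception`, the handler address `HANDLER_ENTRY`, and the cause encodings are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical values chosen for illustration only; the slides do not
# give the handler address or the cause encodings.
HANDLER_ENTRY = 0x8000_0180
CAUSE_ILLEGAL_OPCODE = 0
CAUSE_OVERFLOW = 1


@dataclass
class Cpu:
    """Toy model of the exception-related state in the datapath."""
    pc: int = 0
    epc: int = 0      # exception program counter
    cause: int = 0    # cause register
    regs: list = field(default_factory=lambda: [0] * 32)

    def raise_exception(self, cause: int, faulting_pc: int) -> None:
        # Save where the exception occurred in the special EPC register,
        # so the normal registers keep the program's state.
        self.epc = faulting_pc
        # Record the exception parameter in the cause register.
        self.cause = cause
        # Jump to the fixed handler entry point.
        self.pc = HANDLER_ENTRY


# Usage: an overflow detected for the instruction at 0x0040001C.
cpu = Cpu(pc=0x0040_0020)
cpu.regs[8] = 1234
cpu.raise_exception(CAUSE_OVERFLOW, faulting_pc=0x0040_001C)
```

After the call, `cpu.regs` is unchanged, which is the reason the address and the parameter go to dedicated registers rather than to the register file.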