### Midterms on the table

### Lecture 18: Pipelining



Last names: A-D E-L M-R S-Z

(for the most part)

• Today's topics:

- Power and energy
- 5-stage pipeline
- Hazards
- Data dependence handling with bypassing
- Data dependence examples

Email me if your midterm score on Canvas is zero!!

> HW7 posted later today

#### Midterm Notes

### 240 43 45

• Grade categories:



- Common mistakes:
  - Support for subtraction: not including the carry
  - IEEE 754 floating point formats and addition
  - Addition of signed numbers
  - Logic gates, sum-of-products
  - MARS variables, syscalls
  - Power and energy !!!



# Power and Energy


# A 5-Stage Pipeline

Cycle time = 1.2 ns



# Performance Improvements?

• Does it take longer to finish each individual job?

• Does it take less time to finish a series of jobs?

Yes, because of parallelism

- What assumptions were made while answering these questions?
  - No dependences between instructions
  - Easy to partition circuits into uniform pipeline stages
  - No latch overhead

5x improvement under ideal conditions

• Is a 10-stage pipeline better than a 5-stage pipeline?
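One way to reason about the 10-stage question is to put the latch overhead back into the picture. The sketch below is a minimal model and not something taken from the lecture: `logic_ns` (total delay of the unpipelined circuit) and `latch_ns` (overhead added to every stage) are hypothetical parameters, the stages are assumed to be perfectly uniform, and there are no stalls.

```python
def cycle_time_ns(logic_ns: float, stages: int, latch_ns: float) -> float:
    """Cycle time when the logic is split evenly and each stage pays one latch delay."""
    return logic_ns / stages + latch_ns


def throughput_speedup(logic_ns: float, stages: int, latch_ns: float) -> float:
    """Speedup in instructions completed per ns over the unpipelined circuit.

    Assumes the unpipelined circuit finishes one instruction every `logic_ns`
    and the pipeline finishes one instruction every cycle (no stalls).
    """
    return logic_ns / cycle_time_ns(logic_ns, stages, latch_ns)


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    for stages in (5, 10):
        print(stages,
              round(cycle_time_ns(5.0, stages, 0.2), 3),
              round(throughput_speedup(5.0, stages, 0.2), 3))
```

With these made-up numbers, going from 5 to 10 stages improves throughput by less than 2x, because the latch overhead does not shrink along with the stage length. Deeper pipelines also tend to expose more hazards, which this model ignores.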

### **Quantitative Effects**

Going from unpipelined to pipelined

- As a result of pipelining:
  - Time in ns per instruction goes up (because of latch overhead)
  - Each instruction takes more cycles to execute
  - But... average CPI remains roughly the same
  - Clock speed goes up
  - Total execution time goes down, resulting in lower average time per instruction
  - Under ideal conditions, speedup
    - = ratio of elapsed times between successive instruction completions
    - = number of pipeline stages = increase in clock speed
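The ideal-speedup claim can be checked with a short derivation. The symbols are assumptions introduced here, not taken from the slides: an unpipelined instruction takes time $T$, the pipeline has $k$ perfectly uniform stages with no latch overhead and no stalls, and $N$ instructions are executed.

$$
t_{\text{unpipelined}} = N\,T, \qquad
t_{\text{pipelined}} = (k + N - 1)\,\frac{T}{k}
$$

$$
\text{speedup} = \frac{t_{\text{unpipelined}}}{t_{\text{pipelined}}}
= \frac{N\,k}{k + N - 1} \;\longrightarrow\; k \quad \text{as } N \to \infty
$$

The first instruction needs $k$ cycles of length $T/k$ to drain through the pipeline, and every later instruction completes one cycle after its predecessor. That is why the ratio of times between successive completions, $T / (T/k) = k$, equals the number of stages.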

# Hazards

- Structural hazards: different instructions in different stages (or the same stage) conflicting for the same resource
- Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction
- Control hazards: fetch cannot continue because it does not know the outcome of an earlier branch. This is a special case of a data hazard, kept as a separate category because it is handled differently

# **Conflicts/Problems**




- I-cache and D-cache are accessed in the same cycle: it helps to implement them separately
- Registers are read and written in the same cycle: easy to deal with if register read/write time equals cycle time/2
- Instructions can't skip the DM stage, else they conflict for RW
- Consuming instruction may have to wait for producer
- Branch target changes only at the end of the second stage: what do you do in the meantime?

#### Structural Hazards

- Example: a unified instruction and data cache → stage 4 (MEM) and stage 1 (IF) can never coincide
- The later instruction and all its successors are delayed until a cycle is found when the resource is free → these are pipeline bubbles
- Structural hazards are easy to eliminate: increase the number of resources (for example, implement a separate instruction and data cache, add more register ports)

### **Data Hazards**



- An instruction *produces* a value in a given pipeline stage
- A subsequent instruction consumes that value in a pipeline stage
- The consumer may have to be delayed so that the time of consumption is later than the time of production
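The delay rule above can be written down as a small calculation. This is a sketch under stated assumptions: the 5-stage pipeline of this lecture (IF, D/R, ALU, DM, RW), no bypassing, a register written in the first half of a cycle and read in the second half (so a same-cycle write and read is fine), and no other stalls in the pipeline. The function name and the `distance` parameter are hypothetical.

```python
DR_STAGE = 2  # registers are read in the 2nd stage (D/R)
RW_STAGE = 5  # registers are written in the 5th stage (RW)


def stalls_without_bypassing(distance: int) -> int:
    """Stall cycles for a consumer that is `distance` instructions behind its producer.

    The producer is fetched in cycle 1, so it writes its result in cycle RW_STAGE.
    Without stalls, the consumer is fetched in cycle 1 + distance and would read
    its registers in cycle distance + DR_STAGE.
    """
    producer_write_cycle = RW_STAGE
    consumer_read_cycle = distance + DR_STAGE
    return max(0, producer_write_cycle - consumer_read_cycle)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, stalls_without_bypassing(d))
```

A consumer immediately behind its producer waits two cycles; three or more instructions back, it does not wait at all.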

# Example 1 – No Bypassing

• Show the instruction occupying each stage in each cycle (no bypassing) if I1 is R1+R2→R3 and I2 is R3+R4→R5 and I3 is R7+R8→R9

| CYC-1    | CYC-2     | CYC-3     | CYC-4     | CYC-5     | CYC-6     | CYC-7     | CYC-8    |
|----------|-----------|-----------|-----------|-----------|-----------|-----------|----------|
| IF<br>I1 | IF<br>I2  | IF<br>I3  | IF<br>I3  | IF<br>I3  | IF<br>I4  | IF<br>I5  | IF       |
| D/R      | D/R<br>I1 | D/R<br>I2 | D/R<br>I2 | D/R<br>I2 | D/R<br>I3 | D/R<br>I4 | D/R      |
| ALU      | ALU       | ALU<br>I1 | ALU       | ALU       | ALU<br>I2 | ALU<br>I3 | ALU      |
| DM       | DM        | DM        | DM<br>I1  | DM        | DM        | DM<br>I2  | DM<br>I3 |
| RW       | RW        | RW        | RW        | RW<br>I1  | RW        | RW        | RW<br>I2 |

CPI = 5/3, IPC = 3/5 = 0.6 (3 instructions plus 2 stall cycles)
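The Example 1 schedule can be reproduced with a few lines of code. This is a minimal sketch, not the lecture's method: it assumes an in-order pipeline, no bypassing, and a register file that is written in the first half of a cycle and read in the second half, so an instruction may finish D/R in the same cycle its producer is in RW. `schedule_no_bypass`, `stage_row`, and `PROGRAM` are names made up for this illustration.

```python
PROGRAM = [            # (name, destination, source 1, source 2)
    ("I1", "R3", "R1", "R2"),
    ("I2", "R5", "R3", "R4"),
    ("I3", "R9", "R7", "R8"),
]


def schedule_no_bypass(instrs):
    """Return {name: {stage: [cycles occupied]}} for a 5-stage in-order pipeline."""
    rw_cycle_of_reg = {}                    # register -> RW cycle of its latest producer
    prev_dr_start = prev_alu_start = None
    timeline = {}
    for name, dest, *srcs in instrs:
        # Enter IF when the previous instruction moves on to D/R.
        if_start = 1 if prev_dr_start is None else prev_dr_start
        # Enter D/R one cycle later, but not before the previous instruction leaves it.
        dr_start = if_start + 1
        if prev_alu_start is not None:
            dr_start = max(dr_start, prev_alu_start)
        # The last D/R cycle must be no earlier than every producer's RW cycle.
        ready = max((rw_cycle_of_reg.get(r, 0) for r in srcs), default=0)
        alu_start = max(dr_start + 1, ready + 1)
        timeline[name] = {
            "IF": list(range(if_start, dr_start)),
            "D/R": list(range(dr_start, alu_start)),
            "ALU": [alu_start],
            "DM": [alu_start + 1],
            "RW": [alu_start + 2],
        }
        rw_cycle_of_reg[dest] = alu_start + 2
        prev_dr_start, prev_alu_start = dr_start, alu_start
    return timeline


def stage_row(timeline, stage, cycles=8):
    """Instruction occupying `stage` in cycles 1..cycles ('' when the stage is empty)."""
    row = []
    for c in range(1, cycles + 1):
        occupants = [n for n, t in timeline.items() if c in t[stage]]
        row.append(occupants[0] if occupants else "")
    return row


if __name__ == "__main__":
    t = schedule_no_bypass(PROGRAM)
    for stage in ("IF", "D/R", "ALU", "DM", "RW"):
        print(stage, stage_row(t, stage))
```

Only I1 to I3 are modeled, so the I4 and I5 entries of the table do not appear in the output.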

# Example 2 – Bypassing

• Show the instruction occupying each stage in each cycle (with bypassing) if I1 is R1+R2→R3 and I2 is R3+R4→R5 and I3 is R3+R8→R9. Identify the input latch for each input operand.




| CYC-1    | CYC-2     | CYC-3              | CYC-4              | CYC-5              | CYC-6    | CYC-7    | CYC-8 |
|----------|-----------|--------------------|--------------------|--------------------|----------|----------|-------|
| IF<br>I1 | IF<br>I2  | IF<br>I3           | IF<br>I4           | IF<br>I5           | IF       | IF       | IF    |
| D/R      | D/R<br>I1 | D/R<br>I2          | D/R<br>I3          | D/R<br>I4          | D/R      | D/R      | D/R   |
| ALU      | ALU       | ALU<br>I1<br>L3 L3 | ALU<br>I2<br>L4 L3 | ALU<br>I3<br>L5 L3 | ALU      | ALU      | ALU   |
| DM       | DM        | DM                 | DM<br>I1           | DM<br>I2           | DM<br>I3 | DM       | DM    |
| RW       | RW        | RW                 | RW                 | RW<br>I1           | RW<br>I2 | RW<br>I3 | RW    |
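The latch choice in Example 2 follows a simple pattern that can be sketched in code. The latch naming is an inference from the table, not something the slides state explicitly: L3 is taken to be the latch feeding the ALU from D/R, L4 the latch after ALU, and L5 the latch after DM. The sketch also assumes every producer is an ALU instruction (no loads) and that nothing stalls. `operand_latches` and `PROGRAM` are hypothetical names.

```python
PROGRAM = [            # (name, destination, source 1, source 2)
    ("I1", "R3", "R1", "R2"),
    ("I2", "R5", "R3", "R4"),
    ("I3", "R9", "R3", "R8"),
]

# Distance (in instructions) from producer to consumer -> latch holding the value
# when the consumer is in the ALU stage. Anything farther away, or never written,
# is read from the register file and arrives through L3.
BYPASS_LATCH = {1: "L4", 2: "L5"}


def operand_latches(instrs):
    """Return {name: [latch for source 1, latch for source 2]}."""
    last_writer = {}                        # register -> index of latest producer
    result = {}
    for idx, (name, dest, *srcs) in enumerate(instrs):
        latches = []
        for reg in srcs:
            if reg in last_writer:
                latches.append(BYPASS_LATCH.get(idx - last_writer[reg], "L3"))
            else:
                latches.append("L3")
        result[name] = latches
        last_writer[dest] = idx
    return result


if __name__ == "__main__":
    for name, latches in operand_latches(PROGRAM).items():
        print(name, *latches)
```

For the three instructions of Example 2 this gives the same latch pairs as the table: the value of R3 is picked up from L4 by I2 and from L5 by I3, while all other operands come through L3.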

### Problem 1



### Problem 2



### Problem 3

