Assignment 8
Due: 10:45am, Tue Apr 2nd, 2024
Note: Make reasonable assumptions where necessary and clearly state them.
Feel free to discuss problems with classmates, but the only written materials
that you may consult while writing your solutions are the textbook
and the lecture slides/videos.
Solutions should be uploaded to Gradescope.
Show your solution steps so you receive partial credit for incorrect
answers and we know you have understood the material. Don't just show us the
final answer.
Every homework has an automatic penalty-free 1.5-day extension to
accommodate any COVID/family-related disruptions. In other words, try to
finish your homework by Tuesday 10:45am to keep up with the lecture
content, but if necessary, you may take until Wednesday 11:59pm.
- Consider an in-order 5-stage pipeline similar to the one discussed in
class, e.g., see slides 3-9 of lecture 19. First, assume that the pipeline does
not support bypassing (forwarding). How many stall cycles are introduced
between the two instructions in each of the following back-to-back pairs?
Then, solve the same problem assuming support for bypassing. Clearly show
your work: show how each instruction goes through the 5 stages, indicate the
point of production and the point of consumption, and show how the consuming
instruction is held back in the D/R stage when there are stalls (similar to
the example on slide 3 of lecture 19). Recall that a register
read is performed in the second half of the D/R stage and a register write
is performed in the first half of the RW stage. (60 points)
  - add $1, $2, $3
    add $4, $1, $2
  - lw $1, 8($2)
    add $4, $1, $3
  - lw $1, 8($2)
    sw $3, 8($1)
  - lw $1, 8($2)
    sw $1, 8($4)
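One way to lay out the timing diagrams requested in this question is sketched below in Python. It is only an illustrative helper, not part of the assignment or its solution. The stage names IF, ALU, and DM are assumed (the question itself names only D/R and RW), the function names `stalls_needed`, `timeline`, and `render` are hypothetical, and the cycle numbers in the usage lines are placeholders to be replaced by the production and consumption points you read off your own diagram.

```python
# Stage names: D/R and RW come from the question; IF, ALU, DM are assumed.
STAGES = ["IF", "D/R", "ALU", "DM", "RW"]


def stalls_needed(ready_cycle, need_cycle):
    """Stall cycles required when a value first becomes usable in
    ready_cycle but an unstalled consumer would need it in need_cycle."""
    return max(0, ready_cycle - need_cycle)


def timeline(start_cycle, stalls=0):
    """Stage occupied in each cycle (cycle 1 is index 0) for an instruction
    fetched in start_cycle and held in D/R for `stalls` extra cycles."""
    row = [""] * (start_cycle - 1)
    for stage in STAGES:
        row.append(stage)
        if stage == "D/R":
            row.extend(["D/R"] * stalls)  # consumer waits in D/R
    return row


def render(labelled_rows):
    """Format (label, row) pairs as a fixed-width text table."""
    n_cycles = max(len(row) for _, row in labelled_rows)
    label_w = max(len(label) for label, _ in labelled_rows)
    header = " " * label_w + " | " + " ".join(
        f"{c:<3}" for c in range(1, n_cycles + 1)
    )
    lines = [header]
    for label, row in labelled_rows:
        cells = row + [""] * (n_cycles - len(row))
        lines.append(
            f"{label:<{label_w}} | " + " ".join(f"{s:<3}" for s in cells)
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder cycle numbers, chosen only to show the mechanics.
    k = stalls_needed(ready_cycle=4, need_cycle=3)
    print(render([("producer", timeline(1)), ("consumer", timeline(2, k))]))
```

The sketch keeps the stall count separate from the rendering, so the same `render` call can be reused for the no-bypassing and bypassing versions of a pair once you have worked out each stall count by hand.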
- Consider a program that executes a large number of instructions.
Assume that the program does not suffer from stalls from data hazards
or structural hazards.
Assume that 16% of all instructions are branch instructions, and
20% of these branch instructions are Taken. What is the average
CPI for this program when it executes on each of the processors
listed below? All of these processors implement a 5-stage in-order pipeline
and resolve a branch outcome at the end of the 2nd stage (similar to the
5-stage pipeline discussed in class).
If it helps, assume that the program has 100 total
instructions and would finish in 100 cycles (CPI = 1.0) if it encountered
zero stall cycles. Then, figure out the stall cycles for each of the cases
below. For example, 10 stall cycles would equate to an execution time of
110 cycles and a CPI of 1.1. (40 points)
  - The processor pauses instruction fetch as soon as it fetches a branch.
    Instruction fetch is resumed after the branch outcome has been
    resolved.
  - The processor always fetches instructions sequentially, i.e., it
    predicts every branch as being Not-Taken. If a
    branch is resolved as Taken, the incorrectly fetched instructions
    after the branch are squashed.
  - The processor implements a branch delay slot. The compiler is able to
    fill the branch delay slot with an instruction that comes
    before the branch in the original code.
  - The processor does not implement branch delay slots. Instead, it
    implements a hardware branch predictor that makes
    correct predictions for 96% of all branches. When an incorrect
    prediction is discovered, the incorrectly fetched instructions
    after the branch are squashed.
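The arithmetic in the hint for this question (100 instructions, 10 stall cycles, CPI of 1.1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a solution: the function names `cpi_with_stalls` and `branch_stall_cycles` are hypothetical, and the values in the second usage example (25% branches, 40% penalized, 2-cycle penalty) are made-up placeholders that deliberately differ from the numbers in the question. Deciding which fraction of branches is penalized, and by how many cycles, for each processor is the part left to you.

```python
def cpi_with_stalls(instructions, stall_cycles):
    """CPI of a pipeline that would otherwise achieve CPI = 1.0."""
    return (instructions + stall_cycles) / instructions


def branch_stall_cycles(instructions, branch_fraction,
                        penalized_fraction, penalty_cycles):
    """Total stall cycles when `penalized_fraction` of all branches each
    cost `penalty_cycles` cycles (all parameters are supplied by the caller)."""
    return instructions * branch_fraction * penalized_fraction * penalty_cycles


if __name__ == "__main__":
    # The example from the hint: 10 stall cycles on 100 instructions.
    print(cpi_with_stalls(100, 10))

    # Placeholder values, NOT the ones from the question.
    stalls = branch_stall_cycles(100, 0.25, 0.40, 2)
    print(round(cpi_with_stalls(100, stalls), 3))
```

Because only the ratio of stall cycles to instructions matters, the result is the same for any instruction count, which is why assuming 100 instructions is a safe simplification.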