I am learning about pipelining and was reading about control hazards from the book Computer Organization and Design: The Hardware/Software Interface (MIPS Edition). There is a paragraph in the book (Chapter 4.6) that has me puzzled:
Let's assume that we put in enough extra hardware so that we can test registers, calculate the branch address, and update the PC during the second stage of the pipeline (see COD Section 4.9 (Control hazards) for details). Even with this extra hardware, the pipeline involving conditional branches would look like the figure below. The lw instruction, executed if the branch fails, is stalled one extra 200 ps clock cycle before starting.
I don't quite understand what the paragraph is saying here. My initial guess was that, even in a hypothetical scenario where you can determine the branch outcome and update the program counter within the one clock cycle available before the next instruction must be fetched, we would still need a stall. But that doesn't make sense to me: if we know what to do, why not just do it? I assume I am missing something, but I can't piece it together.
To sustain 1 instruction per clock (1 IPC) with no stalls, the IF stage needs to be fetching literally every cycle. That requires an address to fetch from.
But if it takes an extra cycle after a fetch to compute the next place to fetch from (branch latency of 1 cycle), that's a cycle you can't be fetching (or might not be usefully fetching).
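To make the timing concrete, here is a minimal sketch of a toy model (not the book's simulator, and with no branch prediction) that computes the cycle in which each instruction can be fetched. The function name `fetch_cycles`, the `resolve_stage` parameter, and the three-instruction program are assumptions made up for illustration; only the 200 ps clock comes from the quoted paragraph.

```python
CLOCK_PS = 200  # clock period from the quoted paragraph


def fetch_cycles(program, resolve_stage=2):
    """Return the 1-based cycle in which each instruction is fetched.

    program: list of (name, is_branch) tuples in execution order.
    resolve_stage: stage at whose end the branch's next PC is known
                   (2 = ID, as in the quoted paragraph).
    Assumption: no prediction, so fetch simply waits for the outcome.
    """
    cycles = []
    next_fetch = 1
    for name, is_branch in program:
        cycles.append(next_fetch)
        if is_branch:
            # A branch fetched in cycle c is in stage k during cycle c + k - 1,
            # so the new PC is first usable for a fetch in cycle c + k.
            next_fetch += resolve_stage
        else:
            next_fetch += 1
    return cycles


program = [("add", False), ("beq", True), ("lw", False)]
cycles = fetch_cycles(program)
for (name, _), c in zip(program, cycles):
    print(f"{name}: fetched in cycle {c}, starting at {(c - 1) * CLOCK_PS} ps")
```

With the branch resolved in the second stage, `lw` is fetched in cycle 4 instead of cycle 3, which is the one-cycle (200 ps) stall. In this model the bubble only disappears with `resolve_stage=1`, i.e. when the next PC is already known by the end of the branch's own fetch cycle.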
Even unconditional branches need prediction if you want a branch latency of 0 (no stalls). Related: even a direct jmp (or MIPS j or b) needs prediction if you want to run them at 1/clock.
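As a rough illustration of why prediction helps even for unconditional jumps, the sketch below models a fetch stage that must guess the next PC in the same cycle as the fetch. The `run` function, the dictionary used as a toy branch target buffer, and the two-instruction trace are all hypothetical; real predictors are considerably more involved.

```python
def run(trace, use_btb):
    """Count bubble cycles for a trace of (pc, actual_next_pc) fetches.

    Assumption: a wrong next-PC guess costs one bubble, because the real
    target only becomes known in the second pipeline stage.
    """
    btb = {}  # pc -> last observed next PC (toy branch target buffer)
    bubbles = 0
    for pc, actual_next in trace:
        # Without a BTB entry, the only available guess is fall-through.
        guess = btb.get(pc, pc + 4) if use_btb else pc + 4
        if guess != actual_next:
            bubbles += 1
            if use_btb:
                btb[pc] = actual_next  # remember the target for next time
    return bubbles


# A two-instruction loop: 0x00 falls through to 0x04, and 0x04 is "j 0x00".
trace = [(0x00, 0x04), (0x04, 0x00)] * 5
print(run(trace, use_btb=False))
print(run(trace, use_btb=True))
```

Without the buffer, every execution of the jump costs a bubble, since the fetch stage cannot know it fetched a jump until the instruction has been decoded. With it, only the first encounter does.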