| System: Four-Way Superscalar | Running: The gcc Compiler (cc1) |
The plots below show the execution of a short pointer-chasing loop in which almost every link points to data that is not in the level-1 cache. No other instructions can execute during the 18-cycle load delay (except a second load, which is also delayed), so instructions quickly pile up. About 20% of the way through the fragment, a branch is mispredicted (the misprediction is not discovered until the end of the fragment). About 25% of the way through, the reorder buffer fills, so fetching must wait for instructions to commit.
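
To make the dependence pattern concrete, here is a minimal sketch of a pointer-chasing loop. It is not the actual cc1 code, which is not shown here; the names `Node`, `build_list`, `chase`, and `serialized_stall_cycles` are hypothetical. The only figure taken from the description above is the 18-cycle load delay. The estimate assumes that every link misses the level-1 cache and that each load must finish before the next address is known.

```python
from dataclasses import dataclass
from typing import Optional

# Load delay in cycles, as stated in the description above.
LOAD_MISS_DELAY = 18


@dataclass
class Node:
    """One list element (hypothetical layout, for illustration only)."""
    value: int
    next: Optional["Node"] = None


def build_list(values):
    """Build a singly linked list holding `values` in order."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def chase(head):
    """Walk the list; return (sum of values, number of links followed)."""
    total = 0
    links = 0
    node = head
    while node is not None:
        total += node.value
        # The address of the next node is not known until this load
        # completes, so successive loads cannot be overlapped.
        node = node.next
        links += 1
    return total, links


def serialized_stall_cycles(links, miss_delay=LOAD_MISS_DELAY):
    """Rough estimate of load delay, assuming every link misses the
    level-1 cache and the loads form one dependent chain."""
    return links * miss_delay
```

For example, `chase(build_list([1, 2, 3, 4]))` follows four links, and under the stated assumptions `serialized_stall_cycles(4)` gives 72 cycles of load delay. Because each address comes from the previous load, a wider machine cannot hide this delay by issuing the loads in parallel, which is consistent with instructions piling up behind the stalled loads.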