EE 4720 Computer Architecture - HW 2 Solution (Spring 1998)
Problem 1
Execution is shown below. The diagram is wide, so you'll have to scroll
or maximize your browser window. Tree killers should remember that
many browsers have a landscape option in their print command.
Two instructions, i1 and i2, are included after the branch. They never
execute but they are fetched.
Note that the execution of a single instruction uses just one line, and
MEM is abbreviated to ME. Abandoned instructions are shown in gray.
!Cycle 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
add IF ID EX ME WB
add IF ID EX ME WB
Loop:
lw IF ID ----> EX ME WB IF ID EX ME WB IF ID EX ME WB
add IF ----> ID ----> EX ME WB IF ID ----> EX ME WB IF ID ----> EX ME WB
slt IF ----> ID ----> EX ME WB IF ----> ID ----> EX ME WB IF ----> ID ----> EX ME WB
addi IF ----> ID EX ME WB IF ----> ID EX ME WB IF ----> ID EX ME WB
bneq IF ID -> EX ME WB IF ID -> EX ME WB IF ID -> EX ME WB
i1 IF -> ID EX ME WB IF -> ID EX ME WB IF -> ID EX ME WB
i2 IF ID EX ME WB IF ID EX ME WB IF ID EX ME WB
Problem 2
First, certain values need to be assumed since they weren't provided
in the problem. The assumed values are in the table below.
Location | Assumed Value | Comment |
LOOP | 0x2000 | Address of lw instruction. |
r3 | 2 | Branch condition register before being set. |
r4 | 50 | Sum limit. |
r10 | 0x1000 | Part of array address. |
r11 | 0x10 | Another part of array address. |
Mem[0x1010] | 10 | First element. |
The code fragment is shown below along with the register values that change in
the first iteration:
!! r4 holds a limit
!! r5 holds the first array element address
add r2, r0, r0 ! r2 = 0
add r5, r10, r11 ! r5 = 0x1010
LOOP:
lw r6, 0(r5) ! r6 = 10
add r2, r2, r6 ! r2 = 10
slt r3, r2, r4 ! r3 = 1
addi r5, r5, #4 ! r5 = 0x1014
bneq r3, LOOP
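The register comments above can be checked with a short sketch. It is not part of the original solution: it simply executes the instructions one at a time in program order (no pipeline timing), using the assumed values from the table, with Python dictionaries standing in for the register file and memory.

```python
# Sequential (non-pipelined) check of the first iteration.
# All initial values are the assumed values from the table above.
regs = {"r0": 0, "r3": 2, "r4": 50, "r10": 0x1000, "r11": 0x10}
mem = {0x1010: 10}  # first array element

regs["r2"] = regs["r0"] + regs["r0"]              # add  r2, r0, r0
regs["r5"] = regs["r10"] + regs["r11"]            # add  r5, r10, r11
# LOOP, first iteration:
regs["r6"] = mem[regs["r5"] + 0]                  # lw   r6, 0(r5)
regs["r2"] = regs["r2"] + regs["r6"]              # add  r2, r2, r6
regs["r3"] = 1 if regs["r2"] < regs["r4"] else 0  # slt  r3, r2, r4
regs["r5"] = regs["r5"] + 4                       # addi r5, r5, #4
branch_taken = regs["r3"] != 0                    # bneq r3, LOOP

for name in ("r2", "r3", "r5", "r6"):
    print(name, hex(regs[name]))
print("branch taken:", branch_taken)
```

These are the values after each instruction has completed; they say nothing about which values are visible in a given pipeline stage at a given cycle.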
The code above doesn't show the state of the pipeline when addi
is in the MEM stage. For example, r3 still contains the old value, not the
value specified by the slt instruction. The register values are:
Location | Value | Comment |
r2 | 10 | This register is current. |
r3 | 2 | The "current" value in WB stage. |
r4 | 50 | Never changes. |
r5 | 0x1010 | The "current" value in the MEM stage. |
r6 | 10 | This register is current. |
Latch | Contents | Comment |
IF.PC | 0x2014 | Address of instruction i1 |
IF/ID.NPC | 0x2014 | Address of instruction i1 |
IF/ID.IR | bneq r3, LOOP | |
ID/EX.** | ?? | Because of the stall, EX contains no "real" instruction. |
EX/MEM.ALU OUT | 0x1014 | addi sum bound for r5 |
EX/MEM.B | ?? | |
MEM/WB.ALU OUT | 1 | slt condition bound for r3 |
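The address values in the latch table follow from the assumed LOOP address. A minimal sketch, assuming fixed 4-byte instructions laid out consecutively starting at LOOP (the names `INSTR_BYTES` and `addr` are illustrative, not from the problem):

```python
LOOP = 0x2000     # assumed address of lw (from the table of assumed values)
INSTR_BYTES = 4   # assumption: every instruction is 4 bytes long

# Instructions in program order, starting at LOOP.
names = ["lw", "add", "slt", "addi", "bneq", "i1", "i2"]
addr = {name: LOOP + INSTR_BYTES * i for i, name in enumerate(names)}

for name in names:
    print(f"{name:5s} {addr[name]:#06x}")

# With bneq in ID, IF/ID.NPC holds the address after bneq,
# which is also the address of i1, the instruction being fetched.
npc = addr["bneq"] + INSTR_BYTES
print("IF/ID.NPC =", hex(npc))
```

Under these assumptions the address following bneq and the address of i1 are the same, which is why IF.PC and IF/ID.NPC hold the same value in the table.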
Problem 3
The CPI can easily be found if each iteration of the loop executes
the same way, as happens here after the first iteration (assuming no
cache misses). To find the CPI, find the number of cycles separating
two corresponding points in consecutive iterations. A convenient
corresponding point for the code above is the cycle when
lw is in the IF stage. This occurs at cycles 2, 17, and
30. Since iteration one is different from the others, and since it is
clear that future iterations will look like iterations 2 and 3, the
corresponding points in iterations 2 and 3 will be used. The number of
cycles is 30-17=13 and the number of instructions is 5, so the CPI is
13/5 = 2.6.
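The arithmetic can be written out as a short sketch. The cycle numbers are the ones read from the pipeline diagram; the variable names are illustrative only.

```python
# Cycles in which lw is in IF, one entry per iteration (from the diagram).
lw_if_cycles = [2, 17, 30]
instructions_per_iteration = 5  # lw, add, slt, addi, bneq

# Use iterations 2 and 3, since iteration 1 executes differently.
cycles_per_iteration = lw_if_cycles[2] - lw_if_cycles[1]
cpi = cycles_per_iteration / instructions_per_iteration

print("cycles per iteration:", cycles_per_iteration)
print("CPI:", cpi)
```

Measuring between iterations 1 and 2 instead would give 17-2=15 cycles, which is why the first iteration is excluded from the steady-state figure.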