# EE 4720 Computer Architecture - HW 2 Solution (Spring 1998)

### Problem 1

 Execution is shown below. The diagram is wide, so you'll have to scroll or maximize your browser window. Tree killers should remember that many browsers have a landscape option in their print command. Two instructions, `i1` and `i2`, are included after the branch. They are fetched but never execute.
 Note that the execution of a single instruction uses just one line, and MEM is abbreviated to ME. Abandoned instructions are shown in gray.
```
!Cycle  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
add    IF ID EX ME WB
add       IF ID EX ME WB
Loop:
lw           IF ID ----> EX ME WB                         IF ID EX ME WB                         IF ID EX ME WB
add             IF ----> ID ----> EX ME WB                   IF ID ----> EX ME WB                   IF ID ----> EX ME WB
slt                      IF ----> ID ----> EX ME WB             IF ----> ID ----> EX ME WB             IF ----> ID ----> EX ME WB
addi                              IF ----> ID EX ME WB                   IF ----> ID EX ME WB                   IF ----> ID EX ME WB
bneq                                       IF ID -> EX ME WB                      IF ID -> EX ME WB                      IF ID -> EX ME WB

i1                                            IF -> ID EX ME WB                      IF -> ID EX ME WB                      IF -> ID EX ME WB
i2                                                  IF ID EX ME WB                         IF ID EX ME WB                         IF ID EX ME WB
```
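The stall pattern in the diagram can be reproduced with a small timing model. The sketch below is not part of the original solution. It assumes the rules the diagram appears to follow: no bypassing, a register written in WB can be read by an instruction in ID during the same cycle, and the branch target is fetched the cycle after the branch is in MEM. The names `schedule`, `PROLOGUE`, and `LOOP_BODY` are made up for illustration, and the abandoned instructions `i1` and `i2` are not modeled.

```python
# Timing rules inferred from the diagram (assumptions, not given in the
# problem): no bypassing, ID may read a register in the same cycle its
# producer is in WB, and the branch target is fetched the cycle after
# the branch is in MEM. Each entry is (mnemonic, destination, sources).
PROLOGUE = [("add", "r2", ["r0", "r0"]),
            ("add", "r5", ["r10", "r11"])]
LOOP_BODY = [("lw", "r6", ["r5"]),
             ("add", "r2", ["r2", "r6"]),
             ("slt", "r3", ["r2", "r4"]),
             ("addi", "r5", ["r5"]),
             ("bneq", None, ["r3"])]


def schedule(iterations):
    """Return (mnemonic, IF, ID, EX, ME, WB) rows; each number is the
    first cycle the instruction spends in that stage."""
    trace = PROLOGUE + LOOP_BODY * iterations
    wb_of = {}        # register -> cycle in which its newest value is written
    rows = []
    prev_id = prev_ex = 0
    redirect = None   # fetch cycle forced by a taken branch
    for i, (op, dest, srcs) in enumerate(trace):
        if i == 0:
            fetch = 0
        elif redirect is not None:
            fetch, redirect = redirect, None
        else:
            fetch = prev_id                   # IF frees when predecessor enters ID
        decode = max(fetch + 1, prev_ex)      # ID frees when predecessor enters EX
        ready = max((wb_of.get(r, -1) for r in srcs), default=-1)
        execute = max(decode + 1, ready + 1)  # last ID cycle may coincide with WB
        mem, wb = execute + 1, execute + 2
        if dest is not None:
            wb_of[dest] = wb
        if op == "bneq":
            redirect = mem + 1                # branch assumed taken every time
        rows.append((op, fetch, decode, execute, mem, wb))
        prev_id, prev_ex = decode, execute
    return rows


rows = schedule(3)
lw_fetch_cycles = [r[1] for r in rows if r[0] == "lw"]
print(lw_fetch_cycles)  # [2, 17, 30], matching the diagram
```

Under these assumptions, the first `lw` waits in ID for r5 to be written, while later iterations do not, which is why the first iteration is longer than the rest.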

### Problem 2

 First, certain values need to be assumed since they weren't provided in the problem. The assumed values are in the table below.

| Location | Assumed Value | Comment |
|---|---|---|
| `LOOP` | 0x2000 | Address of `lw` instruction. |
| `r3` | 2 | Branch condition register before being set. |
| `r4` | 50 | Sum limit. |
| `r10` | 0x1000 | Part of array address. |
| `r11` | 0x10 | Another part of array address. |
| `Mem[0x1010]` | 10 | First element. |

 The code fragment is shown below along with the register values that change in the first iteration:
```
!! r4 holds a limit
!! r5 holds the first array element address
add     r2, r0, r0    ! r2 = 0
add     r5, r10, r11  ! r5 = 0x1010
LOOP:
lw      r6, 0(r5)     ! r6 = 10
add     r2, r2, r6    ! r2 = 10
slt     r3, r2, r4    ! r3 = 1
addi    r5, r5, #4    ! r5 = 0x1014
bneq    r3, LOOP
```
 The comments in the code above give the value each instruction produces, not the state of the pipeline when `addi` is in the MEM stage. For example, at that point r3 still contains the old value, not the value computed by the `slt` instruction. The register values are:

| Location | Value | Comment |
|---|---|---|
| `r2` | 10 | This register is current. |
| `r3` | 2 | The "current" value is in the WB stage. |
| `r4` | 50 | Never changes. |
| `r5` | 0x1010 | The "current" value is in the MEM stage. |
| `r6` | 10 | This register is current. |

 The pipeline latches:

| Latch | Contents | Comment |
|---|---|---|
| `IF.PC` | 0x2014 | Address of instruction `i1` |
| `IF/ID.NPC` | 0x2014 | Address of instruction `i1` |
| `IF/ID.IR` | `bneq r3, LOOP` | |
| `ID/EX.**` | ?? | Because of stall, EX contains no "real" instruction. |
| `EX/MEM.ALU OUT` | 0x1014 | `addi` sum bound for r5 |
| `EX/MEM.B` | ?? | |
| `MEM/WB.ALU OUT` | 1 | `slt` condition bound for r3 |
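
The values above can be cross-checked by replaying the first iteration's arithmetic. This is only a sketch using the assumed values from the first table. The dictionary names and the 4-byte instruction size are assumptions made for the check (the latter is consistent with `i1` being at 0x2014).

```python
# Assumed initial values, copied from the table of assumed values.
reg = {"r0": 0, "r3": 2, "r4": 50, "r10": 0x1000, "r11": 0x10}
mem = {0x1010: 10}
LOOP = 0x2000
INSTR_BYTES = 4  # assumption: fixed 4-byte instructions

# Results in program order for the first iteration.
reg["r2"] = reg["r0"] + reg["r0"]        # add  r2, r0, r0
reg["r5"] = reg["r10"] + reg["r11"]      # add  r5, r10, r11
reg["r6"] = mem[reg["r5"]]               # lw   r6, 0(r5)
reg["r2"] = reg["r2"] + reg["r6"]        # add  r2, r2, r6
slt_result = int(reg["r2"] < reg["r4"])  # slt  r3, r2, r4 (held in MEM/WB, not yet in r3)
addi_result = reg["r5"] + 4              # addi r5, r5, #4 (held in EX/MEM, not yet in r5)

# Instruction addresses, starting at LOOP.
names = ["lw", "add", "slt", "addi", "bneq", "i1"]
addr = {name: LOOP + INSTR_BYTES * k for k, name in enumerate(names)}

print(hex(reg["r5"]), reg["r2"], reg["r3"], reg["r6"])  # 0x1010 10 2 10
print(slt_result, hex(addi_result), hex(addr["i1"]))    # 1 0x1014 0x2014
```

The two pending results are kept in separate variables rather than written to `reg`, mirroring the fact that they sit in pipeline latches and have not yet reached the register file.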

### Problem 3

 The CPI can easily be found if each iteration of the loop executes the same way, as happens here after the first iteration (assuming no cache misses). To find the CPI, find the number of cycles separating two corresponding points in consecutive iterations. A convenient corresponding point for the code above is the cycle when `lw` is in the IF stage. This occurs at cycles 2, 17, and 30. Since iteration one is different from the others, and since it is clear that future iterations will look like iterations 2 and 3, the corresponding points in iterations 2 and 3 will be used. The number of cycles is 30-17=13 and the number of instructions is 5, so the CPI is 13/5 = 2.6.

 David M. Koppelman - koppel@ee.lsu.edu Modified 16 Apr 1998 17:52 (22:52 UTC)
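
As a quick check of the Problem 3 arithmetic, the CPI calculation can be written out as below. This is a sketch, and the variable names are made up for illustration. The cycle numbers are the ones read from the Problem 1 diagram.

```python
lw_fetch_cycles = [2, 17, 30]   # cycles in which lw is in IF
instructions_per_iteration = 5  # lw, add, slt, addi, bneq

# Use iterations 2 and 3, since iteration 1 executes differently.
cycles_per_iteration = lw_fetch_cycles[2] - lw_fetch_cycles[1]
cpi = cycles_per_iteration / instructions_per_iteration
print(cycles_per_iteration, cpi)  # 13 2.6
```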