EE 4720 Computer Architecture - HW 2 Solution (Spring 1998)

Problem 1

Execution is shown below. The diagram is wide, so you may have to scroll or maximize your browser window. Tree killers should remember that many browsers have a landscape option in their print command. Two instructions, i1 and i2, are included after the branch. They never execute, but they are fetched.

Note that the execution of a single instruction uses just one line, and MEM is abbreviated to ME. Abandoned instructions are shown in gray.

!Cycle  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
add    IF ID EX ME WB
add       IF ID EX ME WB
Loop: 
lw           IF ID ----> EX ME WB                         IF ID EX ME WB                         IF ID EX ME WB   
add             IF ----> ID ----> EX ME WB                   IF ID ----> EX ME WB                   IF ID ----> EX ME WB
slt                      IF ----> ID ----> EX ME WB             IF ----> ID ----> EX ME WB             IF ----> ID ----> EX ME WB
addi                              IF ----> ID EX ME WB                   IF ----> ID EX ME WB                   IF ----> ID EX ME WB
bneq                                       IF ID -> EX ME WB                      IF ID -> EX ME WB                      IF ID -> EX ME WB

i1                                            IF -> ID EX ME WB                      IF -> ID EX ME WB                      IF -> ID EX ME WB 
i2                                                  IF ID EX ME WB                         IF ID EX ME WB                         IF ID EX ME WB

Problem 2

First, certain values need to be assumed since they weren't provided in the problem. The assumed values are in the table below.

`Location`	Assumed Value	Comment
`LOOP`	0x2000	Address of lw instruction.
`r3`	2	Branch condition register before being set.
`r4`	50	Sum limit.
`r10`	0x1000	Part of array address.
`r11`	0x10	Another part of array address.
`Mem[0x1010]`	10	First element.

The code fragment is shown below along with the register values that change in the first iteration:

        !! r4 holds a limit
        !! r5 holds the first array element address
        add     r2, r0, r0    ! r2 = 0
        add     r5, r10, r11  ! r5 = 0x1010
LOOP:
        lw      r6, 0(r5)     ! r6 = 10
        add     r2, r2, r6    ! r2 = 10
        slt     r3, r2, r4    ! r3 = 1
        addi    r5, r5, #4    ! r5 = 0x1014
        bneq    r3, LOOP
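
The commented values in the fragment can be checked with a short Python sketch that applies each instruction in program order to the assumed starting values. This models only the architectural (sequential) result of the first iteration, not pipeline timing; the dictionary layout and the function name `first_iteration` are illustrative choices, not part of the assignment.

```python
# Assumed starting values, copied from the table of assumed values.
regs = {"r0": 0, "r3": 2, "r4": 50, "r10": 0x1000, "r11": 0x10}
mem = {0x1010: 10}  # only the first array element is specified


def first_iteration(regs, mem):
    """Apply the two setup instructions and one pass of the loop body."""
    r = dict(regs)
    r["r2"] = r["r0"] + r["r0"]              # add  r2, r0, r0
    r["r5"] = r["r10"] + r["r11"]            # add  r5, r10, r11
    r["r6"] = mem[r["r5"]]                   # lw   r6, 0(r5)
    r["r2"] = r["r2"] + r["r6"]              # add  r2, r2, r6
    r["r3"] = 1 if r["r2"] < r["r4"] else 0  # slt  r3, r2, r4
    r["r5"] = r["r5"] + 4                    # addi r5, r5, #4
    taken = r["r3"] != 0                     # bneq r3, LOOP
    return r, taken


final_regs, branch_taken = first_iteration(regs, mem)
for name in ("r2", "r3", "r5", "r6"):
    print(name, hex(final_regs[name]))
print("branch taken:", branch_taken)
```

Because the sketch runs each instruction to completion before starting the next, its final values match the comments in the fragment rather than any mid-pipeline snapshot.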

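The comments in the code fragment give each register's value after its instruction completes; they do not show the state of the pipeline when addi is in the MEM stage. For example, at that point r3 still contains the old value, not the value computed by the slt instruction. The register values at that point are: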

`Location`	Value	Comment
`r2`	10	This register is current.
`r3`	2	The "current" value in WB stage.
`r4`	50	Never changes.
`r5`	0x1010	The "current" value in the MEM stage.
`r6`	10	This register is current.

The pipeline latches:

`Latch`	Contents	Comment
`IF.PC`	0x2014	Address of instruction `i1`
`IF/ID.NPC`	0x2014	Address of instruction `i1`
`IF/ID.IR`	`bneq r3, LOOP`
`ID/EX.**`	??	Because of stall, EX contains no "real" instruction.
`EX/MEM.ALU OUT`	0x1014	`addi` sum bound for r5
`EX/MEM.B`	??
`MEM/WB.ALU OUT`	1	`slt` condition bound for r3
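
The 0x2014 entries follow from the assumed `LOOP` address and the 4-byte DLX instruction size. As a rough check, the Python sketch below lists the address of each instruction from `lw` through `i2`; the names `program` and `address` are illustrative only.

```python
LOOP = 0x2000          # assumed address of the lw instruction
INSTRUCTION_BYTES = 4  # DLX instructions are 32 bits wide

# Instructions in program order, starting at the LOOP label.
program = ["lw", "add", "slt", "addi", "bneq", "i1", "i2"]
address = {name: LOOP + INSTRUCTION_BYTES * i for i, name in enumerate(program)}

for name in program:
    print(f"{name:5s} 0x{address[name]:04x}")
```

With `bneq` held in ID at 0x2010, both the fetch PC and the latched NPC point at the next sequential instruction, `i1`.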

Problem 3

The CPI can easily be found if each iteration of the loop executes the same way, as happens here after the first iteration (assuming no cache misses). To find the CPI, find the number of cycles separating two corresponding points in consecutive iterations. A convenient corresponding point for the code above is the cycle when lw is in the IF stage, which occurs at cycles 2, 17, and 30. Since iteration one is different from the others, and since it is clear that future iterations will look like iterations 2 and 3, the corresponding points in iterations 2 and 3 are used. The number of cycles is 30-17=13 and the number of instructions is 5, so the CPI is 13/5=2.6.
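
The same arithmetic can be written as a small Python sketch. The fetch cycles are read off the pipeline diagram in Problem 1; the variable names are illustrative, and the steady-state value assumes, as stated above, that later iterations repeat the timing of iterations 2 and 3.

```python
# Cycles in which lw is in IF, one entry per loop iteration (from the diagram).
lw_fetch_cycles = [2, 17, 30]
instructions_per_iteration = 5  # lw, add, slt, addi, bneq

# Cycles between the starts of consecutive iterations.
gaps = [later - earlier
        for earlier, later in zip(lw_fetch_cycles, lw_fetch_cycles[1:])]

# The first gap includes start-up effects, so use the last gap as steady state.
steady_state_cycles = gaps[-1]
cpi = steady_state_cycles / instructions_per_iteration

print("cycles per iteration:", gaps)
print("steady-state CPI:", cpi)
```

The first gap (15 cycles) is longer than the second (13 cycles), which is why the first iteration is excluded from the CPI calculation.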

David M. Koppelman - koppel@ee.lsu.edu

Modified 16 Apr 1998 17:52 (22:52 UTC)