## **EE 4720**Homework 2**Due:** t,where $t = \begin{cases} 4 \text{ October 1999,} & \text{if 29 September class held;} \\ 6 \text{ October 1999,} & \text{if 29 September class cancelled;} \\ 8 \text{ October 1999,} & \text{if 29 September and 1 October class cancelled.} \end{cases}$ Due: t,

**Problem 1:** Suppose the coding of DLX instructions were changed so the destination appeared before the source operands, as shown in the codings below:



Show the changes needed to the pipeline below to implement this new ISA. The changes should only effect the ID and WB stages. If there are differences in the control inputs to multiplexors or other units, explain what those differences are. Make sure that your design executes store instructions correctly.



1

**Problem 2:** The program below executes on the DLX implementation shown below. The implementation uses forwarding (bypassing) to avoid some data hazards and stalls to avoid others. All forwarding paths are shown. (If a needed forwarding path is not there, sorry, you'll have to stall.) A value can be read from the register file in the same cycle it is written. The destination field in the beqz is zero. Instructions are nulled (squashed) in this problem by replacing them with slt r0,r0,r0. All instructions stall in the ID stage.



The table below shows the contents of pipeline registers and changes to architecturally visible registers r1-r31 over time. Cycle zero is the time that 1b is in instruction fetch. The first two columns are completed; fill in the rest of the table. Use a "?" for the value of the "immediate field" of a type R instruction and for the output of the memory when no memory read is performed. Show pipeline register values even if they're not used. Assume that the ALU performs the branch target computation even though it was already computed in ID. The row labeled "Reg. Chng." shows a new register value that is available at the *beginning* of the cycle. If no register value is written leave the entry blank.

Hints: See Spring 1999 HW 3 for a similar problem. One feature of the solution would not be present if lb were replaced by a addi. Another feature may not be present if lb were replaced by lw.

| Cycle      | 0                              | 1                              | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|------------|--------------------------------|--------------------------------|---|---|---|---|---|---|---|---|----|
| PC         | 0×50                           | 0x54                           |   |   |   |   |   |   |   |   |    |
| IF/ID.IR   | sub                            | Ib                             |   |   |   |   |   |   |   |   |    |
| Reg. Chng. | $\texttt{r0} \gets \texttt{0}$ | $\texttt{r0} \gets \texttt{0}$ |   |   |   |   |   |   |   |   |    |
| ID/EX.IR   | sub                            | sub                            |   |   |   |   |   |   |   |   |    |
| ID/EX.A    | 0                              | 0                              |   |   |   |   |   |   |   |   |    |
| ID/EX.B    | 0                              | 0                              |   |   |   |   |   |   |   |   |    |
| ID/EX.IMM  | ?                              | ?                              |   |   |   |   |   |   |   |   |    |
| EX/MEM.IR  | sub                            | sub                            |   |   |   |   |   |   |   |   |    |
| EX/MEM.ALU | 0                              | 0                              |   |   |   |   |   |   |   |   |    |
| EX/MEM.B   | 0                              | 0                              |   |   |   |   |   |   |   |   |    |
| MEM/WB.IR  | sub                            | sub                            |   |   |   |   |   |   |   |   |    |
| MEM/WB.ALU | 0                              | 0                              |   |   |   |   |   |   |   |   |    |
| MEM/WB.MD  | ?                              | ?                              |   |   |   |   |   |   |   |   |    |

Problem 3: Consider the program: LOOP: lw r1, 0(r2) add r3, r1, r3 addi r2, r2, #4 bneq r1, LOOP or r4, r5, r6

For each implementation below provide a pipeline execution diagram showing execution up to the third fetch of lw and determine the CPI for a large number of iterations.



**Problem 4**: Schedule (rearrange) the instructions in the program used in the previous problem to improve execution speed. (Do not change what the program does!). Show pipeline execution diagrams and determine CPI for the two implementations.

**Problem 5:** Show the changes needed to implement the predicated instructions presented in class. (Set 4, page 25, as of this writing.) Describe the instruction format and show any datapath and control changes to the implementation below.

