From: "Saved by Windows Internet Explorer 7" Subject: Control logic for Page 85 of Datapath Date: Sun, 4 Oct 2009 22:12:45 -0600 MIME-Version: 1.0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Location: http://www.ece.lsu.edu/alex/EE3755/fall05/CONTROL.HTM X-MimeOLE: Produced By Microsoft MimeOLE V6.0.6001.18049 Control logic for = Page 85 of Datapath

## = LSU EE=20 3755 -- -- Computer Organization

#

## = Control=20 Logic for MIPS -- fall 2006

 

##=20 Contents

#

# Single Cycle = control=20 logic for the Datapath (page 85)

# Hard wired = and Micro=20 Programmed Controller

# Multi Cycle = control=20 logic

# Finite State = Diagram=20 and Finite State Machine

# Programmable = Logic=20 Array

 

 

 

# Single Cycle = control=20 logic for the Datapath (page 85)

 

 

Control logic for Page 85 of=20 Datapath..

 

We simply have to give a control signal for = each=20 Multiplexor , the ALU

and design logic for control circuit = near  the = PC.

 

1) the ALU

2) Mux at ALU

3)Mux at =20 data Memory and ALU.

4)Mux at Rd for =20 Register file

5)Mux at line 15-11 for Register=20 file

6)Mux at D/IN for Register=20 file

7) Mux and control for = NPC

8)Mux for the control after the=20 ALU

 

1) the ALU

      ALU = can=20 perform

           &n= bsp;         =20 AND,

           &n= bsp;         =20 OR,

           &n= bsp;         =20 SLT,

           &n= bsp;         =20 ADD,

           &n= bsp;         =20 SUB

           &n= bsp;         =20 operations.

      We = will generate=20 those control signals from

      OP = code=20 field[31:26] and func field[5:0]

        = We=20 have control inputs for ALU.

           &n= bsp;=20

           =20

         =20 ALU_OP[2:0]           =20 Operation

           &n= bsp;=20 000           &n= bsp;           &nb= sp;   =20 AND

           &n= bsp;=20 001           &n= bsp;           &nb= sp;   =20 OR

           &n= bsp;=20 010           &n= bsp;           &nb= sp;   =20 SLT

           &n= bsp;=20 011           &n= bsp;           &nb= sp;  =20 ADD

           &n= bsp;=20 100           &n= bsp;           &nb= sp;  =20 SUB

 

           &n= bsp;       =20 OP      =20 FUN    =20 ALUoperation  = :      = OP           &n= bsp; =20 FUN        =20 ALU_OP

   =20 AND         =20 0        =20 0X 24           =20 AND           &n= bsp;  =20 000000   = 100100           &n= bsp;   =20 000

   =20 ANDI        =20 6          =20 X           &n= bsp;     =20 AND           &n= bsp; =20 000110   = XXXXXX           &n= bsp;=20 000

   =20 OR           =20 0         =20 0x25           &n= bsp; =20 OR           &n= bsp;   =20 000000   = 100101           &n= bsp;   =20 001

   =20 ORI          =20 d         =20   X           &n= bsp;    =20 OR           &n= bsp;    =20 001101   = XXXXXX           &n= bsp;=20 001

   =20 SLT          =20 0         =20 0x2a           &n= bsp;=20 SLT           &n= bsp;  =20 000000   = 101010           &n= bsp;    =20 010  =20

   =20 SLTI          =20 a         =20 X           &n= bsp;      =20 SLT           &n= bsp;  =20 001010   = XXXXXX           &n= bsp; =20 010

   =20 ADD          =20 0         =20 0X20           &n= bsp; =20 ADD           &n= bsp;=20 000000   = 100000           &n= bsp;    =20 011

   =20 ADDI         =20 8           =20 X           &n= bsp;   =20 ADD           &n= bsp; =20 001000   = XXXXXX           &n= bsp; =20 011

   =20 SUB          =20 0         =20 0X22      =20        SUB           &n= bsp; =20 000000   =20 100010           &n= bsp;   =20 100

   =20 LW           &n= bsp;=20 23          =20 X           &n= bsp;     =20 ADD           &n= bsp; =20 100011   = XXXXXX           &n= bsp; =20 011

   =20 LB           &n= bsp; =20 20          =20 X           &n= bsp;     =20 ADD           &n= bsp; =20 100000   = XXXXXX          =20    011

  =20 SW           &n= bsp;=20 2b         =20 X           &n= bsp;     =20 ADD           &n= bsp;  =20 100000   = XXXXXX           &n= bsp; =20 011

  =20 SB           &n= bsp; =20 28         =20 X           &n= bsp;     =20 ADD           &n= bsp;  =20 101000   = XXXXXX           &n= bsp; =20 011

  =20 BEQ           =20 4          =20 X           &n= bsp;    =20  SUB           &n= bsp;   =20 000100  XXXXXX           &n= bsp; =20 100

  =20 BNE           =20 5          =20 X           &n= bsp;     =20 SUB           &n= bsp;   =20 000101  XXXXXX           &n= bsp; =20 100

 =20

           &n= bsp;=20 By using  AND and = OR gates,=20 We could easily generate ALU_OP.

           &n= bsp;=20 After logic minimization, the output will be=20 simpler.

 =20 ##############################################################         =20

           &n= bsp;=20 There are many other ways to do this.

           &n= bsp;=20 One simple way is using decoders to decode OP code and = function 

           &n= bsp;=20 field.

           =20

           &n= bsp;  =20 6 inputs and 64 outputs for each = decoder.

           &n= bsp;=20 000           &n= bsp;           &nb= sp;   =20 AND

           &n= bsp;=20 001           &n= bsp;           &nb= sp;   =20 OR

           &n= bsp;=20 010           &n= bsp;           &nb= sp;   =20 SLT

           &n= bsp;=20 011     =20            &n= bsp;         ADD=

           &n= bsp;=20 100           &n= bsp;           &nb= sp;  =20 SUB =20

         =20 ALU_OP[2]  =3D   x0 y34  + =20 x4 + x5

         =20 ALU_OP[1]  =3D  slt  add   (0 2a)  + a + (0 20) + 8 + 23 + 20 +2b = + 28=20

           &n= bsp;           &nb= sp;        =20 =3D =20 x0y42+x10+x0y32+x8+x35+x32+x43+x40

         =20 ALU_OP[0]  =3D  or add   (0 25) + d+  (0 20) + 8 + 23 + 20 +2b + 28=20

           &n= bsp;           &nb= sp;        =20 =3D   = x0y37+x13+  =20 x0y32+x8+x35+x32+x43+x40

 =20

 

         =20

  = ##           &n= bsp; =20 Fig. for Decoders

 

 #######################################################= ############           &n= bsp; =20

           &n= bsp; =20 second simple way is using a ROM.

           &n= bsp;   =20 the address for the ROM is OP code and funcion field and the = ouput is=20 ALU_OP.

 

 

##           &n= bsp;  =20 Fig. for ROM

 

##################################################################= ##

2) Mux at ALU

        =20 the Mux will select D/rt or sign extended=20 immed.

        =20 if select for the Mux is 0, it will select D/rt otherwise select = sign=20 extended immed.

        =20 We call the select signal Mux_ALU_CNT.

        =20 ANDI,ORI,SLTI,ADDI,LW,LB,SW,SB use

           &n= bsp;    =20 immediate values.

 

 

 

 

 

           =20 OP   FUN     = ALUoperation  :      = OP       = FUN        =20 Mux_ALU_CNT

       =20

ANDI     6       = X         =20 AND           &n= bsp;       =20 000110   = XXXXXX          =20 1

 ORI      d       = X          =20 OR           &n= bsp;        =20 001101   = XXXXXX          =20 1

 SLTI    a       =20 X          =20 SLT           &n= bsp;      =20 001010   = XXXXXX           =20 1

 ADDI   8       =20 X         =20 ADD           &n= bsp;      =20 001000   = XXXXXX           =20 1

 LW     23       X        =20 ADD           &n= bsp;       =20 100011   = XXXXXX           =20 1

 LB      = 20       = X        =20 ADD           &n= bsp;      =20 100000   = XXXXXX           =20 1

 SW     2b       = X       =20 ADD           &n= bsp;       =20 100000   = XXXXXX           =20 1

 SB     28       =20 X       =20 ADD    =            &n= bsp;    101000   XXXXXX           =20 1

 

## We could implement the logic with AND = and OR=20 gates.

## We could use a decoder with OR gate to = implement=20 this.

## We could use a ROM with 6bit address = field. 

 

3)Mux at =20 data Memory and ALU.

        =20 the Mux will select ALU_output or Dout.

        =20 if select for the Mux is 0, it will select ALU_output=20 ,

        =20 otherwise select Dout.

        =20 We call the select signal Mux_DataMem.

        =20 Only LW and LB will set Mux_DataMem.

 

        =20 OP    =20 FUN    =20 ALUoperation  = :      OP      = FUN        =20 Mux_DataMem

 LW   23          =20 X          =20 ADD           &n= bsp; =20 100011   = XXXXXX           =20 1

 LB    20          =20 X      =20     ADD           &n= bsp; =20 100000   = XXXXXX           =20 1

         =20

## We could implement the logic with AND = and OR=20 gates.

## We could use a decoder with OR gate to = implement=20 this.

## We could use a ROM with 6bit address = field. 

 

4)Mux at Rd for =20 Register file

        =20 the Mux will select  = "#31"=20 output of another Mux.

  =20       if select=20 for the Mux is 0, it will select output of another Mux=20

           &n= bsp;=20 ,otherwise select "#31".

        =20 We call the select signal Mux_Rd_CNT.

        =20 Only Jal will set Mux_Rd_CNT.

 

          =20 OP    =20 FUN    =20 ALUoperation  = :      = OP     FUN        =20 Mux_Rd_CNT

 =20 JAL    = 3          =20 X           &n= bsp;     =20 X           &n= bsp; =20 000011   = XXXXXX           =20 1

      ## We = could=20 implement the logic with AND and OR gates.

      ## We = could use=20 a decoder  to implement=20 this.

 

5)Mux at line 15-11 for Register=20 file

        =20 the Mux will select  = lines 15=20 to 11 or lines 20 to 16.

        =20 if select for the Mux is 0, it will select lines 15 to 11=20 ,

           =20 otherwise select lines 20 to 16.

        =20 We call the select signal Mux_Line11.

        =20 Whenever RT field acts like RD field, we will set=20 Mux_Line11.

       =20 Whenever they try to write something,

           &n= bsp; =20 they need Rd field(some times Rt) .         =20

 

           &n= bsp;=20 OP      =20 FUN    =20 ALUoperation  = :      = OP     FUN        =20 Mux_Line11.          =20

ANDI      6           =20 X           &n= bsp;     =20 AND      =20 000110   = XXXXXX           &n= bsp;=20 1          =20

ORI       =20 d           =20 X           &n= bsp;      =20 OR        =20 001101   = XXXXXX           &n= bsp;=20 1

SLTI       = a           =20 X           &n= bsp;      =20 SLT      =20 001010   = XXXXXX        =20      1

ADDI      8           =20 X           &n= bsp;     =20 ADD      =20 001000   = XXXXXX           &n= bsp; =20 1

LW       =20 23          =20 X           &n= bsp;     =20 ADD      =20 100011   = XXXXXX           &n= bsp; =20 1

LB       =20 20          =20 X           &n= bsp;     =20 ADD       =20 100000   = XXXXXX           &n= bsp; =20 1

 

   =20 ## We could implement the logic with AND and OR=20 gates.

   =20 ## We could use a decoder with OR gate to implement=20 this.

   =20 ## We could use a ROM with 6bit address field. 

 

6)Mux at D/IN for Register=20 file

        =20 the Mux will select  = either=20 output from Mux 3(control signal for this is

        =20 Mux_dataMem) or lines from NPC +4(PC = +8).

        =20 if select for the Mux is 0, it will select output from Mux3,=20

         =20 otherwise select  = lines from=20 NPC + 4.

        =20 We call the select signal = Mux_Line_NPC_plus4.

       =20 Whenever we have to save NPC + 4, we will set=20 Mux_Line_NPC_plus4.

         =20

        = Only=20 Jal and Jalr will set Mux_Line_NPC_plus4.

    =20

       =20 OP   FUN  ALUoperation  : OP    FUN   ALU_OP:=20 Mux_Line_NPC_plus4.

 jal     3       = X           &n= bsp;=20    X       =20 000011  XXXXXX       = X           &n= bsp;        =20 1

 jalr   0       =20 9           &n= bsp;   =20 X       =20 000000  001001        =20 X           &n= bsp;        =20 1  =

 =20

   =20 ## We could implement the logic with AND and OR=20 gates.

   =20 ## We could use two decoders with OR gate and AND gate=20

          =20 to implement this.

   =20 ## We could use a ROM with 12bit address field. 

 

7) Mux and control for = NPC

   =20 We have to design control logic for =20 NPC_CNT

           =20 (controller before the Mux).

   =20 The controller will  = do one=20 of four things.

           =20 1)check if  branch = condition=20 is met  and if the = instruction is=20 branch

           &n= bsp;   =20 instruction, it will generate select signal =

           &n= bsp;   =20 for the Mux(Mux_NPC_CNT =3D00).

           &n= bsp;   =20 the mux will select branch address(Target address at the=20 figure).

           =20 2) check if the instruction is jump instruction,=20

           &n= bsp;    =20 it will generate select signal

           &n= bsp;   =20 for the Mux(Mux_NPC_CNT =3D01).

           &n= bsp;   =20 the mux will select jump address.

           =20 3)check if the instruction is jr or jalr=20 instruction,

           &n= bsp;   =20 it will generate select signal

           &n= bsp;   =20 for the Mux(Mux_NPC_CNT =3D10).

           &n= bsp;   =20 the mux will select jump address(Drs from the register=20 file).

   =20         4)otherwise, = either branch condition failed or regular = instruction.

           &n= bsp;    =20 it will generate select signal

           &n= bsp;   =20 for the Mux(Mux_NPC_CNT =3D11).

           &n= bsp;   =20 the mux will select NPC +4 address = lines.

 

           &n= bsp;   =20 The part of inputs for this controller come from=20

           &n= bsp;     =20 Control Signal which tell  = the instruction is

           &n= bsp;     =20 branch instruction(NPC_CNT_SIG =3D 00) , or =

           &n= bsp;    =20 the instruction is jump or jal =20 instruction(NPC_CNT_SIG =3D 01), or

       =           t= he=20 instruction is jr or jalr instruction(NPC_CNT_SIG =3D=20 10).

           &n= bsp;    =20 the instruction is regular instruction(NPC_CNT_SIG =3D=20 11).

 

           &n= bsp;   =20 The other part of inputs for this controller come=20

           &n= bsp;     =20 from Mux 8 which tells  = the=20 branch condition met or

           &n= bsp;           &nb= sp; =20 not(Mux8_output =3D 1 ; branch condition met). =

     =20

 

The Control Signal should generate=20 NPC_CNT_SIG.

           &n= bsp;=20

           &n= bsp;      =20 OP      =20 FUN    =20 ALUoperation  = :      = OP           &n= bsp; =20 FUN       =20 NPC_CNT_SIG[1:0]

           =20

     BEQ       = 4        =20 X           &n= bsp;     =20 SUB           &n= bsp;   =20 000100   =20 XXXXXX           &n= bsp;=20 00

     BNE       = 5        =20 X           &n= bsp;     =20 SUB           &n= bsp;   =20 000101   =20 XXXXXX          =20   00=20

     J           =20 2        =20 X           &n= bsp;        =20 X           &n= bsp;      =20 000010    =20 XXXXXX           &n= bsp;=20 01

     jal         =20 3        =20 X           &n= bsp;        =20 X           &n= bsp;       =20 000011    =20 XXXXXX           &n= bsp;=20 01

     Jr         =20 0         =20 8           &n= bsp; =20        X           &n= bsp;       =20 000000    =20 001000           &n= bsp;  =20 10

     jalr       =20 0         =20 9           &n= bsp;        =20 X           &n= bsp;       =20 000000    =20 001001           &n= bsp;  =20 10          =20

   =20

   =20 ## We could implement the logic with AND and OR=20 gates.

  =20 ## We could use two decoders with Inverters and AND gates  to

   =20 # implement this.

   =20 ## We could use a ROM with 12bit address field. 

 

Now we need to generate signals  = MuX_NPC_CNT.

           =20 1)check if  branch = condition=20 is met  and if the = instruction is=20 branch

           &n= bsp;   =20 instruction, it will generate select signal for=20 the

           &n= bsp;   =20 Mux(Mux_NPC_CNT =3D00).

           &n= bsp;   =20 the mux will select branch address(Target address at the=20 figure).

           &n= bsp; =20  if  NPC_CNT_SIG  =3D00 and Mux_output =3D 1 , = then=20

           &n= bsp;=20      Mux_NPC_CNT[1:0]=20 =3D00.

           &n= bsp;           &nb= sp;=20

 

           =20 2) check if the instruction is jump instruction, it will generate = select=20

           &n= bsp;      =20 signal  for the=20 Mux(Mux_NPC_CNT =3D01).

           &n= bsp;       =20 the mux will select jump address.

           &n= bsp;          =20 if NPC_CNT_SIG =3D01 then Mux_NPC_CNT[1:0] =3D = 01.

 

           =20 3)check if the instruction is jr instruction, it will generate = select=20

           &n= bsp; =20       signal    for the = Mux(Mux_NPC_CNT=20 =3D10).

           &n= bsp;   =20 the mux will select jump address(Drs from the register=20 file).

           &n= bsp;         =20 if NPC_CNT_SIG =3D10 then Mux_NPC_CNT[1:0] =3D = 10.

 

           =20 4)otherwise, either  = regular=20 instruction,   = or    branch instruction = and=20

           &n= bsp;    =20 branch condition failed =20 .

           &n= bsp;    =20 it will generate select signal

           &n= bsp;    =20 for the Mux(Mux_NPC_CNT =3D11).

           &n= bsp;    =20 the mux will select NPC +4 address lines. 

           &n= bsp;      =20

          =20           i= f=20 NPC_CNT_SIG =3D11 or

           &n= bsp;           =20 NPC_CNT_SIG  =3D00 = and=20 Mux_output =3D 0 ,

           &n= bsp;           &nb= sp;=20 then  = Mux_NPC_CNT[1:0]=20 =3D11.

################################################################

 

   =20 ## We could implement the logic with AND and OR=20 gates.

   =20 ## We could use two decoders with OR gate and AND gate=20

     ## to = implement=20 this.

   =20 ## We could use a ROM with 12bit address field. 

 

     The logic is = simple,=20 so try to implement with gates.

 

 

           &n= bsp;           &nb= sp;           &nbs= p;  =20 Mux_NPC_CNT[0]   = =3D=20 SIG[0]   +  MUX' = *SIG[1]'

 

           &n= bsp;          =20 SIG           &n= bsp;  =20 00    01   11    1=20 0

           &n= bsp;           &nb= sp;           &nbs= p; =20 0      = 1      = 1      = 1      = 0

           &n= bsp;          =20 MUX     =20 1       =20 0     = 1      1     =20 0

           &n= bsp;           &nb= sp;           &nbs= p;       =20

           &n= bsp;          =20

           &n= bsp;           &nb= sp;           &nbs= p;  =20 Mux_NPC_CNT[1] =3D SIG[1 ]  =20 +  MUX'=20 *SIG[0]'

 

           &n= bsp;          =20 SIG           &n= bsp; =20 00    = 01   11   1 = 0

           &n= bsp;         =20            &n= bsp;    0     1       = 0    1    =20 1

           &n= bsp;          =20 MUX     =20 1     = 0       = 0    1      1

       =20

 

 

8)Mux for the control after the=20 ALU

 

        =20 the Mux will select  = either=20 zero  from ALU or=20 zero'.

        =20 if select for the Mux is 0, it will select zero, otherwise=20 select

        =20 zero'.

        =20 We call the select signal Mux_Branch.

       =20 Whenever  we have = BEQ, Mux=20 will select zero otherwise , select zero'.

 

           &n= bsp;  =20 OP      =20 FUN    =20 ALUoperation  = :      = OP           &n= bsp; =20 FUN       =20 Mux_Branch

           =20

   =20 BEQ     = 4          =20 X           &n= bsp;     =20 SUB           &n= bsp;   =20 000100   =20 XXXXXX           &n= bsp;=20 0

   =20 BNE     = 5          =20 X           &n= bsp;     =20 SUB           &n= bsp;   =20 000101   =20 XXXXXX           &n= bsp;=20 1

################################################################

## We could implement the logic with AND = gate and=20 Inverters.

## We could use a decoder to implement=20 this.

         =20 this is just output  = x5 of OP=20 code decoder.

 

9)R/W signal for Data=20 memory.

      R/W = =3D 0 means=20 read.

      R/W = =3D 1 means=20 write.

           &n= bsp;      =20 OP      =20 FUN    =20 ALUoperation  = :      = OP           &n= bsp; =20 FUN           =20    R/W

       =20 LW     =20 23          =20 X           &n= bsp;     =20 ADD           &n= bsp; =20 100011   = XXXXXX           &n= bsp; =20 0

       =20 LB      =20 20          =20 X           &n= bsp;     =20 ADD           &n= bsp; =20 100000   = XXXXXX           &n= bsp; =20 0

       =20 SW     =20 2b         =20 X           &n= bsp;     =20 ADD           &n= bsp;  =20 100000   = XXXXXX           &n= bsp; =20 1

       =20 SB      =20 28         =20 X           &n= bsp;     =20 ADD           &n= bsp;  =20 101000   = XXXXXX           &n= bsp; =20 1

 

## We could use a decoder to implement=20 this.

         =20 this is  ( x43 + = x40) of OP=20 code decoder.

       =20

 

# Multi Cycle = control=20 logic 

 

###  =  Hard Wired and = Micro=20 programmed Controller ##

 

 

We have a high = level  language program(c program=20 ..)

We could compile = it to=20 Mips assembly language instruction(compiler).

We could translate assembly language to Mips = machine=20 language(assembler).

 

(c program = =3D> assembly=20 language =3D> machine language).

 

For = a CPU with=20 a hardwired controller:

 each machine = language=20 instruction is  decoded = and executed=20 like a finite state machine.

 

For  a CPU with a micro = programmed  controller:

 

Each = machine=20 language instruction is  = defined by=20 a set of microinstructions and  = each=20 microinstruction is decoded and

Executed by = micro=20 sequencer  (Micro=20 CPU)

 

 

## Multi Cycle=20 Implementation####

 

##### Performance=20 ###################################

 =20

 So far,=20 we assume we could finish every instruction in a  single cycle, which means = we  have to finish every = instruction in a=20 fixed time.

That means we have to set the clock = frequency to the=20 slowest instruction.

Usually memory access instruction or = floating point=20 instruction is

  =20  the slowest=20 one.

 

Although the CPI(cycle per = instruction)  is 1, the overall performance = of a=20 single-cycle implementation is not likely to be very good, since several = of the=20 instruction classes could fit in a shorter clock cycle. 

 

 One way=20 to solve the problem of single cycle is :

       =20 Break = the=20 instruction into smaller steps

        = When=20 we break the instruction make  = sure=20 all the steps

           &n= bsp;=20 to have similar length(this is = important)

       =20 Execute each=20 step (instead of the entire instruction) in one = cycle

       =20 Cycle time: time it takes to execute the longest=20 step

       =20

 The advantages of the = multiple cycle=20 processor:

       =20 Cycle = time is=20 much shorter

       =20 Different=20 instructions take different number of number of steps =

          =20 to  complete=20

           =20

           &n= bsp;=20   We have = examples of=20 cylces below.

       =20 Use = ALU more=20 than once per instruction

     =20

       =20 Question:  What = will be one=20 instruction which uses ALU

           &n= bsp;  =20            &n= bsp;    more=20 than once per instruction?

       =20 Answer:  =20

 

######Execution time =20 ####################

We have to think execution time=20 equation.

       = Execution=20 time =3D Instructions/Program  = * Clock=20 cycles/Instruction

           &n= bsp;           &nb= sp;    =20           *= =20 Seconds/Clock cycle

## Performance of single cycle = machine.    =20

 

           =20 one program consists of  = 24%=20 loads, 12%stores,

           &n= bsp;=20 44% ALU instructions,18% branches

           =20 and 2% jumps( Instruction mix).

       =20

           =20 Assume operation times for the major functional units are=20 following:

           =20 Memory units: 2 ns(nano seconds).

           =20 ALU and adders: 2 ns.

           =20 Register file( read and write): 1ns. 

 

 

 

     = Instruction 

       = class         =20            &n= bsp;Fuctional=20 units used by the instruction class

(IF : instruction fetch; MEM: Memory = access;)           &n= bsp;           &nb= sp;        =20

     ALU = type          =20 IF:    = Register=20 access :    ALU = :     Register=20 access

   =20 Load           &n= bsp;      =20 IF:    = Register=20 access:     = ALU:     MEM:  Register=20 access

   =20 Store           &n= bsp;     =20 IF:    = Register=20 access:     = ALU:     MEM: 

   =20 Branch           &n= bsp;  =20 IF:    = Register=20 access:     = ALU:    =20

   =20 Jump           &n= bsp;    =20 IF:   =20

 

  =20 We will compute the required time =20 for each instruction class

 

 Instruction =

       = class      = instruction    register     ALU       =20 Data           =20 Register

           &n= bsp;          =20 memory        =20 read      =20 operation   = memory     write         =20 total

   =20 ALU type       = 2           &n= bsp;       =20 1             &n= bsp;   2           &n= bsp;  =20 0           &n= bsp;        =20 1      =20  6ns

   =20 Load           &n= bsp;  =20 2           &n= bsp;       =20 1           &n= bsp;    =20 2           &n= bsp;  =20 2           &n= bsp;        =20 1     =   8ns

   =20 Store           &n= bsp; =20 2           &n= bsp;       =20 1           &n= bsp;    =20 2           &n= bsp;   =20 2           &n= bsp;        =20 0    =   7ns

   =20 Branch         =20 2           &n= bsp;       =20 1           &n= bsp;    =20 2           &n= bsp;   =20 0           &n= bsp;        =20 0     =   5ns

   =20 Jump           &n= bsp;=20 2           &n= bsp;       =20 0           &n= bsp;    =20 0           &n= bsp;  =20 0           &n= bsp;         =20 0     =   2ns

 

 

Question:    What will the = clock cycle=20 for a single clock cycle machine?

 

## Answer:

 

#################################################

A machine with a variable clock will have a = clock=20 cycle that varies

 between=20 2ns and 8ns.

What will the average time per=20 instruction?

           =20 CPU clock cycle =3D 8*24% + 7 *12%+ 6*44%+=20 5*18%+2*2%

           &n= bsp;           &nb= sp;           &nbs= p;       =20 =3D 6.6ns.       =20

A little bit fast but  not much=20 difference.

 

 

##Performance of a single -cycle cpu with = floating=20 point instructions#####

 

loads comprise =20 31% of the instructions.    =

stores comprise 21% of the = instructions.   =

R format instructions comprise 27% of the=20 mix

Branches comprise = 5%

Jumps comprise 2%

FP add and subtract take the same time and = together=20 total 7% of

           &n= bsp;=20 the instructions

FP multiply and divide take the same time = and=20 together total 7% of

           =20 the instructions

 

Assume we have a floating point unit that = takes 8ns=20 for addition

          =20 and 16ns for multiply.

 

##

Question: How long will it take for the = floating=20 point add?

Answer:

##

Question: How long will it take for the = floating=20 point multiply?

Answer:

 

 

CPU clock cycle =3D8*31% + 7 *21%+ 6*27%+ = 5*5%+2*2%=20 +12*7%+20*7%

           &n= bsp;           &nb= sp;       =20 =3D 8ns.

 

 

Improved performance  =3D =20 20/8

 

## This is part of the reason we have to = make=20 multi-cycle implementation.

###############################################<= /B>

 

Go back to page 85=20 datapath.

 

Question: Is there any way to reduce=20 H/W?

Answer:

 

 

#########################

We have two memories : we will combine two = memories=20

           &n= bsp;           &nb= sp; =20 into a single memory unit for both instructions and data.=20

We have two ALUs: we will combine two ALUs = into a=20 single ALU.

 

##

Question : What kind of H/W do we need = after making=20 the modification.

Answer:  =20

 

 

##############

New data Path. =20 Figure 5.33 .page 383.

##############

1)Instruction fetch = step

2)Instruction decode and register fetch=20 step

3)Execution, memory address computation, or = branch=20 completion

   =20 (change the PC with branch target = address)

4)Memory access or R type instruction = completion=20 step

5)Memory read completion=20 step

 

 

1)Instruction fetch = step

       = IR =3D=20 Memory[PC];

       = PC =3D PC=20 +4;

 

       = IR  : Instruction=20 Register.

 

2)Instruction decode and register fetch=20 step

      A =3D=20 GPR[IR[25-21]];

      B =3D=20 GPR[IR[20-16]];

      ALUOUT = =3D PC +4 +=20 (sign-extend(IR[15-0]) << 2);

 

3)Execution,memory address computation, or = branch=20 completion

   =20 (change the PC with branch target = address)

   =20 Memory reference:

       =20     ALUOut = =3D A +=20 sign extend(IR[15-0]);

   =20 Arithmetic logical instruction(R type)

   =20        =20 ALUOut =3D A op B;

  =20 Branch:

  =          =20 if (A =3D B=3D) PC =3D ALUOut;

  =20 Jump: =20

        =20 PC =3D PC[31-28] || (IR[25-0] << = 2);

4)Memory access or R type instruction = completion=20 step

  =20 Memory reference:

           =20 MDR =3D Memory [ALUOut];

       = or=20

          =20 Memory [ ALUOut] =3D B;

  =20 Arithmetic logical instruction(R type)

         =20 GPR[IR[15-11]] =3D ALUOut;

 

 

5)Memory read completion=20 step

  =20 Load

       =20 GPR[IR[15-11]] =3D MDR;

 

 =20 ############## Control for Each State=20 ###############

 

Finite State = Machine(FSM).

      FSM is = used to=20 specify the multi cycle control.

      = FSM  consists of a set of states = and=20 directions on how to change

     =20 states.

 

FSM  = review.

       =20 Finite = state=20 machines:

        = a=20 set of states and internal storage

       =20 next = state=20 function (determined by current state and the = input)

       =20 output = function=20 (determined by current state and possibly input)

       =20 We=92ll use a Moore machine (output based only on current=20 state)

 

      =20

         =20 Finite State Machine   Concept.

 

       =    Finite State Machine=20 Controller

       =    Finite State = Diagram  Fig. = 5.42.

 

 

Logic Representative: Logic Equations

=B7 Next state from current=20 state

=B7 State 0 ->=20 State1

=B7 State 1 ->=20 S2,S6,S8,S9

=B7 State 2=20 ->S3,S5

=B7 State 3=20 ->S4

=B7 State 4 ->State=20 0

=B7 State 5 -> State=20 0

=B7 State 6 ->=20 S7

=B7 State 7 -> State=20 0

=B7 State 8 -> State=20 0

=B7 State 9 -> State=20 0

 

Or How can we reach each=20 state.

prior state &=20 condition

S4, S5, S7, S8, S9           &n= bsp;=20 -> State0

S0           &n= bsp;           &nb= sp;           &nbs= p;    =20 -> State 1

S1 &(op =3D sw || op =3D = lw)  -> State = 2

S2 &op =3D lw           &n= bsp;           =20 -> State 3

S3           &n= bsp;           &nb= sp;           &nbs= p;    =20 -> State 4

State 2 & op =3D  sw           &n= bsp;=20 -> State 5

S1 & op =3D Rtype           &n= bsp;   =20 -> State 6

State 6           &n= bsp;           &nb= sp;        =20 -> State 7

S1 & op =3D beq           &n= bsp;       =20 -> State 8

State 1 & op =3D j           &n= bsp;     =20 -> State 9

 

 

 

Implementation Technique: Programmed Logic=20 Arrays

Fig. PLA...

 

 

 

 

Multi cycle Control

=B7 Given numbers of FSM, can = determine=20 next state as function of inputs,

including current=20 state

 Turn these into Boolean = equations for=20 each bit of the next state lines

 Can implement easily using=20 PLA

 What if many more states, = many more=20 conditions?

 What if need to add a=20 state?