lrisc.s

## LSU EE 4720 -- Fall 2008 -- Computer Architecture
#
## ISA Families Overview, MIPS FP, MIPS/DLX/SPARC


## Under Construction
# 
# Time-stamp: <8 October 2008, 12:39:47 CDT, koppel@nested.ece.lsu.edu>

## Contents
#
# Major ISA Families
# Summary of MIPS and DLX Instructions
# MIPS and DLX Floating-Point Instructions

## Objectives
#
# Major ISA Families
#  Define, know main features of.
# DLX and SPARC
#  Understand enough to figure out simple programs and use a reference
#   for less simple ones.
# MIPS v. SPARC:  Instruction Similarities and Differences
#  Floating point, jumps, overflow, branch conditions, etc.
# Coding and Instructions
#  Understand how coding affects instructions. (Immed. sizes, opcodes, etc.)
#  Understand basic tradeoffs in coding alternatives. 
#   (Simpler format vs. larger immediates, etc.)
# Floating Point
#  Read and write MIPS programs using floating point instructions.
#  Understand how SPARC and DLX use floating point.


################################################################################
## Major ISA Families

## ISA Families

# :Def: ISA Family
# A broad classification of ISAs based on a contemporary understanding
# of significant features and characteristics.

# There are many ISAs, with many characteristics.
#
# Two ISAs can be similar (MIPS, Alpha) or different (MIPS, IA-32).
#
# There are generally accepted /families/ of ISAs.
#   ISAs in the same family are similar.
#   ISAs in different families are very different.
#
# Three families are described below.  
#   More details covered in a different set.
#
 ##   RISC:  Simple Design
 ##   CISC:  Powerful Instructions
 ##   VLIW:  Fast Multiple-Issue Implementations (covered later).
#
#   The families above are mutually exclusive (an ISA can't be in more
#     than one).
#   There are additional families. (An ISA may not fit into any of the three.)

## RISC
#
# Reduced Instruction Set Computing
# Example: MIPS

 ## Goals
#
# Low-cost and fast pipelined implementations (based on 1980's technology).
# Simple to write compilers for.

 ## Current Status
#
# Dominant for technical workstations, servers, and other large computers,
#  once used in the Macintosh.
#
# ISAs and implementations continue to be developed, though momentum is slowing.

 ## Characteristics
#
# Fixed Insn Size: All instructions are the same size (usually 32 bits).
#
# Instructions allow simple control logic.
#
# Amount of work done by instructions balanced, for easy pipelining.
#
# Only "load" and "store" instructions allowed to access memory.
#   (Arithmetic instructions cannot access memory.)
#
# Moderate number of registers.
#
# Minimize number of special-purpose registers.


 ## Examples
#
# MIPS    (Silicon graphics workstations, embeddable cores.)
# SPARC   (Sun workstations, servers.)
# ARM     (Many embedded systems.)
# Alpha   
# PA-RISC (Hewlett-Packard workstations.)
# PowerPC (PS3, [Macintosh once].)
# POWER   (IBM workstations and servers.)
#
# This class will frequently use MIPS and SPARC.

 ## Fixed Size Instructions

# Advantages
#
#  Easy instruction fetch and decode - don't need to "find" instruction.
#  Fixed locations for fields (rs, rt, etc).
#  Provide greater range for displacement CTIs (such as branches).
#
# Disadvantages
#
#  Can't store large immediates so
#    .. need two or more insn in some cases.
#
#  Simple instructions (nop) "waste" space.

 ## Amount of work done by instructions balanced, for easy pipelining.

#  Separate load & store instructions.
#  Instructions shouldn't need to re-use a stage.

# Advantages
#
#  Low-cost implementations. (Simple control logic, fewer interconnections.)
#  Easier to use elaborate implementation techniques (e.g., dynamic scheduling).
#
# Disadvantages
#
#  May need more instructions in a program.
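
# A minimal sketch (not from the original notes) of why a load/store
# ISA may need more instructions: incrementing a word in memory.
# The address is assumed to be in $t0 and $t1 is assumed to be free.
# An ISA with memory operands could do this in one instruction.
# (MIPS-I load delay slots are ignored here.)

        lw   $t1, 0($t0)    # Load the word.
        addi $t1, $t1, 1    # Increment it in a register.
        sw   $t1, 0($t0)    # Store it back.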


####################
## CISC
#
# Complex Instruction Set Computing
# Example: VAX

 ## Goals
#
# Provide powerful (do-everything) instructions.
# Popular in the 1970s and 1980s.

 ## Characteristics
#
# Instruction sizes vary.
# Moderate number of registers.
# Arithmetic and other instructions can access memory.

 ## Examples
#
# VAX  Very popular in early 80s.  Used in what were called minicomputers.
# Arguably: IA-32 (80x86,Pentium) 

 ## Current Status
#
# Little new development, except for IA-32 because of large installed base.
# In 20th century, outperformed by RISC.


## VLIW
#
# Very-Long Instruction Word
# Example: Itanium

 ## Goals
#
# Allow fast multiple-issue implementations by bundling instructions.

 ## Characteristics
#
# Instructions handled in groups (usually of 3) called /bundles/.
# Information about instruction relationships provided to hardware.

 ## Examples
#
# Tera      (Developed by Tera, now owned by Cray)  For scientific computation.
# Itanium   (Intel) For general purpose use. (Initially servers.)
# TI VelociTI (Texas Instruments)  For signal processing.

 ## Current Status
#
# Used in special purpose applications, such as signal processing.
# Being introduced for general purpose use: Itanium



################################################################################
## ISAs Used in EE 4720


## MIPS
#
# Used in the Patterson & Hennessy and Hennessy & Patterson 3rd Edition texts.
# An early and still popular RISC ISA.

## DLX
#
# Being phased out, appears in old homeworks and exams.
# Used in the Hennessy & Patterson 2nd Edition text.
# A simplified form of MIPS.

## SPARC
#
# Used in ECE Sun computers.

## Use in EE 4720

# Many ISAs will be used; some are only briefly covered.
#
# Details, including implementations, given for MIPS.
# Many examples will use SPARC.
#
# Older material uses DLX.


################################################################################
## MIPS, DLX, and SPARC

 ## Common: All are RISC ISAs.

# Fixed-size instructions: 4 bytes (32 bits).
# 32-bit address space (32-bit version)
# 32 32-bit general-purpose registers.
# 32 32-bit floating-point registers (in one form or another).

# MIPS refers to MIPS-I
#   MIPS-I is a 32-bit version, current versions are 64 bits.
#
# SPARC refers to SPARC V8
#   SPARC V8 is a 32-bit version, SPARC V9 is 64 bits.

## Registers and Memory
#
 ## Common to MIPS, DLX, SPARC
#
#        32 general-purpose registers (GPR)
#        32 floating-point registers.
#        GPR are 32 bits.
#        FP registers are 32 bits but can be used in pairs.
#        FP instructions can only access floating-point registers.
#
 ## MIPS but not DLX or SPARC
#
#        Two 32-bit integer multiplication and division registers (hi/lo).
#
#        Four sets of coprocessor registers. Each set has 32 registers.
#
#          Co-processor 0: Processor and system control.
#          Co-processor 1: MIPS-32 floating-point
#          Co-processor 2: Reserved for special-purpose designs.
#          Co-processor 3: MIPS-64 floating-point
#
 ## SPARC but not DLX or MIPS
#
# SPARC: Y register for use in multiplication.
#        Windowed integer registers:
#          ISA defines 8 + 16n integer registers, 2 <= n <= 32 (SPARC V8).
#          An instruction "sees" only 32 of them at a time.
#          Hardware saving and restoring of registers, meant for call & return.
#
#
 ## Register Names
#
#  GPR names usually reflect suggested usage, not fixed function.
#
#  DLX GPR:  r0 - r31.  Register r0 is always zero.
#
#  MIPS GPR: $0 - $31.  Register $0 is always zero.
#  MIPS GPRs also have names:
#      Names     Numbers   Suggested Usage
#      $zero:    0       The constant zero.
#      $at:      1       Reserved for assembler.
#      $v0-$v1:  2-3     Return value
#      $a0-$a3:  4-7     Argument
#      $t0-$t7:  8-15    Temporary (Not preserved by callee.)
#      $s0-$s7: 16-23    Saved by callee.
#      $t8-$t9: 24-25    Temporary (Not preserved by callee.)
#      $k0-$k1: 26-27    Reserved for kernel (operating system).
#      $gp      28       Global Pointer
#      $sp      29       Stack Pointer
#      $fp      30       Frame Pointer
#      $ra:     31       Return address.
#  MIPS:     $hi, $lo.  Used for product, quotient, and remainder.
#
#  SPARC:    Divided into four sets of 8:
#
#              Number  Name     Description
#              0-7     %g0-%g7  Global. %g0 is always zero.
#              8-15    %o0-%o7  Output. Used for function arguments by caller.
#              16-23   %l0-%l7  Local.
#              24-31   %i0-%i7  Input.  Used for function arguments by callee.
#
#              SAVE and RESTORE instructions "copy" registers.
#
 ## Floating-Point Registers
#
#  DLX FPR:    f0 -  f31.
#  MIPS FPR:  $f0 - $f31  (Also called co-processor 1 registers.)
#  SPARC FPR: %f0 - %f31
#
#
 ## Memory
#
#  All: 32-bit address space.
#        Aligned Access (Address must be multiple of size.)
#
#  SPARC, DLX:  Big Endian.
#  MIPS: Either (bi-endian). (Big endian used in class.)

 ## Some Assembly Language Differences
#
# MIPS, DLX:  Destination is first (leftmost) operand.

        add $s1, $s2, $s3  # s1 = s2 + s3
        add r1, r2, r3     # r1 = r2 + r3

# SPARC:  Destination is last (rightmost) operand.

        add %l2, %l3, %l1  # %l1 = %l2 + %l3

#
# MIPS, DLX: Parentheses used for dereference, offset is outside parentheses.

        lw $s1, 4($s2)

# SPARC: Square brackets used for dereference, offset is inside:

        ld [%l2+4], %l1




# Note: Syntax highlighting is for MIPS, so DLX and SPARC instructions
# may not be colored properly.

## Basic Three-Register Integer Instructions

# DLX:   add, addu, sub, subu, and, or, xor
# MIPS:  add, addu, sub, subu, and, or, xor
# SPARC: add,       sub,       and, or, xor, andn (and-not), orn

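# Despite their names, the MIPS addu and subu instructions are not
# unsigned: they ignore overflow instead of raising an exception.
# The DLX addu and subu instructions are unsigned.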

        add $1, $2, $3    # MIPS,   $1 = $2 + $3
        add r1, r2, r3    # DLX,    r1 = r2 + r3
        add %g2, %g3, %g1 # SPARC   %g1 = %g2 + %g3
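
# A sketch, assuming $t1 = 0x7fffffff and $t2 = 1, of the difference
# between MIPS add and addu: both compute the same 32-bit sum, but
# only add raises an exception on signed overflow.

        addu $t0, $t1, $t2  # $t0 = 0x80000000, no exception.
        add  $t0, $t1, $t2  # Signed overflow: exception, $t0 not written.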


## Basic Two-Register + Immediate Integer Instructions

# DLX:   addi, subi, andi, ori, xori
# MIPS:  addi,       andi, ori, xori
# SPARC: add,  sub,  and,  or,  xor  (Immediate uses same opcode.)
# DLX, MIPS use 16-bit immediates;  SPARC uses 13 bits.


        addi $t0, $t1, 5  # MIPS     I Format
        addi r2, r3, #5   # DLX      Type I
        add %l1, 5, %l0   # SPARC    Format 3b

# MIPS does not have an immediate subtract, DLX and SPARC do.
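
# A sketch of the usual MIPS substitute: add a negative immediate.
# The 16-bit immediate of addi is sign-extended, so -5 can be encoded.
# Register choices follow the example above.

        addi $t0, $t1, -5 # MIPS     $t0 = $t1 - 5
        subi r2, r3, #5   # DLX      r2 = r3 - 5
        sub %l1, 5, %l0   # SPARC    %l0 = %l1 - 5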



################################################################################
## Instruction Coding


 ## The Three MIPS, DLX Instruction Formats
#
# R Format:  Typically used for three-register instructions.
# I Format:  Typically used for instructions requiring an immediate.
# J Format:  Used for jump instructions.

 ## The Three (or six) SPARC Instruction Formats
#
# Format 1:  Used for calls.
# Format 2a: Used for sethi (like lui).
# Format 2b: Used for branches.
# Format 3a: Typically used for three-register and load/store instructions.
# Format 3b: Typically used for instructions requiring an immediate.
# Format 3c: Typically used for three-register floating-point instructions.


 ## DLX Type-R Instruction
# _________________________________________________________________
# | opcode    | rs1     | rs2     | rd      | func                |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
#  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
#
# Bits    Field Name    Typical Use
#
#  0: 5:  opcode        First part of opcode.
#  6:10:  rs1           Source register one.
# 11:15:  rs2           Source register two.
# 16:20:  rd            Destination register.
# 21:31:  function      Second part of opcode.
#
        add r1, r2, r3     # r1 = r2 + r3

 ## MIPS R Format
# _________________________________________________________________
# | opcode    | rs      | rt      | rd      | sa      | function  |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               First part of opcode.
# 25:21:  rs            (Register Source)      Source register one.
# 20:16:  rt            (Register Target)      Source register two.
# 15:11:  rd            (Register Destination) Destination register.
# 10: 6:  sa            (Shift Amount)         Five-bit immediate.
#  5: 0:  function                             Second part of opcode.
#
        add $s0, $s1, $s2  # $s0 = $s1 + $s2
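
# Worked encoding of the instruction above, as a sketch. Field values
# are those of the standard MIPS-I encoding; $s0, $s1, and $s2 are
# registers 16, 17, and 18.
#
#  opcode  rs     rt     rd     sa     function
#  000000  10001  10010  10000  00000  100000     = 0x02328020
#  (0)     ($s1)  ($s2)  ($s0)  (0)    (0x20, add)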

 ## SPARC Format 3a (op =2 or 3,  i = 0)
# _________________________________________________________________
# | op| rd      | op3       | rs1     |i|       asi     | rs2     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:19:  op3           Opcode for format 3.
# 18:14:  rs1           Source operand 1 register number.
# 13:13:  i             Immediate Sub-format. Zero in this case.
# 12:05:  asi           Address space identifier. Used by loads and stores.
# 04:00:  rs2           Source operand 2 register number.

        add %l2, %l3, %l1   # %l1 = %l2 + %l3


 ## DLX Type I
# _________________________________________________________________
# | opcode    | rs1     | rd      | immed                         |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
#  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
#
# Bits    Field Name    Typical Use
#
#  0: 5   opcode        First part of opcode.
#  6:10:  rs1           Source register one.
# 11:15:  rd            Destination register.
# 16:31   immed         Immediate


 ## MIPS I Format
# _________________________________________________________________
# | opcode    | rs      | rt      | immed                         |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               Entire opcode (for I and J).
# 25:21:  rs            (Register Source)      Source register one.
# 20:16:  rt            (Register Target)      Destination, or source register two.
# 15:0:   immed         (Immediate)            Immediate value.

        addi $t0, $t1, 2
        lw $t0, 4($t1)
        lui $t0, 0x1234
        beq $t0, $t1, TARG
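
# Worked encoding of addi $t0, $t1, 2, as a sketch. Field values are
# those of the standard MIPS-I encoding; $t0 and $t1 are registers 8
# and 9, and addi has opcode 8.
#
#  opcode  rs     rt     immed
#  001000  01001  01000  0000000000000010         = 0x21280002
#  (addi)  ($t1)  ($t0)  (2)
#
# Here the rt field, not an rd field, names the destination.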

 ## SPARC Format 3b (op =2 (non-memory) or 3 (memory),  i = 1)
# _________________________________________________________________
# | op| rd      | op3       | rs1     |i| simm13                  |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:19:  op3           Opcode for format 3.
# 18:14:  rs1           Source operand 1 register number.
# 13:13:  i             Immediate Sub-format. One in this case.
# 12:00:  simm13        The immediate.

# Used for memory and arithmetic / logical.

        add %l1, 2, %l0;          ld [%l1+4], %l0

 ## SPARC Format 2a (op = 0,  op2 = 4) (sethi)
# _________________________________________________________________
# | op| rd      | op2 | imm22                                     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:22:  op2           Opcode for format 2.
# 21:00:  imm22         The immediate.
#
# Used for sethi.

 ## SPARC Format 2b (op = 0,  op2 = 2, 6, or 7) (branches)
# _________________________________________________________________
# | op|a| cond  | op2 | imm22                                     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:29:  a             Typical: If one, instruction in delay slot annulled.
# 28:25:  cond          Condition. Some function of condition code register.
# 24:22:  op2           Opcode for format 2.
# 21:00:  imm22         The immediate, a branch displacement.
#
# Used for branches.


 ## MIPS J Format
# _________________________________________________________________
# | opcode    | ii                                                |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               Entire opcode (for I and J).
# 25:0:   ii            (Instruction Index)    Part of jump target.



 ## SPARC Format 1 (op = 1)
# _________________________________________________________________
# | op| disp30                                                    |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:00:  disp30        Word displacement. Call target is PC + 4 * disp30.
#
# Used for calls.



## Shift Instructions

# MIPS:  sllv, srlv, srav, sll,  srl,  sra    Format R (All)
# DLX:   sll,  srl,  sra,  slli, srli, srai   Format R, I
# SPARC: sll,  srl,  sra                      Format 3a, 3b


# MIPS constant shift instructions use special sa field, DLX use immed field.

 ## Shift Amount in Register

        sllv $1, $2, $3     # MIPS       R Format
        sll  r1, r2, r3     # DLX        Type R
        sll  %g2, %g3, %g1  # SPARC      Format 3a

 ## Shift Amount in Instruction

        sll  $1, $2, 5      # MIPS       R Format
        slli r1, r2, #5     # DLX        Type I
        sll  %g2, 5, %g1    # SPARC      Format 3b

 ## Shift Amount Fields
#
#  MIPS:   Special sa field
#  DLX:    Immediate field (used by many type I instructions).
#  SPARC:  Immediate field.


## Load Upper

# Loads the upper bits of a register with a constant.

# The MIPS and DLX instructions have different names but are otherwise the same.

# MIPS:  lui   Load upper 16 bits.
# DLX:   lhi   Load upper 16 bits.
# SPARC: sethi Load upper 22 bits.

        lui $1, 0x1234       # MIPS
        lhi r1, #0x1234      # DLX
        sethi 0x123456, %l1  # SPARC

        # SPARC assembler "hi" macro extracts upper 22 bits from constant,
        # and "lo" macro extracts lower 10 bits (even though insn could use
        # thirteen bits).

        sethi %hi(0x12345678), %l1  #  %l1 = 0x12345400
        or %l1, %lo(0x12345678), %l1  #  %l1 = 0x12345678  (%lo = 0x278)
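
        # A comparable MIPS sketch: lui sets the upper 16 bits and ori,
        # whose immediate is zero-extended, fills in the lower 16 bits.
        # The register choice is arbitrary.

        lui $t0, 0x1234             #  $t0 = 0x12340000
        ori $t0, $t0, 0x5678        #  $t0 = 0x12345678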



## Load and Store Instructions

# DLX, MIPS: lb,   lbu,  lh,   lhu,  lw, sb,  sh,  sw
# SPARC:     ldsb, ldub, ldsh, lduh, ld, stb, sth, st
# (SPARC instruction does same thing as MIPS above, e.g., lhu same as lduh).

# MIPS and DLX very similar.

        lw $1, 16($2)      # MIPS      I Format
        lw r1, 16(r2)      # DLX       Type I
        ld [%g2+16], %g1   # SPARC     Format 3b

        # No comparable DLX or MIPS.  (Note sum of two registers.)
        ld [%g2+%g3], %g1  # SPARC     Format 3a 

        sw $1, 16($t5)   # MIPS
        sw 16(r10), r1   # DLX
        st %g1, [%g5+16] # SPARC
        st %g1, [%g5+%g6] # SPARC


## Integer Branches

# MIPS: beq, bne, bgtz, bgez, bltz, blez
# DLX:  beqz, bnez
# SPARC: be, bne, bg, bge, blt, ble

# SPARC Integer Condition Code Register: ICC, four bits:  N, Z, V, C

# MIPS:  Branches have delay slots.
#        Can compare registers.
#
# DLX:   No delay slots.
#        Can only test if a register is zero.
#
# SPARC: Delay slots (usually).
#        Branch based on condition codes. (Covered later.)
#

# Later RISC ISAs:
#        No delay slot, because little advantage, more complexity
#        in implementations > 5 stages, superscalar, etc.
#
#        Can do comparison in branch. 

        # DLX
        sub r6, r3, r4
        beqz r6, TARGET
        xor r5, r6, r7

        # MIPS
        beq $3, $4, TARGET
        nop
        xor $5, $6, $7

        # SPARC
        subcc %l3, %l4, %g0  # Subtract and set condition codes.
        beq TARGET
        nop
        xor %l6, %l7, %l5
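
        # MIPS
        # MIPS branches compare two registers only for equality or
        # inequality. A sketch of "branch if $3 < $4" (signed): slt
        # puts the comparison result in a register, here $t0, which is
        # assumed to be free.
        slt $t0, $3, $4      # $t0 = ( $3 < $4 ) ? 1 : 0
        bne $t0, $0, TARGET
        nop                  # Delay slot.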

 ## SPARC Annulled Branches
#
# Annulled Conditional Branch Behavior (SPARC):
#   If taken delay slot executed.
#   If not taken delay slot not executed.
#   (Behavior of branch always and never is different.)

        # SPARC
        beq,a TARGET       # Delayed branch
        add %l2, %l3, %l1  # Executed only if branch taken.
        xor %l6, %l7, %l5

# Annulled Branches For Short if/else Statements
#
# if( i1 == i2 ) l1 = l2 + l3; else l1 = l2 - l3;
# g1 = g2 ^ g3;

        subcc %i1, %i2, %g0
        beq,a  SKIP
        add %l2, %l3, %l1    # If part (not executed if branch not taken)
        sub %l2, %l3, %l1    # Else part (not executed if branch taken)
SKIP:
        xor %g2, %g3, %g1


## Jump

# MIPS:  j, jr
# DLX:   j, jr
# SPARC: Special cases of jump and link (jmpl) and branch instruction.

# MIPS:  Delayed
# SPARC: Delayed
# DLX:   Not delayed.

# MIPS:  Immediate is region.
# DLX:   Immediate is displacement.

# PC= 0x12345678

# ii 0x3ffffff

# Region Address
# PC=   0x12345678
# 4ii   0x0ffffffc
# targ  0x1ffffffc

# Displacement (DLX)
# PC=   0x12345678
# 4ii   0x0ffffffc
# targ  PC + 4ii = 0x22345674 

        # MIPS: TARGET is a 26-bit region.
        # DLX: TARGET is a 26-bit displacement.
        j TARGET
        nop

        # SPARC
        # Use branch always instruction for jumps.
        # TARGET is specified using a 22-bit displacement.
        ba TARGET
        nop

        # MIPS, DLX
        # TARGET in $t0
        jr $t0
        nop

        # SPARC
        # TARGET in %l0, %g0 is zero register.
        jmpl %l0 + 0, %g0
        nop

TARGET:


## Jump and Link Instructions

# MIPS and DLX:  jal, jalr
# SPARC: jmpl rs1 + simm13, rd (Jump to rs1 + simm13, rd = PC. )
# SPARC: jmpl rs1 + rs2, rd    (Jump to rs1 + rs2, rd = PC. )
# SPARC: call disp30           (Jump to PC + 4 * disp30 )

# MIPS, DLX: jal writes the return address (link) to register 31.
# MIPS: jalr can specify a different return address register.

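        jr $t1                # MIPS   Jump, no link.
        jmpl %l1 + %g0, %g0   # SPARC  Jump, link discarded (written to %g0).

        jalr $t1              # MIPS   Jump, $ra = return address.
        jmpl %l1 + 8, %o7     # SPARC  Jump to %l1 + 8, %o7 = PC.

        jal TARG              # MIPS   $ra = return address.
        call TARG             # SPARC  %o7 = PC.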
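
        # A sketch of a MIPS call and return. The label FUNC is
        # hypothetical. jal writes the return address, the address of
        # the instruction after the delay slot, to $ra ($31).
        jal FUNC
        nop                   # Delay slot.
        # Execution resumes here after the return.

FUNC:
        jr $ra                # Return to caller.
        nop                   # Delay slot.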


TARG:

################################################################################
## Floating Point Summary

 ## Separate Floating Point Registers
#
# A feature of many RISC ISAs.
# Eases implementation.

 ## MIPS Floating Point 
#
# Supports IEEE 754 Single and Double FP Numbers
#
# Floating point handled by co-processor 1, one of 4 co-processors.
#
# MIPS floating point registers also called co-processor 1 registers.
# MIPS floating point instructions called co-processor 1 instructions.
#
# Registers named f0-f31.
# Load, store, and move instructions have "c1" in their names.
# Arithmetic instructions use ".s" (single), ".d" (double), or ".w" (int)
#  /completers/ to indicate operand type.
#
 ## MIPS Co-Processors (Briefly)
#
# Each co-processor has a register set and instructions.
# Co-processor x abbreviated cpx.
#
# cp0: Used for virtual memory and exceptions (covered later).
# cp1: Used for floating point.
# cp2: Reserved for custom implementations.
# cp3: Intended for 64-bit floating point.


 ## DLX Floating Point 
#
# Supports IEEE 754 Single and Double FP Numbers
#
# Storage for FP registers called the FP register file.
#
# Registers named f0-f31.
# Load, store, and move instructions have "fp" in their names.
# Arithmetic instructions use "f" (single) or "d" (double) 
#  /completers/ to indicate operand type.

 ## SPARC Floating Point 
#
# Supports IEEE 754 Single, Double, Extended (128-bit) FP Numbers
#
# Storage for FP registers called the FP register file.
#
# Registers named %f0-%f31.
# Load and store instruction names end in "f" or "d"
# Arithmetic instructions start with "f" and end with "s" (single),
#  "d" (double), or "q" (quad).

 ## Types of Floating-Point Instructions
#
# Briefly here, in detail later.
#
#
 ## Arithmetic Operations
#
# Add double-precision (64-bit) operands.
#
# MIPS:  add.d $f0, $f2, $f4
# DLX:   addd  f0, f2, f4
# SPARC: faddd %f2, %f4, %f0
#
#
 ## Load and Store
#
# Load double (eight bytes into two consecutive registers).
#
# MIPS:  ldc1 $f0, 8($t0)
# DLX:   ld f0, 8(r1)
# SPARC: lddf [%l0+8], %f0
#
#
 ## Move Between Register Files (E.g., integer to FP)
#
# MIPS:  mtc1   $t0, $f0
# DLX:   movi2fp f0, r2
# SPARC: No such instructions.  Use store / load:
#        st %l0, [%sp+16]
#        ldf [%sp+16], %f0
#
 ## Format Conversion
#
# Convert from one format to another, e.g., integer to double.
#
# MIPS:  cvt.d.w  $f0, $f2
# DLX:   cvti2d   f0, f2
# SPARC: fitod %f2, %f0
#
#
 ## Floating Point Condition Code Setting
#
# Compare and set condition code.
#
# MIPS:  c.lt.d $f2, $f0   # True if $f0 > $f2. (MIPS has no c.gt.)
# DLX:   gtd    f0, f2
# SPARC: fcmpd  %f0, %f2  # Condition codes set to =, <, >, or ?
#
#
 ## Conditional Branch
#
# Branch on floating-point condition.
#
# MIPS:  bc1f TARGET   # Branch coprocessor 1 [condition code] false.
# DLX:   bfpf TARGET   # Branch floating-point [condition code] false.
# SPARC: fbg TARGET    # Branch condition code greater than.
#

 ## FP Load and Store


        # MIPS
        #
        # Load word into coprocessor 1
        lwc1 $f0, 4($t4)   #  $f0 = Mem[ $t4 + 4 ]

        #
        # Load double into coprocessor 1
        ldc1 $f0, 0($t4)   #  $f0 = Mem[ $t4 + 0 ];  $f1 = Mem[ $t4 + 4 ]
        #
        # Store word from coprocessor 1.
        swc1 $f0, 4($t4)   #  Mem[ $t4 + 4 ] = $f0
        #
        # Store double from coprocessor 1.
        sdc1 $f0, 0($t4)   #  Mem[ $t4 + 0 ] = $f0;  Mem[ $t4 + 4 ] = $f1

        # SPARC
        #
        # Load float. (NOT load double.)
        ldf [%i1+4], %f0     #  %f0 = Mem[ %i1 + 4 ]
        # Load double float. 
        lddf [%i1+8], %f0     #  %f0 = Mem[ %i1 + 8 ]; %f1 = Mem[ %i1 + 12 ]

 ## DLX FP Load and Store

        # Load float (32 bit)
        lf f0, 0(r1)
        # Load double (64 bit)
        ld f0, 0(r1)


 ## MIPS Move Instructions

        # Move to coprocessor 1
        mtc1 $t0, $f0

        # Move from coprocessor 1.
        mfc1 $t0, $f0

 ## DLX Move

        # Move X to Y
        # X,Y: fp, i
        # X,Y: f,d
        #
        # movX2Y rd, rs

        movi2fp f10, r4


 ## MIPS Conversion

        # To: s, d, w;  From: s, d, w
        #
        # cvt.TO.FROM fd, fs

        cvt.d.w $f0, $f2
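
        # A sketch of converting an integer held in a GPR to double:
        # the bits are first moved, unchanged, to an FP register and
        # then converted. Register choices are arbitrary.
        mtc1 $t0, $f2         # $f2 = bits of $t0 (still an integer).
        cvt.d.w $f0, $f2      # $f0 (with $f1) = (double) $f2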

 ## DLX Conversion

        # X,Y: f, d, i
        #
        # cvtX2Y fd, fs

        cvti2d f0, f2

 ## SPARC Conversion

        fitos %f2, %f0
        fitod %f2, %f0

 ## MIPS Condition Setting

        # Compare:   fs COND ft
        # COND: eq, lt, le, and others. (No gt or ge; swap operands instead.)
        # FMT: s, d
        #
        # c.COND.FMT fs, ft

        c.lt.d $f0, $f2

 ## DLX Condition Setting

        # Cond: gt, lt, eq, etc.
        # FMT: f, d
        #
        # <COND><FMT>

        ltd f0, f2
        ltf f0, f2

 ## MIPS FP Branch

        # Branch coprocessor 1 true.
        # Delayed branch.
        bc1t TARG

        bc1f TARG
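
        # A sketch combining compare and branch: branch to TARG if
        # $f0 < $f2 (doubles). Early MIPS versions need an instruction
        # between the compare and the branch; a nop is used for that.
        c.lt.d $f0, $f2       # Set FP condition code if $f0 < $f2.
        nop
        bc1t TARG             # Branch if condition code true.
        nop                   # Delay slot.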

 ## DLX FP Branch

        bfpt TARG
        bfpf TARG


## Integer Multiplication and Division

 # MIPS/DLX: Not an ordinary integer arithmetic instruction.
 #
 # (After MIPS I ordinary integer multiplication added to ISA.)
 #
 # Early SPARC (before v8): No multiply instruction, use a multiply
 # step (mulscc) many times to perform a multiplication.
 # SPARC v8 has a multiply instruction that uses ordinary registers
 # for the low 32 bits and a special register "Y" for the high 32 bits.

 ## Differing Approaches
#
# MIPS:  Use a special integer multiply and divide unit.
# DLX:   Use floating-point unit for integer multiply and divide.
# SPARC: Use any integer register for low 32 bits and Y register for high 32 bits

 ## MIPS Multiplication
#
# Product goes into lo and hi registers.
#
# To multiply integers:
#
# Multiply
# Move product from lo and hi (if necessary) to integer registers.

        mult $t0, $t1  # {hi,lo} = $t0 * $t1
        mflo $t2      # $t2 = $lo
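
        # A sketch, not from the original notes, of using the rest of
        # the hi/lo pair. After the mult above, hi holds the upper 32
        # bits of the product:
        mfhi $t3       # $t3 = $hi

        # Division uses the same registers: quotient in lo, remainder
        # in hi.  (Some assemblers write the hardware instruction as
        # div $0, $t0, $t1.)
        div $t0, $t1   # lo = $t0 / $t1;  hi = $t0 % $t1
        mflo $t2       # $t2 = quotient
        mfhi $t3       # $t3 = remainder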
        

 ## DLX Multiplication
#
# Integer multiplication uses fp regs.

        # r3 = r1 x r2

        movi2fp f0, r1
        movi2fp f1, r2
        mult f3, f0, f1
        movfp2i r3, f3

 ## SPARC Multiplication
#
#
        # %l3 = %l1 x %l2
        smul %l1, %l2, %l3