## LSU EE 4720 -- Spring 2022 -- Computer Architecture
#
## ISA Families Overview,  MIPS/Others Comparison


## Contents
#
# Major ISA Families
# The RISC Family
# The CISC Family
# The VLIW Family
# MIPS, SPARC, Arm A64, and RISC-V Comparison

## Objectives
#
# Major ISA Families
#  Define, know main features of.
# SPARC
#  Understand enough to figure out simple programs and use a reference
#   for less simple ones.
#  Understand how the condition code register is used for branches.
# MIPS v. SPARC:  Instruction Similarities and Differences
#  Floating point, jumps, overflow, branch conditions, etc.
# Coding and Instructions
#  Understand how coding affects instructions. (Immed. sizes, opcodes, etc.)
#  Understand basic tradeoffs in coding alternatives. 
#   (Simpler format vs. larger immediates, etc.)
# Floating Point
#  Read and write MIPS programs using floating point instructions.
#  Understand how SPARC uses floating point.


################################################################################
## Major ISA Families

## ISA Families

# :Def: ISA Family
# A broad classification of ISAs based on a contemporary understanding
# of significant features and characteristics.

# There are many ISAs, with many characteristics.
#
# Two ISAs can be similar (MIPS, Alpha) or different (MIPS, IA-32).
#
# There are generally accepted /families/ of ISAs.
#   ISAs in the same family are similar.
#   ISAs in different families are very different.
#
# Three families are described below.  
#   More details covered in a different set.
#
 ##   RISC:  Simple Pipelined Implementations
 ##   CISC:  Powerful Instructions
 ##   VLIW:  Faster Multiple-Issue (covered later) Implementations.
#
#   The families above are mutually exclusive (an ISA can't be in more
#     than one).
#   There are additional families. (An ISA may not fit into any of the three.)

## RISC: Reduced Instruction Set Computing
#
# Example: MIPS

 ## Goals
#
# Low-cost and fast pipelined implementations (based on 1980's technology).
# Simple to write compilers for.

 ## Current Status
#
# Dominant for cell phones, tablets, some technical workstations and
#  servers, used in some video game consoles, once used in the Mac.
#
# New ISAs and implementations continue to be developed.

 ## Characteristics
#
# Fixed Insn Size: All instructions are the same size (usually 32 bits).
#
# Instructions allow simple control logic.
#
# Amount of work done by instructions balanced, for easy pipelining.
#
# Only "load" and "store" instructions allowed to access memory.
#   (Arithmetic instructions cannot access memory.)
#
# Enough registers to avoid frequent loads and stores.
#
# Few special-purpose registers.


 ## Examples
#
# MIPS    -- Embeddable cores. In The Day, Silicon Graphics workstations.
#            https://en.wikipedia.org/wiki/SGI_Tezro
# RISC-V  -- A free (unencumbered) ISA, for research, teaching, commercial use.
# ARM32   -- Many embedded systems.
# ARM64   -- (AArch64) NVIDIA hybrid GPU/CPU chips.
# POWER   -- IBM workstations and servers.
# PowerPC -- Wii, PS3, [Mac, when it had more syllables].
# SPARC   -- Sun and Fujitsu workstations, servers.
# Alpha   -- Developed by DEC, once 2nd largest computer comp. 
#            Great, but failed.
# PA-RISC -- Hewlett-Packard workstations.
#
# This class will frequently use MIPS and SPARC.

 ## Fixed-Size Instructions

# Advantages
#
#  Easy instruction fetch and decode - no need to "find" instruction.
#  Fixed locations for fields (rs, rt, etc).
#  Displacements count instructions, not bytes, providing greater
#    range in displacement CTIs (such as branches).
#
# Disadvantages
#
#  Can't store large immediates so
#    .. need two or more insn in some cases.
#
#  Simple instructions (addi r1, r1, 1) "waste" space.

 ## Amount of work done by instructions balanced, for easy pipelining.

#  Instructions that access memory (load, store, etc) ..
#  .. don't perform arithmetic (except for computing the memory address).
#  Instructions shouldn't need to re-use a stage.

# Advantages
#
#  Low-cost implementations. (Simple control logic, fewer interconnections.)
#  Easier to use elaborate implementation techniques (e.g., dynamic scheduling).
#
# Disadvantages
#
#  May need more instructions in a program.


####################
## CISC: Complex Instruction Set Computing
#
# Example: VAX

 ## Goals
#
# Provide powerful (do-everything) instructions.
# Popular in the 1970s and 1980s.

 ## Characteristics
#
# Instruction sizes vary.
# Large variety of immediate sizes and memory addressing modes.
# Moderate number of registers.
# Arithmetic and other instructions can access memory.

 ## Examples
#
# VAX  Very popular in early 80s.  Used in what were called minicomputers.
# Arguably: IA-32 and Intel 64,  popularly known as x86 and x86_64.

 ## Current Status
#
# Little new development, except for IA-32/Intel 64 ..
# .. because of large installed base.
# In 20th century, overtaken by RISC.


## VLIW: Very-Long Instruction Word
#
# Example: Itanium

 ## Goals
#
# Reduce cost (v. RISC) of implementations that operate on >1 insn / cycle.

 ## Characteristics
#
# Instructions handled in groups (usually of 3) called /bundles/.
# Information about instruction relationships provided to hardware.

 ## Examples
#
# Itanium     (Intel) Designed for general purpose use. Faded away.
# Tera        (Developed by Tera, now owned by Cray)  For scientific comp.
# TI VelociTI (Texas Instruments)  For signal processing.

 ## Current Status
#
# Used in special purpose applications, such as signal processing.
# Has not caught on for general-purpose use.



################################################################################
## RISC v. CISC

 ## Fixed v. Variable Insn Size: Instruction Fetch Hardware
 #
 #  Instruction fetch hardware is much simpler with fixed-size instructions.

 # RISC
 # 
 # PC contains address of instruction currently being fetched.
 #
 # To get next instruction address just increment PC by 4 ..
 # .. and fetch 32 bits.
 #
 # That's it.

 # CISC
 #
 # PC contains address of instruction currently being fetched.
 #
 # The PC needs to be incremented by the size of the instruction
 # currently being fetched ..
 #
 # .. but the size isn't known because the instruction isn't here yet ..
 #
 # .. so increment PC by some constant amount, perhaps 4 bytes.
 #
 # In ID shift and mask leftover IR from previous cycle ...


 ## Fixed v. Variable Insn Size:  Branch Targets

 # RISC
 # Immediate is number of insn to skip.
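
 # A minimal Python sketch (not part of the original notes) of how a
 # MIPS branch immediate, which counts instructions, becomes a byte
 # address. The helper names sign_extend and mips_branch_target are
 # made up for illustration. The displacement is taken relative to the
 # delay-slot instruction (branch address + 4).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mips_branch_target(branch_pc, imm16):
    """Target of a taken MIPS branch located at address branch_pc.

    The 16-bit immediate is a signed instruction count, so it is scaled
    by 4 bytes and added to the delay-slot address (branch_pc + 4).
    """
    return (branch_pc + 4 + 4 * sign_extend(imm16, 16)) & 0xFFFFFFFF


print(hex(mips_branch_target(0x1000, 3)))       # 0x1010
print(hex(mips_branch_target(0x1000, 0xFFFF)))  # 0x1000 (immediate is -1)
```

 # Because the immediate counts 4-byte instructions, 16 bits reach
 # about +/- 128 KiB rather than +/- 32 KiB.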

 ## Addressing Modes

 # Quick examples of addressing modes in CISC instructions.

 # Some Possible CISC Instructions
 add (r1), r2, 0x12345678         # Mem[r1] = r2 + 0x12345678
 add (r1), r2, (0x12345678)       # Mem[r1] = r2 + Mem[0x12345678]
 add (r1), r2, ((0x12345678))     # Mem[r1] = r2 + Mem[Mem[0x12345678]]
 add (r1), r2, ((r3))             # Mem[r1] = r2 + Mem[Mem[r3]]
 add ((r1)), r2, ((0x12345678))   # Mem[Mem[r1]] = r2 + Mem[Mem[0x12345678]]
 add r1, r1, (r3+)                # r1 = r1 + Mem[r3];  r3 = r3 + sizeof(int)
 add (r1+r7), 0x4322(r2), ((0x12345678))
                                  # Mem[r1+r7] = Mem[r2+0x4322] + Mem[Mem[0x12345678]]

 # A Possible CISC Instruction        
 add (r1), (r2), 0x12345678

 # Equivalent MIPS code
 lw r20, 0(r2)          # r20 = Mem[r2]
 lui r21, 0x1234        # r21 = 0x12340000
 ori r21, r21, 0x5678   # r21 = 0x12345678
 add r23, r20, r21      # r23 = Mem[r2] + 0x12345678
 sw r23, 0(r1)          # Mem[r1] = r23


################################################################################
## ISAs Used in EE 4720


 ## MIPS
#
# Used in the Patterson & Hennessy and Hennessy & Patterson 3rd Edition texts.
# An early and still popular RISC ISA.

 ## DLX
#
# Phased out in this class, appears in very old homeworks and exams.
# Used in the Hennessy & Patterson 2nd Edition text.
# A simplified form of MIPS.

 ## RISC-V
#
# Perhaps this can be thought of as DLX redux (DLX returns). Developed
# for research, teaching, and commercial uses. The "V" is a Roman
# numeral, unlike, say, the V in Sputnik V. In some ways similar to
# DLX in that it is of academic origin, but goals are more
# ambitious. Used in the recent Patterson/Hennessy Computer
# Architecture texts. Also used in real products, mostly for embedded
# and lightweight applications.
#
# RISC-V has many fewer instructions than its contemporaries. It also
# has a modular design.

 ## SPARC
#
# Used in ECE Sun computers.
# An example of an early RISC ISA.

 ## VAX
#
# A good example of a CISC ISA.
#
# This ISA style considered obsolete ..
# .. covered to emphasize how ISA styles driven by contemporary hardware.

 ## IA-32 / Intel 64
#
# It's everywhere.
# Since it's no longer covered in EE 3750 it ought to be covered here.
# Sort of like something built as a small cottage 150 years ago ..
# .. that has since been expanded every 20 years or so.

 ## Arm A32  (Arm v7 and Arm v8 A32)
#
# It started out as an ISA for embedded applications, and it has enjoyed
# steady growth. It replaced IA-32 in EE 3750 (µProcessors).
# A32 is the winner of the EE 4720 Quirky ISA Award [tm].

 ## Arm A64
#
# Less quirky than A32. Appropriate for medium (cell phone, laptop)
# and larger (workstation, server) systems.
# An example of a late RISC ISA.

 ## Itanium
#
# An example of a 21st century VLIW ISA. Perhaps a good example of
# Brooks' second system effect (Brooks 75). Implementations were a
# disappointment.



################################################################################
## RISC ISA Comparison: MIPS, SPARC, ARM A64, RISC-V


 # :Def: Address Space Size [of an ISA]
#       The number of bits in a virtual memory address.
#
#  MIPS-I has an address space size of 32 bits.
        lw r1, 4(r2)  # 4+r2 is a 32-bit quantity.
#  MIPS-IV has an address space size of 64 bits. 
        lw r1, 4(r2)  # 4+r2 is a 64-bit quantity.
#
#  An important ISA specification.
#
#  It's so important, that the phrase "address space size" is omitted:
#    "MIPS-I is a 32-bit ISA."
#    "Alpha is a 64-bit ISA."
#
#  Typically, an ISA with an A-bit address space:
#    Has A-bit general-purpose registers. (Big enough to hold an address.)
#    Has fast A-bit integer instructions.


 ## Summary of Differences
#
 ## MIPS  (MIPS-I, MIPS-IV, MIPS64)
#
#   Early RISC. 
#   Started as 32-bit ISA, later 64-bit version developed.
#   Simple and with few quirks.
#   MIPS should be familiar to EE 4720 students at this point.
#
 ## SPARC  (v8, v9)
#
#   Early RISC.
#   Started as 32-bit ISA, later 64-bit version developed.
#   More elaborate than MIPS.
#
 ## ARM A64  (a.k.a. AArch64)
#
#   Late RISC. (Announced 2011.)
#   A new ISA, not a 64-bit superset of A32.
#   Large number of instructions. Especially:
#     Scaled arithmetic (shift second source operand).
#     Indexed load and stores (address is sum of two registers).
#     Bit manipulation.
#
 ## RISC-V  RV64I
#
#   Late RISC.
#   A new ISA, not an extension of MIPS (though similar).
#   Design goal was simplicity and modularity.
#   Base version (rv32i, rv64i) has few instructions.
#     RV32I has just 40 instructions.




 ## Address Space Sizes in Comparison ISAs

 # MIPS and SPARC
#
#  Both started out with a 32-bit address space (32-bit version). 
#  Both later were extended with 64-bit versions.
#
#  32-bit:  MIPS-I,  SPARC v7, v8
#  64-bit:  MIPS-IV, SPARC v9
#
#  Many other early-RISC ISAs started as 32-bit ISAs ..
#  .. and later were extended with 64-bit superset versions.
#
#  A 64-bit superset version of a 32-bit ISA ..
#  .. can run code written for the 32-bit version.
#
#
 # ARM and RISC-V
#
#  Both ARM and RISC-V have INCOMPATIBLE 32- and 64-bit versions.
#
#  ARM v7
#    Descended from the original ARM ISA, developed in the mid 1980s.
#    A 32-bit ISA.  Later (sort of) referred to as A32
#  ARM v8
#    Announced in 2011.
#    Defines A64, a 64-bit ISA, which is nothing like A32.
#    A64 aka AArch64
#
#  RISC-V
#    Unlike ARM, all address space size versions developed at the same time.
#
#    RV32I*  32-bit ISA versions. (There are several extensions.)
#    RV64I*  64-bit ISA versions. (There are several extensions.)
#    RV128I*  128-bit ISA versions. (There are several extensions.)



## Registers and Memory
#
 ## Common to MIPS, SPARC, ARM A64, RISC-V, and many other RISC ISAs.
#
#   32 general-purpose registers (GPRs) ..
#   .. including a zero register.
#   Each GPR holds 32 bits in 32-bit versions.
#   Each GPR holds 64 bits in 64-bit versions.
#
#   Floating point instructions do not operate on GPRs.
#

 ## Floating Point in MIPS and SPARC, and many early RISC ISAs
#
#     32-bit versions:
#        32 floating-point registers.
#        FP registers are 32 bits ..
#        .. BUT can be used in pairs for 64-bit values.
#
#        Pairing of 32-FP registers is a feature of many early
#        RISC ISAs.
#
#     64-bit versions:
#        32 floating-point registers.
#        FP registers are 64 bits.

 ## Floating Point in ARM A64
#
#    Common to many 21st century ISAs ..
#    .. ARM A64 defines a set of vector registers ..
#    .. which are used by floating-point (and other) instructions.
#
#    32 128-bit vector registers.
#       A register can be used to hold a single value. (A scalar.)
#         For example, one 32-bit value.
#       A register can be used to hold several values. (A vector.)
#         For example, four 32-bit values.
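#
#    A small Python sketch (an illustration, not from the A64 manual) of
#    the scalar and vector views of one 128-bit register. The names
#    VECTOR_REGISTER_BITS and lanes are made up here, and the byte
#    string stands in for register contents.

```python
import struct

VECTOR_REGISTER_BITS = 128


def lanes(element_bits):
    """Number of elements of the given size that fit in one vector register."""
    return VECTOR_REGISTER_BITS // element_bits


# Vector use: four 32-bit floats fill the register exactly.
packed = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
print(len(packed) * 8)               # 128
print(struct.unpack("<4f", packed))  # (1.0, 2.0, 3.0, 4.0)
print(lanes(32), lanes(64))          # 4 2
```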
#

 ## Floating Point in RISC-V Variants
#
#   FP Register Size Depends on Variant
#     All FP variants: 32 FP registers.
#
#   FP variant indicated by a letter:
#     F: 32-bit registers.   rv32if,  rv64if
#        32-bit FP instructions.
#        Unlike other ISAs, no 64-bit FP instructions.
#     D: 64-bit registers.   rv32id,  rv64id
#        32-bit and 64-bit FP instructions.
#     Q: 128-bit registers.  rv32iq,  rv64iq
#        32-bit, 64-bit and 128-bit FP instructions.
#


 ## ARM A64 but not others
#
#        Register 31 (not 0) in most instructions is the constant zero ..
#        .. but in some instructions it is an ordinary register.

 ## MIPS but not others
#
#        Two 32-bit integer multiplication and division registers (hi/lo).
#
#        Four sets of coprocessor registers. Each set has 32 registers.
#
#          Co-processor 0: Processor and system control.
#          Co-processor 1: MIPS-32 floating-point
#          Co-processor 2: Reserved for special-purpose designs.
#          Co-processor 3: MIPS-64 floating-point
#
 ## SPARC but not MIPS
#
# SPARC: Y register for use in multiplication.
#        Windowed integer registers:
#          ISA Defines 16 + 16n integer registers, 4 <= n <= 32.
#          An instruction "sees" only 32 of them at a time.
#          Hardware saving and restoring of registers, meant for call & return.
#
#
 ## Register Names
#
 ## MIPS32 and MIPS64
#
#  GPR: $0 - $31.  Register $0 is always zero.
#  GPRs also have names:
#
#      Names     Numbers   Suggested Usage
#      $zero:    0       The constant zero.
#      $at:      1       Reserved for assembler.
#      $v0-$v1:  2-3     Return value
#      $a0-$a3:  4-7     Argument
#      $t0-$t7:  8-15    Temporary (Not preserved by callee.)
#      $s0-$s7: 16-23    Saved by callee.
#      $t8-$t9: 24-25    Temporary (Not preserved by callee.)
#      $k0-$k1: 26-27    Reserved for kernel (operating system).
#      $gp      28       Global Pointer
#      $sp      29       Stack Pointer
#      $fp      30       Frame Pointer
#      $ra:     31       Return address.
#
#  The same register names are used in 32- and 64-bit versions.
#
#  MIPS:     $hi, $lo.  Used for product, quotient, and remainder.
#
 ##  SPARC:    General-Purpose Registers divided into four sets of eight:
#
#      Names    Numbers  Description
#      %g0-%g7  0-7      Global. %g0 is always zero.
#      %o0-%o7  8-15     Output. Used for function arguments by caller.
#      %l0-%l7  16-23    Local.
#      %i0-%i7  24-31    Input.  Used for function arguments by callee.
#
#      SAVE and RESTORE instructions "copy" registers.
#      Register window feature eliminates need to have code save and restore
#        registers, as long as there are available registers.
#
 ##  RISC-V:
#
#  GPR: x0 - x31.  Register x0 is always zero.
#  GPRs also have names:
#
#      Names    Numbers   Suggested Usage
#      zero:    x0        The constant zero.
#      ra       x1        Return address.
#      sp       x2        Stack pointer.
#      gp       x3        Global pointer.
#      tp       x4        Thread pointer.
#      t0-t2    x5-x7     Temporary. (Caller-save.)
#      s0/fp    x8        Frame pointer.
#      s1       x9        Callee save.
#      a0-a7    x10-x17   Function arguments and return values.
#      s2-s11   x18-x27   Callee save.
#      t3-t6    x28-x31   Temporary. (Caller-save.)
#
#  The same register names are used in 32- and 64-bit versions.

#
#

 ##  ARM A64:
#
# Register names indicate size. 64 bit: x0,..,x30  32 bit: w0,..,w30


 ## Floating-Point Register Names
#
#  MIPS FPR:  $f0 - $f31  (Also called co-processor 1 registers.)
#  SPARC FPR: %f0 - %f31
#  ARM A64: Name indicates how much of register is accessed.
#  RISC-V: f0-f31, and ABI naming.
#
#
 ## Memory
#
#  All 32-bit versions:
#         32-bit address space.
#         Aligned Access (Address must be multiple of size.)
#
#  SPARC:  Big Endian.
#  MIPS:   Either (bi-endian). (Big endian used in class.)
#  RISC-V: Little endian.  Unaligned access allowed but may be slower.
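#
#  A short Python sketch (an illustration only) of what the byte order
#  and alignment rules above mean for the 32-bit value 0x12345678. The
#  helper is_aligned is a made-up name.

```python
import struct

value = 0x12345678
big = struct.pack(">I", value)     # Big endian: most-significant byte at lowest address.
little = struct.pack("<I", value)  # Little endian: least-significant byte at lowest address.

print(big.hex())     # 12345678
print(little.hex())  # 78563412


def is_aligned(address, size):
    """True if address is a multiple of the access size in bytes."""
    return address % size == 0


print(is_aligned(0x1004, 4))  # True
print(is_aligned(0x1002, 4))  # False
```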


 ## Some Assembly Language Differences
#
# MIPS, RISC-V, ARM A64:  Destination is first (leftmost) operand.

        add $s1, $s2, $s3  # s1 = s2 + s3
        add x1, x2, x3     # A64
        add x1, x2, x3     # RISC-V


# SPARC:  Destination is last (rightmost) operand.

        add %l2, %l3, %l1  # %l1 = %l2 + %l3

#
# MIPS, RISC-V: Parentheses used for dereference, offset is outside parentheses.

        lw $s1, 4($s2)    # MIPS
        lw a1,  4(a2)     # RISC-V

# SPARC, A64: Square brackets used for dereference, offset is inside:

        ld [%l2+4], %l1   # SPARC
        ldr w1, [x2, 4]   # A64


# Note: Syntax highlighting is for MIPS, so other instructions
# may not be colored properly.

## Basic Three-Register Integer Instructions

# MIPS:  add, addu, sub, subu, and, or, xor
# RV:    add,       sub,       and, or, xor
# SPARC: add,       sub,       and, or, xor, andn (and-not), orn
# A64:   add,       sub,       and, orr, eor, bic (and-not), orn

# Despite their names, the MIPS addu and subu instructions are not unsigned:
# they compute the same result as add and sub, but never trap on overflow.

        add $1, $2, $3    # MIPS,   $1 = $2 + $3
        add a1, a2, a3    # RISC-V
        add x1, x2, x3    # A64 64-bit add
        add w1, w2, w3    # A64 32-bit add (high bits of dest are zeroed.)
        add %g2, %g3, %g1 # SPARC   g1 = g2 + g3


## Basic Two-Register + Immediate Integer Instructions

# MIPS:  addi,       andi, ori, xori
# SPARC: add,  sub,  and,  or,  xor  (Immediate uses same opcode.)
# Immediate Sizes
#   MIPS:   16 bits for most instructions.
#   SPARC:  13 for arithmetic insn.  22 bits for sethi (set hi)
#   RISC-V: 12 for arithmetic insn.  20 bits for lui (load upper immediate)
#   A64:    12 for arithmetic insn.  16 bits for move, 21 bits for PC-rel ops.
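#
# A Python sketch (an illustration only) of what these sizes mean for
# sign-extended arithmetic immediates. The table and the name
# fits_signed are made up here; A64 is left out because its 12-bit
# add/sub immediate is unsigned.

```python
SIGNED_IMMED_BITS = {"MIPS addi": 16, "SPARC add": 13, "RISC-V addi": 12}


def fits_signed(value, bits):
    """True if value is representable as a `bits`-bit two's-complement immediate."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


for insn, bits in SIGNED_IMMED_BITS.items():
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    print(f"{insn}: {low} .. {high}")
```

# A constant outside the range needs a second instruction (lui or
# sethi plus an or).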

        addi $t0, $t1, 5  # MIPS     I Format
        add %l1, 5, %l0   # SPARC    Format 3b

# MIPS and RISC-V do not have an immediate subtract (use addi with a
# negated immediate); SPARC and A64 do.



################################################################################
## Instruction Coding


 ## The Three MIPS Instruction Formats
#
# R Format:  Typically used for three-register instructions.
# I Format:  Typically used for instructions requiring an immediate.
# J Format:  Used for jump instructions.

 ## The Three (or six) SPARC Instruction Formats
#
# Format 1:  Used for calls.
# Format 2a: Used for sethi (like lui).
# Format 2b: Used for branches.
# Format 3a: Typically used for three-register and load/store instructions.
# Format 3b: Typically used for instructions requiring an immediate.
# Format 3c: Typically used for three-register floating-point instructions.


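 ## The Four Core RISC-V rv32i and rv64i Instruction Formats (and Two Variants)
#
# R Format:  Typically used for three-register instructions.
# I Format:  Typically used for instructions requiring an immediate.
# S Format:  Typically used for stores.
# B Format:  Variant of S format, used for branches.
# U Format:  Used for 20-bit upper immediates (lui, auipc).
# J Format:  Variant of U format, used for jump instructions (jal).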



 ## MIPS R Format
# _________________________________________________________________
# | opcode    | rs      | rt      | rd      | sa      | function  |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               First part of opcode.
# 25:21:  rs            (Register Source)      Source register one.
# 20:16:  rt            (Register Target)      Source register two.
# 15:11:  rd            (Register Destination) Destination register.
# 10: 6:  sa            (Shift Amount)         Five-bit immediate.
#  5: 0:  function                             Second part of opcode.
#
        add $s0, $s1, $s2  # $s0 = $s1 + $s2
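
# A Python sketch (an illustration only) packing the R Format fields
# above into a 32-bit word for the add example. The function name
# encode_mips_r is made up; the values used are add's opcode 0 and
# function 0x20, and $s0, $s1, $s2 are registers 16, 17, 18.

```python
def encode_mips_r(opcode, rs, rt, rd, sa, function):
    """Pack MIPS R Format fields into a 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | function


# add $s0, $s1, $s2:  rd = $s0 = 16,  rs = $s1 = 17,  rt = $s2 = 18.
word = encode_mips_r(opcode=0, rs=17, rt=18, rd=16, sa=0, function=0x20)
print(f"{word:08x}")  # 02328020
```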


 ## SPARC Format 3a (op =2 or 3,  i = 0)
# _________________________________________________________________
# | op| rd      | op3       | rs1     |i|       asi     | rs2     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:19:  op3           Opcode for format 3.
# 18:14:  rs1           Source operand 1 register number.
# 13:13:  i             Immediate Sub-format. Zero in this case.
# 12:05:  asi           Address space identifier. Used by loads and stores.
# 04:00:  rs2           Source operand 2 register number.

        add %l2, %l3, %l1   # %l1 = %l2 + %l3
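
# The same kind of Python sketch for SPARC Format 3a (an illustration
# only; encode_sparc_3a is a made-up name). It assumes op = 2 and
# op3 = 0 for add, and the register numbering %l0-%l7 = 16-23 from the
# register table earlier in these notes.

```python
def encode_sparc_3a(op, rd, op3, rs1, rs2, asi=0):
    """Pack SPARC Format 3a fields (i = 0) into a 32-bit instruction word."""
    i = 0  # Register sub-format: second operand is rs2.
    return (op << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | (i << 13) | (asi << 5) | rs2


# add %l2, %l3, %l1:  rs1 = %l2 = 18,  rs2 = %l3 = 19,  rd = %l1 = 17.
word = encode_sparc_3a(op=2, rd=17, op3=0, rs1=18, rs2=19)
print(f"{word:08x}")  # a2048013
```

# Note that rd sits near the top of the word in SPARC but is the last
# operand in the assembly language.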


 ## RISC-V Format R (R-type)
# _________________________________________________________________
# | funct7      | rs2     | rs1     | fn3 | rd      | opcode      |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:25:  funct7                               Part of opcode.
# 24:20:  rs2           (Register Source 2)    Source register two.
# 19:15:  rs1           (Register Source 1)    Source register one.
# 14:12:  fn3           (funct3)               Part of opcode
# 11: 7:  rd                                   Destination
#  6: 0:  opcode                               Opcode
#
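# A Python sketch (an illustration only; encode_riscv_r is a made-up
# name) packing the R-type fields above. It assumes add uses opcode
# 0x33 with funct3 = 0 and funct7 = 0.

```python
def encode_riscv_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack RISC-V R-type fields into a 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


# add x1, x2, x3:  rd = 1,  rs1 = 2,  rs2 = 3.
word = encode_riscv_r(funct7=0, rs2=3, rs1=2, funct3=0, rd=1, opcode=0x33)
print(f"{word:08x}")  # 003100b3
```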



 ## MIPS I Format
# _________________________________________________________________
# | opcode    | rs      | rt      | immed                         |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               Entire opcode (for I and J).
# 25:21:  rs            (Register Source)      Source register one.
# 20:16:  rt            (Register Target)      Destination (addi, lw) or source two (beq, sw).
# 15:0:   immed         (Immediate)            Immediate value.

        addi $t0, $t1, 2          lw $t0, 4($t1) 
        lui $t0, 0x1234           beq $t0, $t1   TARG
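
# A Python sketch (an illustration only; decode_mips_i is a made-up
# name) going the other direction: pulling the I Format fields out of
# an instruction word. The immediate is sign-extended here, as for
# addi, lw, and beq; andi, ori, and xori zero-extend instead.

```python
def decode_mips_i(word):
    """Split a MIPS I Format word into (opcode, rs, rt, signed immediate)."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    immed = word & 0xFFFF
    if immed & 0x8000:       # Sign bit of the 16-bit immediate is set.
        immed -= 0x10000
    return opcode, rs, rt, immed


# 0x21280002 is addi $t0, $t1, 2:  opcode 8,  rs = $t1 = 9,  rt = $t0 = 8.
print(decode_mips_i(0x21280002))  # (8, 9, 8, 2)
```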


 ## SPARC Format 3b (op =2 (non-memory) or 3 (memory),  i = 1)
# _________________________________________________________________
# | op| rd      | op3       | rs1     |i| simm13                  |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:19:  op3           Opcode for format 3.
# 18:14:  rs1           Source operand 1 register number.
# 13:13:  i             Immediate Sub-format. One in this case.
# 12:00:  simm13        The immediate.

# Used for memory and arithmetic / logical.

        add %l1, 2, %l0;          ld [%l1+4], %l0


 ## RISC-V Format I (I-type)
# _________________________________________________________________
# | imm[11:0]             | rs1     | fn3 | rd      | opcode      |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:20:  imm[11:0]                            A 12-bit immediate.
# 19:15:  rs1           (Register Source 1)    Source register one.
# 14:12:  fn3           (funct3)               Part of opcode
# 11: 7:  rd                                   Destination
#  6: 0:  opcode                               Opcode
#
# Note: In load and store instructions fn3 is width of value.




 ## SPARC Format 2a (op = 0,  op2 = 4) (sethi)
# _________________________________________________________________
# | op| rd      | op2 | imm22                                     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:25:  rd            Destination Register
# 24:22:  op2           Opcode for format 2.
# 21:00:  imm22         The immediate.
#
# Used for sethi.

 lui r1, 0x1234
 ori r1, r1, 0x5678 

 sethi %hi(0x12345678), %l1
 or %l1, %lo(0x12345678), %l1

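 # 0x1234c000:  Low 10 bits are zero, so SPARC needs just a sethi ..
 # .. while MIPS still needs both lui and ori.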
 lui r1, 0x1234
 ori r1, r1, 0xc000

 sethi %hi(0x1234c000), %l1
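
 # A Python sketch (an illustration only) of the arithmetic behind the
 # two sequences above. The functions hi, lo, sethi, and lui are
 # made-up models of the assembler macros and instructions, operating
 # on plain integers.

```python
def hi(value):
    """Model of SPARC %hi: the upper 22 bits of a 32-bit constant."""
    return (value >> 10) & 0x3FFFFF


def lo(value):
    """Model of SPARC %lo: the lower 10 bits of a 32-bit constant."""
    return value & 0x3FF


def sethi(imm22):
    """Model of sethi: place imm22 in bits 31:10, clear bits 9:0."""
    return (imm22 << 10) & 0xFFFFFFFF


def lui(imm16):
    """Model of MIPS lui: place imm16 in bits 31:16, clear bits 15:0."""
    return (imm16 << 16) & 0xFFFFFFFF


c = 0x12345678
print(hex(sethi(hi(c))))           # 0x12345400
print(hex(sethi(hi(c)) | lo(c)))   # 0x12345678  (sethi then or)
print(hex(lui(0x1234) | 0x5678))   # 0x12345678  (lui then ori)

# For 0x1234c000 sethi alone is enough, lui alone is not.
print(hex(sethi(hi(0x1234c000))))  # 0x1234c000
print(hex(lui(0x1234)))            # 0x12340000
```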


 ## SPARC Format 2b (op = 0,  op2 = 2, 6, or 7) (branches)
# _________________________________________________________________
# | op|a| cond  | op2 | imm22                                     |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:29:  a             Typical: If one, instruction in delay slot annulled.
# 28:25:  cond          Condition. Some function of condition code register.
# 24:22:  op2           Opcode for format 2.
# 21:00:  imm22         The immediate, a branch displacement.
#
# Used for branches.


 ## MIPS J Format
# _________________________________________________________________
# | opcode    | ii                                                |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Unabbreviated Name     Typical Use
#
# 31:26:  opcode                               Entire opcode (for I and J).
# 25:0:   ii            (Instruction Index)    Part of jump target.



 ## SPARC Format 1 (op = 1)
# _________________________________________________________________
# | op| disp30                                                    |
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
#  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
#
# Bits    Field Name    Description
#
# 31:30:  op            Opcode
# 29:00:  disp30        Call target displacement, in instructions.
#
# Used for calls.



## Shift Instructions

# MIPS:  sllv, srlv, srav, sll,  srl,  sra    Format R (All)
# SPARC: sll,  srl,  sra                      Format 3a, 3b


# MIPS constant shift instructions use special sa field, DLX uses immed field.

 ## Shift Amount in Register

        sllv $1, $2, $3     # MIPS       R Format
        sll  %g2, %g3, %g1  # SPARC      Format 3a

 ## Shift Amount in Instruction

        sll  $1, $2, 5      # MIPS       R Format
        sll  %g2, 5, %g1    # SPARC      Format 3b

 ## Shift Amount Fields
#
#  MIPS:   Special sa field
#  SPARC:  Immediate field.


## Load Upper

# Loads the upper bits of a register with a constant.

# MIPS:  lui   Load upper 16 bits.
# SPARC: sethi Load upper 22 bits.

        lui $1, 0x1234       # MIPS
        sethi 0x123456, %l1  # SPARC

        # SPARC assembler "hi" macro extracts upper 22 bits from constant,
        # and "lo" macro extracts lower 10 bits (even though insn could use
        # thirteen bits).

        sethi %hi(0x12345678), %l1  #  %l1 = 0x12345400
        or %l1, %lo(0x12345678), %l1



## Load and Store Instructions

# MIPS:      lb,   lbu,  lh,   lhu,  lw, sb,  sh,  sw
# SPARC:     ldsb, ldub, ldsh, lduh, ld, stb, sth, st
# (SPARC instruction does same thing as MIPS above, e.g., lhu same as lduh).

        lw $1, 16($2)      # MIPS      I Format
        lw x1, 16(x2)      # RISC-V    I Format
        ld [%g2+16], %g1   # SPARC     Format 3b
        ldr x1, [x2, 16]   # ARM A64

        # No comparable MIPS.  (Note sum of two registers.)
        ld [%g2+%g3], %g1  # SPARC     Format 3a 
        ldr x1, [x2, x3, lsl 3]  # ARM A64  x1 = Mem[ x2 + ( x3 << 3 ) ]

        sw $1, 16($t5)     # MIPS      I Format
        sw x1, 16(x2)      # RISC-V    S Format
        st %g1, [%g5+16]   # SPARC
        st %g1, [%g5+%g6]  # SPARC


## Integer Branches

# MIPS: beq, bne, bgtz, bgez, bltz, blez
# DLX:  beqz, bnez
# SPARC: be, bne, bg, bge, bl, ble
# RISC-V: beq, bne, blt, bltu, bge, bgeu.

# Integer Condition Code Register: ICC, four bits:  N, Z, V, C
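#
# A Python sketch (an illustration only) of how a SPARC subcc sets the
# four icc bits and how signed branches test them. The function names
# subcc and branch_taken are made up; the conditions used are the
# usual ones for signed compares (less-than is N xor V).

```python
MASK32 = 0xFFFFFFFF


def subcc(a, b):
    """Return (result, icc) for the 32-bit subtraction a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    icc = {
        "N": result >> 31,                    # Result is negative.
        "Z": int(result == 0),                # Result is zero.
        "V": ((a ^ b) & (a ^ result)) >> 31,  # Signed overflow.
        "C": int(a < b),                      # Borrow (unsigned a < b).
    }
    return result, icc


def branch_taken(mnemonic, icc):
    """Evaluate a signed SPARC branch condition against icc."""
    n, z, v = icc["N"], icc["Z"], icc["V"]
    less = n ^ v
    conditions = {
        "be": z, "bne": 1 - z,
        "bl": less, "bge": 1 - less,
        "ble": z | less, "bg": 1 - (z | less),
    }
    return bool(conditions[mnemonic])


_, icc = subcc(3, 5)
print(icc)                      # {'N': 1, 'Z': 0, 'V': 0, 'C': 1}
print(branch_taken("bl", icc))  # True
print(branch_taken("bg", icc))  # False
```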

# MIPS:  Branches have delay slots.
#        Can compare registers.
#
# DLX:   No delay slots.
#        Can only test if a register is zero.
#
# SPARC: Delay slots (usually).
#        Branch based on condition codes. (Covered later.)
#
# RISC-V:No delay slots.
#        Can compare equality or magnitude of two registers.
#
# ARM A64: No delay slots.
#        Uses condition code registers.

# Later RISC ISAs:
#        No delay slot, because little advantage, more complexity
#        in implementations > 5 stages, superscalar, etc.
#
#        Can do register comparison in branch. 

        # MIPS
        beq $3, $4, TARGET
        nop
        xor $5, $6, $7

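        # SPARC
        subcc %l3, %l4, %l6  # %l6 = %l3 - %l4;  icc (N, Z, V, C) set from result.

        sub %l2, %l1, %l0    # Plain sub: icc is not modified ..
        bg TARGET            # .. so this still tests the subcc above: %l3 > %l4.
        nop                  # Delay slot.

        bl TARGET            # Taken if %l3 < %l4 (signed), again from the subcc.
        nop                  # Delay slot.
        xor %l6, %l7, %l5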

 ## SPARC Annulled Branches
#
# Annulled Conditional Branch Behavior (SPARC):
#   If taken delay slot executed.
#   If not taken delay slot not executed.
#   (Behavior of branch always and never is different.)

        # SPARC
        be,a TARGET        # Delayed branch
        add %l2, %l3, %l1  # Executed only if branch taken.
        xor %l6, %l7, %l5

# Annulled Branches For Short if/else Statements
#
# if ( i1 == i2 ) l10 = l2 + l3; else l11 = l2 - l3;
# g1 = g2 ^ g3;

        subcc %i1, %i2, %g0
        be IF_PART
        nop
        ba SKIP
        sub %l2, %l3, %l11    # Else part (not executed if branch taken)
IF_PART:
        add %l2, %l3, %l10    # If part (not executed if branch not taken)
SKIP:
        xor %g2, %g3, %g1


        subcc %i1, %i2, %g0
        be,a  SKIP
        add %l2, %l3, %l10    # If part (not executed if branch not taken)
        sub %l2, %l3, %l11    # Else part (not executed if branch taken)
SKIP:
        xor %g2, %g3, %g1


## Jump

# MIPS:  j, jr
# SPARC: Special cases of jump and link (jmpl) and branch instruction.

# MIPS:  Delayed
# SPARC: Delayed

# MIPS:  Immediate is region.
# SPARC: Immediate is displacement.

# PC= 0x12345678

# ii 0x3ffffff

# Region Address
# PC=   0x12345678
# 4ii   0x0ffffffc
# targ  0x1ffffffc
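#
# A Python sketch (an illustration only) contrasting the two schemes.
# The function names are made up. The MIPS version follows the worked
# example above and takes the high four bits from PC; the architecture
# actually takes them from the delay-slot instruction address, which
# is in the same region except at a region boundary.

```python
def mips_jump_target(pc, ii):
    """Region jump: high 4 bits from pc, low 28 bits from 4 * ii."""
    return (pc & 0xF0000000) | ((ii & 0x3FFFFFF) << 2)


def sparc_branch_target(pc, disp22):
    """Displacement jump: pc plus 4 times the sign-extended 22-bit immediate."""
    disp22 &= 0x3FFFFF
    if disp22 & 0x200000:    # Sign bit of the 22-bit field is set.
        disp22 -= 0x400000
    return (pc + 4 * disp22) & 0xFFFFFFFF


print(hex(mips_jump_target(0x12345678, 0x3FFFFFF)))     # 0x1ffffffc
print(hex(sparc_branch_target(0x12345678, 0x3FFFFF)))   # 0x12345674 (disp is -1)
```

# A region jump can reach anywhere in the current 256 MiB region but
# never leaves it; a displacement jump reaches a fixed distance in
# either direction from the jump itself.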


        # MIPS: TARGET is a 26-bit region.
        j TARGET
        nop

        # SPARC
        # Use branch always instruction for jumps.
        # TARGET is specified using a 22-bit displacement.
        ba TARGET
        nop

        # MIPS
        # TARGET in $t0
        jr $t0
        nop

        # SPARC
        # TARGET in %l0, %g0 is zero register.
        jmpl %l0 + 0, %g0
        nop

TARGET:


## Jump and Link Instructions

# MIPS:  jal, jalr
# SPARC: jmpl rs1 + simm13, rd (Jump to rs1 + simm13, rd = PC. )
# SPARC: jmpl rs1 + rs2, rd    (Jump to rs1 + rs2, rd = PC. )
# SPARC: call disp30           (Jump to PC + 4 * disp30, o7 = PC. )

# MIPS: jal, jalr: Register 31 holds return address (link) by default.
# MIPS: jalr: Can specify a different return address register.

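        # Indirect jump, no link.
        jr $t1                # MIPS
        jmpl %l1 + %g0, %g0   # SPARC

        # Indirect procedure call and return.
        jalr $t1            # MIPS
        jmpl %l1 + 0, %o7   # SPARC: Procedure call.  o7 = jmpl addr;  l1 is jump targ
        add                 # Delay slot insn.
        ...
        jmpl %o7 + 8, %g0  # Procedure return. (Skips over call and its delay slot.)

        # Direct procedure call.
        jal TARG            # MIPS
        call TARG           # SPARC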



TARG:

################################################################################
## Floating Point Summary

 ## Separate Floating Point Registers
#
# A feature of many RISC ISAs.
# Eases implementation.

 ## MIPS Floating Point 
#
# Supports IEEE 754 Single and Double FP Numbers
#
# Floating point handled by co-processor 1, one of 4 co-processors.
#
# MIPS floating point registers also called co-processor 1 registers.
# MIPS floating point instructions called co-processor 1 instructions.
#
# Registers named f0-f31.
# Load, store, and move instructions have "c1" in their names.
# Arithmetic instructions use ".s" (single), ".d" (double), or ".w" (int)
#  /completers/ to indicate operand type.


 ## SPARC Floating Point 
#
# Supports IEEE 754 Single, Double, Extended (128-bit) FP Numbers
#
# Storage for FP registers called the FP register file.
#
# Registers named %f0-%f31.
# Load and store instruction names end in "f" or "d"
# Arithmetic instructions start with "f" and end with "s" (single),
#  "d" (double), or "q" (quad).

 ## Types of Floating-Point Instructions
#
# Briefly here, in detail later.
#
#
 ## Arithmetic Operations
#
# Add double-precision (64-bit) operands.
#
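# MIPS:  add.d $f0, $f2, $f4
add.d $f0, $f2, $f4  # {$f0,$f1} = { $f2, $f3 } + { $f4, $f5 }
# add.d $f0, $f2, $f5 # Illegal in MIPS 32: double operand in odd-numbered reg.
# SPARC: faddd %f0, %f2, %f4   # Double: {f4,f5} = {f0,f1} + {f2,f3}. Dest is last.
# SPARC: fadds %f0, %f2, %f4   # Single: f4 = f0 + f2
# SPARC: faddq %f0, %f4, %f8   # Quad:   {f8-f11} = {f0-f3} + {f4-f7}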
#
#
 ## Load and Store
#
# Load double (eight bytes into two consecutive registers).
#
# MIPS:  ldc1 $f0, 8($t0)
# SPARC: lddf [%l0+8], %f0
#
#
 ## Move Between Register Files (E.g., integer to FP)
#
# MIPS:  mtc1   $t0, $f0
# SPARC: No such instructions.  Use store / load:
#        st %l0, [%sp+16]
#        ldf [%sp+16], %f0
#
 ## Format Conversion
#
# Convert from one format to another, e.g., integer to double.
#
# MIPS:  cvt.d.w  $f0, $f2
# SPARC: fitod %f2, %f0
#
#
 ## Floating Point Condition Code Setting
#
# Compare and set condition code.
#
# MIPS:  c.lt.d $f0, $f2  # Condition code set to true if $f0 < $f2, else false.
# SPARC: fcmpd  %f0, %f2  # Condition codes set to =, <, >, or unordered.
#
#
 ## Conditional Branch
#
# Branch on floating-point condition.
#
# MIPS:  bc1f TARGET   # Branch coprocessor 1 [condition code] false.
# SPARC: fbg TARGET    # Branch condition code greater than.
#

 ## FP Load and Store


        # MIPS
        #
        # Load word in to coprocessor 1
        lwc1 $f0, 4($t4)   #  $f0 = Mem[ $t4 + 4 ]

        #
        # Load double in to coprocessor 1
        ldc1 $f0, 0($t4)   #  $f0 = Mem[ $t4 + 0 ];  $f1 = Mem[ $t4 + 4 ]
        #
        # Store word from coprocessor 1.
        swc1 $f0, 4($t4)   #  Mem[ $t4 + 4 ] = $f0
        #
        # Store double from coprocessor 1.
        sdc1 $f0, 0($t4)   #  Mem[ $t4 + 0 ] = $f0;  Mem[ $t4 + 4 ] = $f1
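
        # Illustration only: the loads and stores above combined with an
        # arithmetic insn.  Assumes $t4 holds an 8-byte-aligned address of
        # two consecutive doubles. The offsets and registers are arbitrary.
        #
        ldc1 $f0, 0($t4)     #  {$f0,$f1} = first double
        ldc1 $f2, 8($t4)     #  {$f2,$f3} = second double
        add.d $f4, $f0, $f2  #  {$f4,$f5} = first + second
        sdc1 $f4, 16($t4)    #  Mem[ $t4 + 16 ] = sum (8 bytes)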

        # SPARC
        #
        # Load float. (NOT load double.)
        ldf [%i1+4], %f0     #  %f0 = Mem[ %i1 + 4 ]
        # Load double float. 
        lddf [%i1+8], %f0     #  %f0 = Mem[ %i1 + 8 ]; %f1 = Mem[ %i1 + 12 ]


 ## Move Instructions

        # MIPS
        #
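        # Move to coprocessor 1
        mtc1 $t0, $f0      # $f0 = $t0  (Bits copied unchanged, no conversion.)
        #
        # Move from coprocessor 1.
        mfc1 $t0, $f0      # $t0 = $f0  (Bits copied unchanged, no conversion.)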

        # SPARC
        #
        # No such instruction, instead use store and load.
        #
        st %l0, [%sp+16]     #  Mem[ %sp + 16 ] = %l0
        ldf [%sp+16], %f0    #  %f0 = Mem[ %sp + 16 ]


 ## Data Type Conversion

        # Convert between floating-point and integer formats.
        # NOTE: Values don't convert automatically, need to use these insn.

        # MIPS
        #
        # To: s, d, w;  From: s, d, w
        #
        # cvt.TO.FROM fd, fs
        #
        cvt.d.w $f0, $f2     # $f0 = convert_from_int_to_double( $f2 )
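
        # Illustration only: convert an integer held in an integer register.
        # The value must first be moved to an FP register because cvt
        # operates only on FP registers.  Registers chosen arbitrarily.
        # Assumes the implementation interlocks between mtc1 and the insn
        # that uses $f2 (otherwise an insn may be needed in between).
        #
        mtc1 $t0, $f2        # $f2 = bits of $t0 (still an integer).
        cvt.d.w $f0, $f2     # {$f0,$f1} = convert_from_int_to_double( $f2 )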

        # SPARC Conversion
        #
        # From and to:  i, x (64-bit int), s, d, q (quad)
        #
        fitos %f2, %f0      # %f0 = convert_from_int_to_single( %f2 )
        fitod %f2, %f0      # {%f0,%f1} = convert_from_int_to_double( %f2 )

 ## Setting Condition Codes

        # In preparation for a branch, set cond code based on FP comparison.

        # MIPS
        #
        # Compare:   fs COND ft
        # COND: eq, lt, le (among others).  For gt and ge swap the operands.
        # FMT: s, d
        #
        # c.COND.FMT fs, ft
        # Sets condition code to true or false.
        #
        c.lt.d $f0, $f2    # CC = $f0 < $f2
        bc1t TARG          # Branch if $f0 < $f2
        nop
        c.le.d $f2, $f0    # CC = $f2 <= $f0, that is, $f0 >= $f2
        bc1t TARG2         # Branch if $f0 >= $f2
        nop
        # Reachable?  (Consider the case in which $f0 or $f2 is NaN.)

        # SPARC
        #
        # Unlike MIPS, no need to specify kind of comparison.
        # FCC set to one of four states: <, =, >, or unordered.
        #
        fcmps  %f0, %f2
        fcmpd  %f0, %f2

 ## FP Branches

        # MIPS
        #
        # Branch insn specifies whether CC register true or false.
        #
        bc1t TARG
        nop
        #
        bc1f TARG
        nop

        # SPARC
        #
        # Branch insn specifies specific relationship, e.g., >
        #
        fbg TARGET    # Branch condition code greater than.



## Integer Multiplication and Division

 # MIPS I: Multiplication is not an ordinary integer arithmetic instruction.
 #
 # (Later versions, e.g. MIPS32, added an ordinary integer multiply, mul.)
 #
 # Early SPARC (before v8): No multiply instruction, use a multiply
 # step (mulscc) many times to perform a multiplication.
 # SPARC v8 has a multiply instruction that uses ordinary registers
 # for the low 32 bits and a special register "Y" for the high 32 bits.

 ## Differing Approaches
#
# MIPS:  Use a special integer multiply and divide unit.
# SPARC: Use any integer reg for low 32 bits and Y register for high 32 bits

 ## MIPS Multiplication
#
# Product goes in to lo and hi registers.
#
# To multiply integers:
#
# Multiply
# Move product from lo and hi (if necessary) to integer registers.

        mult $t0, $t1  # {hi,lo} = $t0 * $t1
        mflo $t2      # $t2 = $lo

        div $t0, $t1   # lo = $t0 / $t1;    hi = $t0 % $t1
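
        # Illustration only: retrieving both results of a divide and the
        # high half of a product.  Destination registers chosen arbitrarily.
        #
        div $t0, $t1
        mflo $t3       # $t3 = quotient   ( $t0 / $t1 )
        mfhi $t4       # $t4 = remainder  ( $t0 % $t1 )

        mult $t0, $t1
        mfhi $t5       # $t5 = bits 63:32 of the product.
        mflo $t6       # $t6 = bits 31:0 of the product.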


 ## SPARC Multiplication
#
#
    # l3 = low 32 bits of l1 x l2;  Y = high 32 bits
        smul %l1, %l2, %l3