LSU EE 7700-1 -- CA Research Methods -- Spring 2006
Processor Idea Evaluation
Evaluate Three Things
System Performance: performance of the whole system.
Idea Performance: performance of the idea itself, ignoring the rest of the system.
Concept Performance: is the idea really working? Is it reaching its full potential?
System Performance
Goal: Estimate improvement in overall system performance.
Performance Measures
Execution time.
Most common.
Reported as speedup over a base system.
Power
Energy
System Variations
Base: A typical system.
Idea: System with the new idea.
Other: Systems with similar or competing research ideas.
Denote the execution times of these systems t_base, t_idea, and t_other.
Speedup of idea: t_base / t_idea.
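For example, with made-up times t_base = 100 s and t_idea = 80 s, the speedup
of the idea is 100 / 80 = 1.25.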
Base System Choice
Goal:
A typical system that will be in use when your idea can be manufactured.
Practice (what's done):
As close to that goal as possible, but...
also reasonably close to what others in the literature simulate.
System Variation Experiments
Show how much better the Idea system is than the Base and Other systems.
But they don't show:
Whether the idea works on different configurations.
Why and how well it is working.
System Configuration Experiments
Show whether the idea is sensitive to configuration (performs better on some).
Vary a configuration parameter that the idea might be sensitive to (see examples).
Plot or list speedup vs. configuration parameter (e.g., cache size).
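A minimal sketch of how such a list might be produced, using made-up execution
times standing in for base and idea simulator runs at each cache size:

  // Sketch: tabulate speedup vs. cache size.  The times below are
  // made-up placeholders; real values would come from simulator runs.
  #include <cstdio>

  int main()
  {
    struct Run { int cache_kib; double t_base, t_idea; };
    const Run runs[] =  // Hypothetical execution times, in seconds.
      { { 16, 120.0, 100.0 }, { 32, 100.0, 80.0 }, { 64, 90.0, 70.0 } };

    printf( "%10s %10s\n", "Cache/KiB", "Speedup" );
    for ( const Run& r : runs )
      printf( "%10d %10.3f\n", r.cache_kib, r.t_base / r.t_idea );
    return 0;
  }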
System Configuration Experiments Example - Branch Prediction
Vary Pipeline Depth
Deeper pipelines increase importance of good branch prediction.
Deeper pipelines delay predictor table updates...
hurting some predictors.
Vary Cache Size
Smaller caches mean more misses, and so branch prediction matters less.
That is, with smaller caches, less speedup.
Predictor Table Size
The impact of table size on predictor performance is obvious.
Varying it might show that the Idea works better at smaller sizes, larger sizes, or all sizes.
System Configurations Typically Tested
Pipeline Depth
The number of stages; roughly, the time from when an instruction enters the
pipeline to when its results are available.
Memory Latency
How long it takes to retrieve something from a memory device.
Cache Size
ROB (window) Size
For dynamically scheduled (out-of-order) processors.
The number of in-flight instructions that the processor can track at once.
Fetch Width, Decode Width, Issue Width
The number of instructions a processor can fetch, decode, or start executing
in one cycle.
For example, a 4-way superscalar processor has a decode width of 4.
Number of Arithmetic and other Functional Units
Idea Size
How much chip area is given to the idea. Ideally the true area, but often
reported as the number of bits of storage the idea needs.
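For example, a hypothetical predictor idea adding a 4096-entry table of 2-bit
counters would be reported as 8192 bits (1 KiB) of storage.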
Benchmark Sensitivity
Benchmark Choices
Programs "narrow" target users expected to run.
These are programs Idea works well on.
Example: crafty (chess), go (go).
Programs "broad" target users expected to run.
These are large class of programs Idea works well on.
Example: SPECCPU integer programs. (Includes crafty.)
Programs other researchers use. (For comparison.)
Reporting Results
For at least one configuration, show results for each benchmark individually.
Might separately show results on the narrow and broad benchmark sets.
Idea Performance
How well does the idea, not the whole system, perform?
This can show how well the idea solves its problem, ignoring how important the problem is.
Example - Branch Prediction
Problem: branch mispredictions.
Show branch prediction accuracy.
Example - Cache Designs
Problem: cache misses.
Show cache hit ratio.
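A minimal sketch of the counters a simulator might keep to report these
measures; the structure and names are hypothetical, not from any particular
simulator:

  // Sketch: idea-performance counters a simulator might maintain.
  #include <cstdio>
  #include <cstdint>

  struct Idea_Stats {
    int64_t branches = 0, mispredictions = 0;   // For a predictor idea.
    int64_t cache_accesses = 0, cache_hits = 0; // For a cache idea.
  };

  void report( const Idea_Stats& s )
  {
    printf( "Prediction accuracy: %.4f\n",
            1.0 - double( s.mispredictions ) / s.branches );
    printf( "Cache hit ratio:     %.4f\n",
            double( s.cache_hits ) / s.cache_accesses );
  }

  int main()
  {
    Idea_Stats s;
    s.branches = 1000000;     s.mispredictions = 42000;  // Made-up values.
    s.cache_accesses = 500000; s.cache_hits = 470000;
    report( s );
    return 0;
  }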
Concept Performance
This section applies to Ideas that strongly depend on program and system behavior.
Does: (Concept performance as described here does apply.)
Branch predictor, cache replacement policy.
Does not: (Concept performance as described here does not apply.)
Faster division hardware.
The improvement in division speed is not affected by the numbers being divided.
Overall speedup is affected by the number of divides in a benchmark, but that's system
performance, not concept performance.
Concept
The concept is the particular behavior or situation the idea targets, and how that behavior is exploited.
Example - Local Predictor
Behavior: Repeating outcome patterns that many branches might have.
Exploitation: Use a BHT (branch history table) to remember each branch's recent
outcome pattern and a PHT (pattern history table) to remember the outcome that
follows each pattern.
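A minimal sketch of such a local predictor, with arbitrary illustrative sizes
(1024-entry BHT, 10-bit histories, 2-bit PHT counters) not taken from these
notes:

  // Sketch of a two-level local branch predictor: a BHT of per-branch
  // outcome histories indexing a PHT of 2-bit saturating counters.
  #include <cstdint>

  const int bht_entries = 1024;     // Arbitrary sizes for illustration.
  const int hist_bits   = 10;
  const int pht_entries = 1 << hist_bits;

  uint16_t bht[bht_entries];        // Recent outcomes of each branch.
  uint8_t  pht[pht_entries];        // 2-bit counters: predict taken if >= 2.

  bool predict( uint64_t pc )
  {
    unsigned hist = bht[ pc % bht_entries ] & ( pht_entries - 1 );
    return pht[ hist ] >= 2;
  }

  void update( uint64_t pc, bool taken )
  {
    uint16_t& h = bht[ pc % bht_entries ];
    unsigned hist = h & ( pht_entries - 1 );
    uint8_t& ctr = pht[ hist ];
    if ( taken  && ctr < 3 ) ctr++;
    if ( !taken && ctr > 0 ) ctr--;
    h = ( h << 1 ) | taken;         // Shift in the latest outcome.
  }

The configuration experiments above would vary exactly these table sizes.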
Concept Performance
How often does the targeted behavior occur?
For the local predictor, how many branches have repeating patterns?
How well is the behavior handled?
Are all such branches being predicted correctly?
Measuring Concept Performance
Method 1: (easier)
Use an idealized version of the idea; see the sketch after Method 2.
For example, a local predictor with an unlimited-size BHT and a large local history.
Method 2: (may be harder)
Modify the simulator to detect the behavior.
Compare the number of times the Idea finds the behavior to the number of times
the simulator does.
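A sketch of Method 1 for the local predictor: the BHT below is unlimited in
size (one entry per static branch, keyed by the full PC) and the per-pattern
counters are kept per branch, so aliasing and capacity effects disappear and
only the concept itself limits accuracy. Sizes and names are illustrative
assumptions:

  // Sketch: idealized local predictor for concept-performance measurement.
  #include <cstdint>
  #include <unordered_map>

  const int hist_bits = 16;         // "Large" history; arbitrary choice.

  struct Branch_Info {
    uint32_t hist = 0;                          // Local outcome history.
    std::unordered_map<uint32_t,uint8_t> ctrs;  // Per-pattern 2-bit counters.
  };

  std::unordered_map<uint64_t,Branch_Info> bht; // Unlimited size, keyed by PC.

  bool predict( uint64_t pc )
  {
    Branch_Info& b = bht[ pc ];
    uint32_t h = b.hist & ( ( 1u << hist_bits ) - 1 );
    auto it = b.ctrs.find( h );
    return it != b.ctrs.end() && it->second >= 2;
  }

  void update( uint64_t pc, bool taken )
  {
    Branch_Info& b = bht[ pc ];
    uint32_t h = b.hist & ( ( 1u << hist_bits ) - 1 );
    uint8_t& ctr = b.ctrs[ h ];                 // New patterns start at 0.
    if ( taken  && ctr < 3 ) ctr++;
    if ( !taken && ctr > 0 ) ctr--;
    b.hist = ( b.hist << 1 ) | taken;
  }

Comparing the accuracy of this idealized predictor with the realistic one
shows how much of the concept's potential an implementable version captures.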