02-1 Quantitative Computer Design

Design guided by measured performance.

Covered:
- Benchmarks. (1.5)
- Measures of performance. (1.5, 1.6)
- Principles and measured performance. (1.6)
- Example: Memory. (1.7)
(Numbers refer to book sections.)

EE 4720 Lecture Transparency. Formatted 11:32, 15 January 1997 from lsli02.

02-2 Benchmarks

Benchmark: program used to evaluate performance.

Uses
- Guide computer design.
- Guide purchasing decisions.
- Marketing tool.

Guiding Computer Design
- Measure overall performance.
- Determine characteristics of programs. E.g., frequency of floating-point operations.
- Determine effect of design options.

02-3 Choosing Benchmark Programs

Important: Choice of programs for evaluation.
Optimal but unrealistic: The exact set of programs the customer will run.
Problem: Computers are used for different applications.
Therefore, must model typical users' workload.

02-4 Options:

Real Programs
Programs chosen using surveys, for example.
+ Measured performance improvements apply to customer.
- Large programs hard to run on simulator. (Before system built.)

Kernels
Use part of program responsible for most execution time.
+ Easier to study.
- Not all programs have small kernels.

Toy Benchmarks
Program performs simplified version of common task.
+ Easier to study.
- May not be realistic.

Synthetic Benchmarks
Program "looks like" typical program, but does nothing useful.
+ Easier to study.
- May not be realistic.

Commonly Used Option
Overall performance: real programs.
Test specific features: synthetic benchmarks.

02-5 Benchmark Suites

Definition: a named set of programs used to evaluate a system.
Typically:
- Developed and managed by a publication or non-profit organization. E.g., Standard Performance Evaluation Corp., PC Magazine.
- Tests clearly delineated aspects of system. E.g., CPU, graphics, I/O, application.
- Specifies a set of programs and inputs for those programs.
- Specifies reporting requirements for results.

What Suites Might Measure
- Application Performance
  E.g., productivity (office) applications, database programs. Usually tests entire system.
- CPU and Memory Performance
  Ignores effect of I/O.
- Graphics Performance

02-6 Example, SPEC 95 Suites

Respected measure of CPU performance.
Managed by Standard Performance Evaluation Corporation, a non-profit organization funded by computer companies.
Measures CPU and memory performance on integer and FP code.
Uses common Unix programs such as perl, gcc, compress.
Requires that results on each program be reported.
Programs compiled with publicly available compilers and libraries.
Programs compiled with and without expert tuning.

SPEC 95 Suites and Measures

CINT95 suite of integer programs run to determine:
- SPECint95, execution time of tuned code.
- SPECint_base95, execution time of untuned code.
- SPECint_rate95, throughput of tuned code.

CFP95 suite of floating-point programs run to determine:
- SPECfp95, execution time of tuned code.
- SPECfp_base95, execution time of untuned code.
- SPECfp_rate95, throughput of tuned code.

02-7 Other Examples

BAPCO Suites, measure productivity app. performance on Windows 95.
TPC, measure "transaction processing" system performance.
WinMARK, graphics performance.

02-8 Reporting Results

Options for Combining Performance of Suite Members
(This is harder than it sounds.)
Let $n$ denote the number of programs in the suite.
Let $t_i$ denote the run time of program $i$.

Run times of suite members combined using:

- Arithmetic Mean of Execution Times
  $$\frac{1}{n} \sum_{i=1}^{n} t_i$$
  Emphasizes programs with longest running time.

- Weighted Arithmetic Mean
  Let $w_i \in [0, 1]$ be the weight (importance) of program $i$, where $\sum_{i=1}^{n} w_i = 1$.
  The weighted arithmetic mean is given by:
  $$\sum_{i=1}^{n} w_i t_i$$
  Emphasizes programs based on importance to users.

- Harmonic Mean of Execution Times
  $$\left( \frac{1}{n} \sum_{i=1}^{n} \frac{1}{t_i} \right)^{-1}$$
  Emphasizes programs with shortest running time.

02-9

- Geometric Mean of Execution Times
  $$\sqrt[n]{\prod_{i=1}^{n} t_i}$$
  Useful property: $\frac{\mathrm{GM}(X_i)}{\mathrm{GM}(Y_i)} = \mathrm{GM}\!\left(\frac{X_i}{Y_i}\right)$.
  Emphasizes programs with large change in performance.

- Normalized Execution Time
  For program $i$: $t_i / t'_i$, where $t'_i$ is the execution time on a reference machine.
  Emphasizes performance relative to a common computer.
  SPEC 95 reference: Sun SPARCstation 10.

- Geometric Mean of Normalized Execution Times
  $$\sqrt[n]{\prod_{i=1}^{n} \frac{t_i}{t'_i}} = \frac{\sqrt[n]{\prod_{i=1}^{n} t_i}}{\sqrt[n]{\prod_{i=1}^{n} t'_i}}$$
  Insensitive to relative performance of suite members on reference machine.

02-10 Measuring Performance

Bottom-Line Measures:
Execution Time: [of a particular program] Time from program start to finish.
Throughput: [of a collection of programs] Work per unit time.

Execution time important to users. (Obviously.)
Throughput important to accountants.

02-11 How Execution Time Reported By OS

Total Run Time Called
- Elapsed Time
- Response Time
- Wall-Clock Time

During execution CPU may be in:
- User mode: Possibly running "our" program.
- System mode: Possibly running OS for our program.
- In user or system mode running someone else's program.
- Idle.
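The difference between elapsed time and the time the CPU spends in user and system mode for one process can be observed from inside a program. The following is a minimal sketch, assuming Python's standard os.times() and time.perf_counter(); the names measure and busy_loop are hypothetical and exist only for this illustration.

```python
import os
import time


def measure(work):
    """Run work() and return (user, system, elapsed) in seconds.

    user and system are CPU times charged to this process;
    elapsed is wall-clock time and also includes time spent
    idle or running other processes.
    """
    cpu_before = os.times()
    wall_before = time.perf_counter()
    work()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    user = cpu_after.user - cpu_before.user
    system = cpu_after.system - cpu_before.system
    elapsed = wall_after - wall_before
    return user, system, elapsed


def busy_loop(n=200_000):
    """Hypothetical CPU-bound workload: mostly user-mode time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    for name, work in (("busy", busy_loop), ("sleep", lambda: time.sleep(0.05))):
        user, system, elapsed = measure(work)
        print(f"{name}: user {user:.3f}s  system {system:.3f}s  elapsed {elapsed:.3f}s")
```

On a loaded system the elapsed time of the busy loop can be noticeably larger than user plus system time, while the sleeping workload accumulates elapsed time with almost no CPU time. The CPU times reported by the OS may have coarse resolution, so short runs can show zero.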
Reported by Unix time Utility
For a particular program (process):
- User Time
- System Time
- Elapsed Time

Additional Performance Measures Using Above
Call a system that is running only the benchmark program unloaded.
System Performance: Elapsed time on an unloaded system.
CPU Performance: Sum of User and System time on an unloaded system.

02-12 Components of CPU Performance

CPU Performance Product of Three Components:
- Clock Frequency ($\phi$)
  Determined by technology and influenced by organization.
- Instructions Executed Per Clock Cycle ($1/\mathrm{CPI}$)
  Determined by organization.
- Instruction Count ($\mathrm{IC}$)
  Determined by program and ISA.

$$\text{Execution time} = \frac{1}{\phi} \times \mathrm{CPI} \times \mathrm{IC}$$

02-13 Principles of Computer Design

Principles computer designers apply widely.
- Make the common case fast.
  Obviously.
- Amdahl's Law: Don't make common case too fast.
  As the speed of one part increases, the impact of speeding it up further on total performance drops.
- Locality of Reference.
  Temporal: It might happen again soon.
  Spatial: It might happen to your neighbors soon too.
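Two of the relations above lend themselves to a small numeric check: execution time as CPI times instruction count divided by clock frequency, and Amdahl's Law. The sketch below is illustrative only. The function names and the machine parameters are hypothetical, and the Amdahl's Law formula used, speedup = 1 / ((1 - f) + f / s), is the standard statement of the law rather than something given on these slides.

```python
def execution_time(clock_hz, cpi, instruction_count):
    """Execution time in seconds: (1 / clock frequency) * CPI * IC."""
    return cpi * instruction_count / clock_hz


def amdahl_speedup(fraction, part_speedup):
    """Overall speedup when `fraction` of the original execution time
    is made `part_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / part_speedup)


if __name__ == "__main__":
    # Hypothetical machine: 200 MHz clock, CPI of 1.5, 1e9 instructions.
    print("execution time (s):", execution_time(200e6, 1.5, 1e9))

    # Hypothetical program: the common case is 80% of execution time.
    # Speeding it up more and more yields diminishing overall returns.
    for s in (2, 10, 100, 1000):
        print(f"part speedup {s:5d} -> overall {amdahl_speedup(0.8, s):.3f}")
```

With 80% of the time in the sped-up part, the overall speedup can never reach 1 / (1 - 0.8) = 5 no matter how fast that part becomes, which is the sense in which the common case should not be made "too fast".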