CDA3101 L05 Benchmarks MSS


1/22

CDA 3101 Fall 2013

Introduction to Computer Organization

Benchmarks

30 August 2013


2/22

Overview

Benchmarks

Popular benchmarks

Linpack

Intel's iCOMP

SPEC Benchmarks

MIPS Benchmark

Fallacies and Pitfalls


3/22

Benchmarks

Benchmarks measure different aspects of component and system performance

Ideal situation: use real workload

Types of Benchmarks

    Real programs

    Kernels

    Toy benchmarks

    Synthetic benchmarks

Risk: adjust design to benchmark requirements

(partial) solution: use real programs and update constantly

    Engineering or scientific applications

    Software development tools

    Transaction processing

    Office applications


4/22

A Benchmark Story

1. You create a benchmark called the vmark

2. Run it on lots of different computers

3. Publish the vmarks at www.vmark.org

4. vmark and www.vmark.org become popular

    Users start buying their PCs based on vmark

    Vendors start banging on your door

5. Vendors examine the vmark code and fix up their compilers and/or microarchitecture to run vmark

6. Your vmark benchmark has been broken

7. Create vmark2.0


5/22

Performance Reports

Reproducibility

Include hardware / software configuration (SPEC)

Evaluation process conditions

Summarizing performance

Total time:

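Arithmetic mean: $AM = \frac{1}{n} \sum_{i=1}^{n} \text{exec time}_i$

Harmonic mean: $HM = \frac{n}{\sum_{i=1}^{n} 1/\text{rate}_i}$

Weighted mean: $WM = \sum_{i=1}^{n} w_i \cdot \text{exec time}_i$

Geometric mean: $GM = \left( \prod_{i=1}^{n} \text{exec time ratio}_i \right)^{1/n}$

$\frac{GM(X_i)}{GM(Y_i)} = GM\left(\frac{X_i}{Y_i}\right)$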
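The four means can be tried out on a small example. The sketch below is only an illustration: the execution times for the two machines X and Y are made-up numbers, and the function names are chosen here, not taken from any benchmark suite. It also checks the ratio property of the geometric mean numerically.

```python
import math

def arithmetic_mean(times):
    """AM = (1/n) * sum of execution times."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """HM = n / sum of (1 / rate_i)."""
    return len(rates) / sum(1.0 / r for r in rates)

def weighted_mean(weights, times):
    """WM = sum of w_i * exec_time_i (weights are expected to sum to 1)."""
    return sum(w * t for w, t in zip(weights, times))

def geometric_mean(ratios):
    """GM = (product of ratios)^(1/n), computed via logs for stability."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical execution times (seconds) of three programs on machines X and Y.
x = [2.0, 8.0, 4.0]
y = [1.0, 2.0, 8.0]

gm_x = geometric_mean(x)
gm_y = geometric_mean(y)
gm_of_ratios = geometric_mean([a / b for a, b in zip(x, y)])

print("AM(X) =", arithmetic_mean(x))
print("GM(X) / GM(Y) =", gm_x / gm_y)
print("GM(X / Y)     =", gm_of_ratios)
```

The last two printed values agree (up to rounding), which is the property GM(Xi) / GM(Yi) = GM(Xi / Yi).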


6/22

Ex.1: Linpack Benchmark

Mother of all benchmarks

Time to solve a dense system of linear equations

! Inner loop (DAXPY): DY = DY + DA * DX
DO I = 1, N
    DY(I) = DY(I) + DA * DX(I)
END DO

Metrics

Rpeak: system peak Gflops

Nmax: matrix size that gives the highest Gflops

N1/2: matrix size that achieves half the rated Rmax Gflops

Rmax: the Gflops achieved for the Nmax size matrix

Used in http://www.top500.org
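As a rough illustration of how a Gflops figure is derived, the sketch below re-expresses the loop above in Python and converts a flop count and an elapsed time into a rate. The function names and the vector length are assumptions made for this example. An interpreted Python loop says nothing about what the hardware can really do, so only the arithmetic is of interest here.

```python
import time

def daxpy(da, dx, dy):
    """Update dy in place: dy[i] = dy[i] + da * dx[i]."""
    for i in range(len(dx)):
        dy[i] += da * dx[i]
    return dy

def daxpy_flops(n):
    """One multiply and one add per element."""
    return 2 * n

def gflops(flop_count, seconds):
    """Floating-point operations per second, in units of 10^9."""
    return flop_count / seconds / 1e9

n = 100_000  # hypothetical vector length
dx = [1.0] * n
dy = [2.0] * n

start = time.perf_counter()
daxpy(3.0, dx, dy)
elapsed = time.perf_counter() - start

print(f"{daxpy_flops(n)} flops in {elapsed:.6f} s = "
      f"{gflops(daxpy_flops(n), elapsed):.4f} Gflops")
```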


7/22


8/22

Ex.3: SPEC CPU Benchmarks

Standard Performance Evaluation Corporation (SPEC)

Need to update/upgrade benchmarks

    Longer run time

    Larger problems

    Application diversity

Rules to run and report

    Baseline and optimized

    Geometric mean of normalized execution times

    Reference machine: Sun Ultra5_10 (300-MHz SPARC, 256MB)

CPU2006: latest SPEC CPU benchmark (4th version)

    12 integer and 17 floating point programs

    Metrics: response time and throughput

www.spec.org
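A minimal sketch of "geometric mean of normalized execution times": each program's time on the reference machine is divided by its time on the machine under test, and the ratios are combined with a geometric mean. The program names and all times below are hypothetical, and any scaling factor that SPEC applies to published scores is left out.

```python
import math

def spec_ratio(reference_time, measured_time):
    """Normalized ratio: larger means faster than the reference machine."""
    return reference_time / measured_time

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical execution times in seconds (not real SPEC data).
reference = {"prog_a": 1400.0, "prog_b": 1800.0, "prog_c": 1100.0}
measured = {"prog_a": 700.0, "prog_b": 450.0, "prog_c": 1100.0}

ratios = [spec_ratio(reference[name], measured[name]) for name in reference]
score = geometric_mean(ratios)

print("ratios:", ratios)
print("geometric mean of ratios:", score)
```

With these numbers the ratios are 2, 4 and 1, so the combined score is the cube root of 8, that is, 2.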


9/22


1989-2006

Previous Benchmarks, now retired


10/22


Observe: We will use SPEC 2000 & 2006 CPU benchmarks in this set of notes.

Task: However, you are asked to read about the SPEC 2006 CPU benchmark suite, described at www.spec.org/cpu2006

Result: Compare SPEC 2006 with SPEC 2000 data (www.spec.org/cpu2000) to answer the extra-credit questions in Homework #2.


11/22

SPEC CINT2000 Benchmarks

1. 164.gzip C Compression

2. 175.vpr C FPGA Circuit Placement and Routing

3. 176.gcc C C Programming Language Compiler

4. 181.mcf C Combinatorial Optimization

5. 186.crafty C Game Playing: Chess

6. 197.parser C Word Processing

7. 252.eon C++ Computer Visualization

8. 253.perlbmk C PERL Programming Language

9. 254.gap C Group Theory, Interpreter

10. 255.vortex C Object-oriented Database

11. 256.bzip2 C Compression

12. 300.twolf C Place and Route Simulator


12/22

SPEC CFP2000 Benchmarks

1. 168.wupwise F77 Physics / Quantum Chromodynamics

2. 171.swim F77 Shallow Water Modeling

3. 172.mgrid F77 Multi-grid Solver: 3D Potential Field

4. 173.applu F77 Parabolic / Elliptic Partial Differential Equations

5. 177.mesa C 3-D Graphics Library

6. 178.galgel F90 Computational Fluid Dynamics

7. 179.art C Image Recognition / Neural Networks

8. 183.equake C Seismic Wave Propagation Simulation

9. 187.facerec F90 Image Processing: Face Recognition

10. 188.ammp C Computational Chemistry

11. 189.lucas F90 Number Theory / Primality Testing

12. 191.fma3d F90 Finite-element Crash Simulation

13. 200.sixtrack F77 High Energy Nuclear Physics Accelerator Design

14. 301.apsi F77 Meteorology: Pollutant Distribution


13/22

SPECINT2000 Metrics

SPECint2000: The geometric mean of 12 normalized ratios (one for each integer benchmark) when each benchmark is compiled with "aggressive" optimization

SPECint_base2000: The geometric mean of 12 normalized ratios when compiled with "conservative" optimization

SPECint_rate2000: The geometric mean of 12 normalized throughput ratios when compiled with "aggressive" optimization

SPECint_rate_base2000: The geometric mean of 12 normalized throughput ratios when compiled with "conservative" optimization


14/22

SPECint_base2000 Results

Alpha/Tru64

21264 @ 667 MHz

Mips/IRIX

R12000@ 400MHz

Intel/NT 4.0

PIII @ 733 MHz


15/22

SPECfp_base2000 Results

Alpha/Tru64

21264 @ 667 MHz

Mips/IRIX

R12000@ 400MHz

Intel/NT 4.0

PIII @ 733 MHz


16/22

Effect of CPI: SPECint95 Ratings

Microarchitecture improvements

CPU time = IC * CPI * clock cycle time
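A small worked example of the CPU time equation, with made-up numbers: the instruction count and clock rate are held fixed while a microarchitecture improvement lowers CPI. The function name and all values are assumptions for illustration only.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = IC * CPI * clock cycle time, with cycle time = 1 / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical program: 10^9 instructions on a 500 MHz processor.
base = cpu_time(1_000_000_000, 2.0, 500e6)      # CPI = 2.0
improved = cpu_time(1_000_000_000, 1.6, 500e6)  # CPI lowered to 1.6

print("base CPU time (s):    ", base)
print("improved CPU time (s):", improved)
print("speedup:", base / improved)
```

Lowering CPI from 2.0 to 1.6 at the same clock rate cuts CPU time from 4.0 s to about 3.2 s, a speedup of about 1.25.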


17/22

Effect of CPI: SPECfp95 Ratings

Microarchitecture improvements


18/22

SPEC Recommended Readings

SPEC 2006

Survey of Benchmark Programs

http://www.spec.org/cpu2006/publications/CPU2006benchmarks.pdf

SPEC 2006 Benchmarks - Journal Articles on Implementation Techniques and Problems

http://www.spec.org/cpu2006/publications/SIGARCH-2007-03/

SPEC 2006 Installation, Build, and Runtime Issues

http://www.spec.org/cpu2006/issues/


19/22

Another Benchmark: MIPS

Millions of Instructions Per Second

$\text{MIPS} = \frac{IC}{\text{CPU time} \times 10^{6}}$

Comparing apples to oranges

Flaw: 1 MIPS on one processor does not accomplish the same work as 1 MIPS on another

    It is like determining the winner of a foot race by counting who used fewer steps

    Some processors do FP in software (e.g. 1 FP = 100 INT)

    Different instructions take different amounts of time

Useful for comparisons between 2 processors from the same vendor that support the same ISA with the same compiler (e.g. Intel's iCOMP benchmark)
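The flaw can be seen with two hypothetical machines running the same program. Machine A has floating-point hardware. Machine B emulates FP in software and therefore executes many more (simple) instructions. All numbers below are invented for illustration.

```python
def mips(instruction_count, cpu_time_seconds):
    """MIPS = IC / (CPU time * 10^6)."""
    return instruction_count / (cpu_time_seconds * 1e6)

# Same program, hypothetical measurements.
a_ic, a_time = 100e6, 1.0  # machine A: FP in hardware
b_ic, b_time = 600e6, 3.0  # machine B: FP in software, more instructions

print("A:", mips(a_ic, a_time), "MIPS,", a_time, "s")
print("B:", mips(b_ic, b_time), "MIPS,", b_time, "s")
```

Machine B reports twice the MIPS rating of machine A (200 vs. 100) yet takes three times as long to finish the program.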


20/22

Fallacies and Pitfalls

Ignoring Amdahl's law

Using clock rate or MIPS as a performance metric

Using the Arithmetic Mean of normalized CPU times (ratios) instead of the Geometric Mean

Using hardware-independent metrics

    Using code size as a measure of speed

Synthetic benchmarks predict performance

    They do not reflect the behavior of real programs

The geometric mean of CPU time ratios is proportional to the total execution time [NOT!!]
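Two of these pitfalls can be checked numerically. In the sketch below, machines A and B and their execution times are hypothetical. The arithmetic mean of normalized times gives contradictory answers depending on which machine is the reference, while the geometric mean is consistent but does not track total execution time.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalized(times, reference_times):
    """Execution times divided, program by program, by the reference machine's times."""
    return [t / r for t, r in zip(times, reference_times)]

# Hypothetical execution times (seconds) for two programs.
a = [1.0, 1000.0]
b = [10.0, 100.0]

b_vs_a = normalized(b, a)  # B normalized to A
a_vs_b = normalized(a, b)  # A normalized to B

print("AM, B normalized to A:", arithmetic_mean(b_vs_a))
print("AM, A normalized to B:", arithmetic_mean(a_vs_b))
print("GM, B normalized to A:", geometric_mean(b_vs_a))
print("GM, A normalized to B:", geometric_mean(a_vs_b))
print("total time A:", sum(a), " total time B:", sum(b))
```

Both arithmetic means come out at about 5.05, so each machine appears roughly five times slower than the other. Both geometric means are 1, which rates the machines as equal even though the total times are 1001 s and 110 s.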


21/22

Conclusions

Performance is specific to a particular program/s

    CPU time: only adequate measure of performance

For a given ISA performance increases come from:

    increases in clock rate (without adverse CPI effects)

    improvements in processor organization that lower CPI

    compiler enhancements that lower CPI and/or IC

Your workload: the ideal benchmark

You should not always believe everything you read!


22/22

Happy & Safe Holiday Weekend
