CDA3101 L05 Benchmarks MSS

  • 7/29/2019 CDA3101 L05 Benchmarks MSS

    CDA 3101 Fall 2013

    Introduction to Computer Organization

    Benchmarks

    30 August 2013

    Overview

    Benchmarks

    Popular benchmarks

    Linpack

    Intel's iCOMP

    SPEC Benchmarks

    MIPS Benchmark

    Fallacies and Pitfalls

    Benchmarks

    Benchmarks measure different aspects of component and system performance

    Ideal situation: use real workload

    Types of benchmarks: real programs, kernels, toy benchmarks, synthetic benchmarks

    Risk: adjust design to benchmark requirements

    (Partial) solution: use real programs and update constantly, e.g. engineering or scientific applications, software development tools, transaction processing, office applications

    A Benchmark Story

    1. You create a benchmark called the vmark

    2. Run it on lots of different computers

    3. Publish the vmarks in www.vmark.org

    4. vmark and www.vmark.org become popular: users start buying their PCs based on vmark, and vendors would be banging on your door

    5. Vendors examine the vmark code and fix up their compilers and/or microarchitecture to run vmark

    6. Your vmark benchmark has been broken

    7. Create vmark2.0

    Performance Reports

    Reproducibility

    Include hardware / software configuration (SPEC)

    Evaluation process conditions

    Summarizing performance

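    Total time: $T = \sum_{i=1}^{n} \text{exec time}_i$

    Arithmetic mean: $AM = \frac{1}{n} \sum_{i=1}^{n} \text{exec time}_i$

    Harmonic mean: $HM = n \big/ \sum_{i=1}^{n} \frac{1}{\text{rate}_i}$

    Weighted mean: $WM = \sum_{i=1}^{n} w_i \cdot \text{exec time}_i$

    Geometric mean: $GM = \left( \prod_{i=1}^{n} \text{exec time ratio}_i \right)^{1/n}$

    Property of the geometric mean: $\frac{GM(X_i)}{GM(Y_i)} = GM\left(\frac{X_i}{Y_i}\right)$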
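The summary statistics on this slide can be tried out with a minimal Python sketch. The function names and the sample values are illustrative assumptions, not part of the lecture; the last lines check numerically that the ratio of two geometric means equals the geometric mean of the element-wise ratios.

```python
import math

def arithmetic_mean(times):
    """AM = (1/n) * sum of execution times."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """HM = n / sum(1/rate_i); suited to rates such as Mflops."""
    return len(rates) / sum(1.0 / r for r in rates)

def weighted_mean(weights, times):
    """WM = sum(w_i * exec_time_i); weights are assumed to sum to 1."""
    return sum(w * t for w, t in zip(weights, times))

def geometric_mean(ratios):
    """GM = (product of ratios)^(1/n), computed via logs for stability."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical execution times (seconds) of two programs on machines X and Y.
x = [2.0, 8.0]
y = [1.0, 4.0]

# GM(X_i) / GM(Y_i) should equal GM(X_i / Y_i).
lhs = geometric_mean(x) / geometric_mean(y)
rhs = geometric_mean([a / b for a, b in zip(x, y)])
print(lhs, rhs)
```

Both printed values agree up to floating-point rounding, which is the property that makes the geometric mean independent of the machine chosen for normalization.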

    Ex.1: Linpack Benchmark

    Mother of all benchmarks

    Time to solve a dense system of linear equations

    DO I = 1, N

    DY(I) = DY(I) + DA * DX(I)

    END DO

    Metrics

    Rpeak: system peak Gflops

    Nmax: matrix size that gives the highest Gflops

    N1/2: matrix size that achieves half the rated Rmax Gflops

    Rmax: the Gflops achieved for the Nmax size matrix

    Used in http://www.top500.org
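The DO loop on this slide is the DAXPY kernel (y = y + a * x) that dominates Linpack's run time. As a rough illustration, the sketch below mirrors that loop in Python and shows how a Gflops figure is derived from a flop count and an elapsed time. The function names, the vector contents, and the timing value passed to `gflops` are assumptions made for the example, not measured data.

```python
def daxpy(da, dx, dy):
    """Update dy in place: dy[i] = dy[i] + da * dx[i], as in the Fortran loop."""
    for i in range(len(dx)):
        dy[i] += da * dx[i]
    return dy

def gflops(flop_count, seconds):
    """Floating-point operations per second, expressed in units of 10^9."""
    return flop_count / seconds / 1e9

# Hypothetical small vectors.
dx = [1.0, 2.0, 3.0, 4.0]
dy = [10.0, 10.0, 10.0, 10.0]
daxpy(0.5, dx, dy)

# Each element costs one multiply and one add, so a DAXPY of length n is 2n flops.
flops = 2 * len(dx)
print(dy, flops)

# Hypothetical run: 2e9 floating-point operations completed in 1 second.
print(gflops(2e9, 1.0))
```

Rmax is this kind of rate measured on the full linear solve at matrix size Nmax, while Rpeak is the theoretical upper bound of the hardware.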

    Ex.3: SPEC CPU Benchmarks

    Standard Performance Evaluation Corporation

    Need to update/upgrade benchmarks: longer run time, larger problems, application diversity

    Rules to run and report: baseline and optimized

    Geometric mean of normalized execution times

    Reference machine: Sun Ultra5_10 (300-MHz SPARC, 256MB)

    CPU2006: latest SPEC CPU benchmark (4th version), with 12 integer and 17 floating point programs

    Metrics: response time and throughput

    www.spec.org

    Ex.3: SPEC CPU Benchmarks

    1989-2006

    Previous Benchmarks, now retired

    Ex.3: SPEC CPU Benchmarks

    Observe: We will use the SPEC 2000 and 2006 CPU benchmarks in this set of notes.

    Task: You are asked to read about the SPEC 2006 CPU benchmark suite, described at www.spec.org/cpu2006

    Result: Compare SPEC 2006 with SPEC 2000 data (www.spec.org/cpu2000) to answer the extra-credit questions in Homework #2.

    SPEC CINT2000 Benchmarks

    1. 164.gzip C Compression

    2. 175.vpr C FPGA Circuit Placement and Routing

    3. 176.gcc C C Programming Language Compiler

    4. 181.mcf C Combinatorial Optimization

    5. 186.crafty C Game Playing: Chess

    6. 197.parser C Word Processing

    7. 252.eon C++ Computer Visualization

    8. 253.perlbmk C PERL Programming Language

    9. 254.gap C Group Theory, Interpreter

    10. 255.vortex C Object-oriented Database

    11. 256.bzip2 C Compression

    12. 300.twolf C Place and Route Simulator

    SPEC CFP2000 Benchmarks

    1. 168.wupwise F77 Physics / Quantum Chromodynamics

    2. 171.swim F77 Shallow Water Modeling

    3. 172.mgrid F77 Multi-grid Solver: 3D Potential Field

    4. 173.applu F77 Parabolic / Elliptic Partial Differential Equations

    5. 177.mesa C 3-D Graphics Library

    6. 178.galgel F90 Computational Fluid Dynamics

    7. 179.art C Image Recognition / Neural Networks

    8. 183.equake C Seismic Wave Propagation Simulation

    9. 187.facerec F90 Image Processing: Face Recognition

    10. 188.ammp C Computational Chemistry

    11. 189.lucas F90 Number Theory / Primality Testing

    12. 191.fma3d F90 Finite-element Crash Simulation

    13. 200.sixtrack F77 High Energy Nuclear Physics Accelerator Design

    14. 301.apsi F77 Meteorology: Pollutant Distribution

    SPECINT2000 Metrics

    SPECint2000: The geometric mean of 12 normalized ratios (one for each integer benchmark) when each benchmark is compiled with "aggressive" optimization

    SPECint_base2000: The geometric mean of 12 normalized ratios when compiled with "conservative" optimization

    SPECint_rate2000: The geometric mean of 12 normalized throughput ratios when compiled with "aggressive" optimization

    SPECint_rate_base2000: The geometric mean of 12 normalized throughput ratios when compiled with "conservative" optimization
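To make "geometric mean of normalized ratios" concrete, here is a minimal sketch. It assumes each ratio is the reference machine's time divided by the measured time, and it leaves out any scaling constant SPEC applies when reporting. The function name and all times are hypothetical, and only three benchmarks are used instead of twelve to keep the arithmetic easy to follow.

```python
import math

def spec_style_score(ref_times, measured_times):
    """Geometric mean of per-benchmark ratios (reference time / measured time)."""
    ratios = [ref / t for ref, t in zip(ref_times, measured_times)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical times in seconds for three benchmarks.
ref_times = [1400.0, 1800.0, 1100.0]   # reference machine
measured = [700.0, 450.0, 1100.0]      # machine under test

# Ratios are 2, 4, and 1, so the score is the cube root of 8.
score = spec_style_score(ref_times, measured)
print(score)
```

A "base" result and a "peak" result would come from the same calculation applied to times measured under conservative and aggressive compiler settings, respectively.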

    SPECint_base2000 Results

    [Chart: SPECint_base2000 results for Alpha/Tru64 (21264 @ 667 MHz), Mips/IRIX (R12000 @ 400 MHz), and Intel/NT 4.0 (PIII @ 733 MHz)]

    SPECfp_base2000 Results

    [Chart: SPECfp_base2000 results for Alpha/Tru64 (21264 @ 667 MHz), Mips/IRIX (R12000 @ 400 MHz), and Intel/NT 4.0 (PIII @ 733 MHz)]

    Effect of CPI: SPECint95 Ratings

    Microarchitecture improvements

    CPU time = IC * CPI * clock cycle
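The CPU time equation explains why a microarchitecture change that lowers CPI raises the rating even at an unchanged clock rate. The sketch below works one case; the instruction count, CPI values, and clock rate are hypothetical numbers chosen for illustration.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = IC * CPI * clock cycle time, with cycle time = 1 / clock rate."""
    return instruction_count * cpi / clock_rate_hz

# Hypothetical program: 2 billion instructions on a 500 MHz processor.
base = cpu_time(2e9, 2.0, 500e6)       # original microarchitecture, CPI = 2.0
improved = cpu_time(2e9, 1.6, 500e6)   # improved microarchitecture, CPI = 1.6

# Same IC and clock rate, so the speedup equals the CPI ratio 2.0 / 1.6.
speedup = base / improved
print(base, improved, speedup)
```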

    Effect of CPI: SPECfp95 Ratings

    Microarchitecture improvements

    SPEC Recommended Readings

    SPEC 2006

    Survey of Benchmark Programs

    http://www.spec.org/cpu2006/publications/CPU2006benchmarks.pdf

    SPEC 2006 Benchmarks - Journal Articles on Implementation Techniques and Problems

    http://www.spec.org/cpu2006/publications/SIGARCH-2007-03/

    SPEC 2006 Installation, Build, and Runtime Issues

    http://www.spec.org/cpu2006/issues/

    Another Benchmark: MIPS

    Millions of Instructions Per Second

    $\text{MIPS} = IC / (\text{CPU time} \times 10^6)$

    Comparing apples to oranges

    Flaw: 1 MIPS on one processor does not accomplish the same work as 1 MIPS on another

    It is like determining the winner of a foot race by counting who used fewer steps

    Some processors do FP in software (e.g. 1 FP = 100 INT)

    Different instructions take different amounts of time

    Useful for comparisons between 2 processors from the same vendor that support the same ISA with the same compiler (e.g. Intel's iCOMP benchmark)
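A small numeric sketch shows how the MIPS rating can point the wrong way. It assumes two hypothetical machines running the same program: machine A has floating-point hardware, while machine B emulates floating point in software and therefore executes many more, simpler instructions. All instruction counts and times are made up for illustration.

```python
def mips(instruction_count, cpu_time_s):
    """MIPS = IC / (CPU time * 10^6)."""
    return instruction_count / (cpu_time_s * 1e6)

# Machine A (hardware FP): fewer instructions, finishes sooner.
ic_a, time_a = 100e6, 1.0
# Machine B (software FP): many simple integer instructions, finishes later.
ic_b, time_b = 900e6, 3.0

mips_a = mips(ic_a, time_a)   # 100 MIPS
mips_b = mips(ic_b, time_b)   # 300 MIPS

# B has the higher MIPS rating, yet A completes the program three times faster.
print(mips_a, mips_b, time_b / time_a)
```

The rating rewards B for executing more instructions, not for doing the work sooner, which is the foot-race analogy from the slide.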

    Fallacies and Pitfalls

    Ignoring Amdahl's law

    Using clock rate or MIPS as a performance metric

    Using the arithmetic mean of normalized CPU times (ratios) instead of the geometric mean

    Using hardware-independent metrics, e.g. using code size as a measure of speed

    Assuming that synthetic benchmarks predict performance: they do not reflect the behavior of real programs

    Assuming that the geometric mean of CPU time ratios is proportional to the total execution time (it is not)
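Two of these pitfalls can be checked numerically. The sketch below uses hypothetical execution times for two programs on machines A and B. It shows that the arithmetic mean of normalized times makes each machine look slower than the other depending on which one is the reference, that the geometric mean gives a consistent answer, and that this consistent answer still does not track total execution time.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def normalize(times, ref_times):
    """Per-program execution time relative to a reference machine."""
    return [t / r for t, r in zip(times, ref_times)]

# Hypothetical execution times in seconds for programs 1 and 2.
a = [1.0, 1000.0]
b = [10.0, 100.0]

# Arithmetic mean: B looks 5.05x slower with A as reference,
# and A looks 5.05x slower with B as reference.
am_b_vs_a = arithmetic_mean(normalize(b, a))
am_a_vs_b = arithmetic_mean(normalize(a, b))

# Geometric mean: both normalizations report the machines as equal.
gm_b_vs_a = geometric_mean(normalize(b, a))
gm_a_vs_b = geometric_mean(normalize(a, b))

# Total time disagrees with the geometric mean: B finishes far sooner overall.
total_a, total_b = sum(a), sum(b)
print(am_b_vs_a, am_a_vs_b, gm_b_vs_a, gm_a_vs_b, total_a, total_b)
```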

    Conclusions

    Performance is specific to a particular program or set of programs

    CPU time: only adequate measure of performance

    For a given ISA, performance increases come from:

    increases in clock rate (without adverse CPI effects)

    improvements in processor organization that lower CPI

    compiler enhancements that lower CPI and/or IC

    Your workload: the ideal benchmark

    You should not always believe everything you read!

    Happy & Safe Holiday Weekend