HPC101: How to use a Supercomputer? HPC Saudi 2017

HPC101: How to use a Supercomputer?

HPC Saudi 2017

HPC101: Introduction to High Performance Computing

Saber Feki
Computational Scientist Lead
KAUST Supercomputing Core Lab

Agenda

  • 8:30am: Introduction on HPC (Dr. Saber Feki)
  • 8:40am: Overview of Shaheen II Architecture
  • 8:50am: How to get access on Shaheen (Dr. Bilel Hadri)
  • 9:05am: Programming Environment
  • 9:30am: Runtime Environment (Dr. Samuel Kortas)
  • 9:55am: Lustre Parallel Filesystem (Dr. Georgios Markomanolis)
  • 10:15am: Coffee Break
  • 10:30am: Application Software examples (Dr. Rooh Khurram and Dr. Zhiyong Zhu)
  • 11:00am: Visualization Tools (Dr. Madhu Srinivasan, Ms. Dina Garatly)
  • 11:45am: Tips & Best Practices (Dr. Bilel Hadri)

What is HPC and Why HPC?

  • https://www.youtube.com/watch?v=TGSRvV9u32M

HPC101: Shaheen II Architecture Overview

Saber Feki
Computational Scientist Lead
KAUST Supercomputing Core Lab

Shaheen II Overview

COMPUTE

  • Node: Intel Haswell processors; 2 CPU sockets per node, 16 processor cores per CPU, 2.3 GHz; 6174 nodes, 197,568 cores; 128 GB of memory per node; over 790 TB total memory.
  • Power: up to 2.8 MW, water cooled.
  • Weight/Size: more than 100 metric tons; 36 XC40 compute cabinets, plus disk, blowers, management, etc.
  • Speed: 7.2 Pflop/s peak theoretical performance; 5.5 Pflop/s sustained LINPACK.
  • Network: Cray Aries interconnect with Dragonfly topology; 57% of the maximum global bandwidth between the 18 groups of two cabinets.

STORE

  • Storage: Sonexion 2000 Lustre appliance; 17.6 petabytes of usable storage; over 500 GB/s bandwidth.
  • Burst Buffer: DataWarp Solid State Devices (SSD) fast data cache; over 1 TB/s bandwidth (delivery September 2015).
  • Archiving: Tiered Adaptive Storage (TAS); hierarchical storage with 200 TB disk cache and 20 PB of tape storage, using a Spectra Logic tape library (can expand up to 100 PB).

ANALYZE

  • Analyzing: Urika-GD; 2 TB of global shared memory, 64 Threadstorm4 processors with 128 hardware threads per processor; over 75 TB of Lustre PFS.
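
The headline compute figures in this overview can be cross-checked with a little arithmetic. The sketch below is illustrative only: the value of 16 double-precision flops per cycle per core is an assumption that does not appear on the slide (two AVX2 FMA units, each working on 4 doubles, counting 2 flops per fused multiply-add), and the variable names are made up for this example.

```python
# Cross-check of the Shaheen II compute figures quoted on the overview slide.
NODES = 6174
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 16
CLOCK_GHZ = 2.3
MEM_PER_NODE_GB = 128
CABINETS = 36
LINPACK_PFLOPS = 5.5

# Assumption (not on the slide): 16 double-precision flops per cycle per core,
# i.e. 2 FMA units x 4 doubles per AVX2 register x 2 flops per FMA.
FLOPS_PER_CYCLE = 16

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_mem_tb = NODES * MEM_PER_NODE_GB / 1000                  # GB -> TB
peak_pflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1e6  # Gflop/s -> Pflop/s
linpack_efficiency = LINPACK_PFLOPS / peak_pflops
dragonfly_groups = CABINETS // 2                               # groups of two cabinets

print(f"cores:              {total_cores}")
print(f"total memory:       {total_mem_tb:.1f} TB")
print(f"peak performance:   {peak_pflops:.2f} Pflop/s")
print(f"LINPACK efficiency: {linpack_efficiency:.0%}")
print(f"dragonfly groups:   {dragonfly_groups}")
```

Under that assumption the peak works out to about 7.27 Pflop/s, in line with the quoted 7.2 Pflop/s, and the sustained LINPACK result is roughly three quarters of peak.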

Intel Haswell CPU

AVX 2 and FMA in Haswell

Intel Haswell CPU

XC40 Compute Blades

High Speed Network (HSN)

High Speed Network (HSN)

High Speed Network (HSN)

XC40 Routing

High Speed Network (HSN)

Networking

Shaheen II Sonexion

  • Cray Sonexion 2000 Storage System consisting of 12 cabinets containing a total of 5988 4 TB SAS disk drives.
  • The cabinets are interconnected by an FDR InfiniBand fabric.
  • Each cabinet can contain up to 6 Scalable Storage Units (SSU); Shaheen II has a total of 72 SSUs.
  • As there are 2 OSS/OSTs for each SSU, there are 144 OSTs in total.
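
The storage counts on this slide follow from simple multiplication. A minimal sketch with illustrative variable names; the raw capacity is derived here and is not stated in the slides. It comes out higher than the 17.6 PB of usable storage quoted in the overview, presumably because of redundancy and filesystem overhead.

```python
# Cross-check of the Sonexion 2000 counts quoted on the slide.
CABINETS = 12
SSUS_PER_CABINET = 6   # each cabinet holds up to 6 Scalable Storage Units
OSTS_PER_SSU = 2       # 2 OSS/OSTs per SSU
DRIVES = 5988
DRIVE_TB = 4

total_ssus = CABINETS * SSUS_PER_CABINET
total_osts = total_ssus * OSTS_PER_SSU
raw_capacity_pb = DRIVES * DRIVE_TB / 1000  # TB -> PB, before any redundancy

print(f"SSUs:         {total_ssus}")
print(f"OSTs:         {total_osts}")
print(f"raw capacity: {raw_capacity_pb:.2f} PB")
```

The OST count matters in practice because it is the upper limit on how many storage targets a single Lustre file can be striped across.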

Questions?