Ceph Object Store


Ceph Object Store, or: How to Store Terabytes of Documents

Daniel Schneller daniel.schneller@centerdevice.de @dschneller

Who are we?

@dschneller @drivebytesting

What do we do?

Where did we come from?

Why did we want to get away from that?

Where did we want to go?

And the way there?

Ceph Basics

“Unified, distributed storage system designed for excellent performance, reliability and scalability”

Highly scalable

Commodity hardware

No single point of failure

Ceph Components

OSD Daemons: Object Storage Device daemons

CRUSH Algorithm: intelligent object placement without central metadata

RADOS: Reliable Autonomous Distributed Object Store

Objects

Data Pools: collections of objects with the same requirements

Placement Groups

Monitors: the clients' first point of contact
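
These building blocks can be explored directly from the command line. A minimal sketch, assuming a running cluster; the pool name "documents" and the placement group count are made-up examples (a common rule of thumb is on the order of 100 PGs per OSD, divided by the replication factor):

> ceph -s                                         # cluster health: monitors, OSDs, placement groups
> ceph osd tree                                   # the CRUSH hierarchy of hosts and OSDs
> ceph osd pool create documents 1024 1024        # new pool with 1024 placement groups
> rados -p documents put doc-1 example.pdf        # store an object in the pool
> rados -p documents get doc-1 /tmp/example.pdf   # and read it back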

Hardware Setup

Diagram: the virtualization stack, from Bare Metal Hardware through Storage Virtualization, Compute Virtualization and Network Virtualization up to the Virtual Infrastructure and the Application.

Baseline Benchmarks: defining expectations

Storage: disk I/O per node
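
Before trusting the cluster, it helps to know what a single disk can deliver. A minimal sketch (paths and sizes are illustrative assumptions, not the values from the talk):

> dd if=/dev/zero of=/var/lib/ceph/osd/testfile bs=1M count=4096 oflag=direct   # sequential write throughput
> fio --name=randwrite --filename=/var/lib/ceph/osd/testfile --rw=randwrite \
      --bs=4k --size=4G --direct=1 --iodepth=32 --ioengine=libaio               # random 4k IOPS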

Network

IEEE 802.3ad != IEEE 802.3ad

> cat /etc/network/interfaces
...
auto bond2
iface bond2 inet manual
    bond-slaves p2p3 p2p4                   # interfaces to bond
    bond-mode 802.3ad                       # activate LACP
    bond-miimon 100                         # monitor link health
    bond-xmit_hash_policy layer3+4          # use Layer 3+4 for link selection
    pre-up ip link set dev bond2 mtu 9000   # set Jumbo Frames

auto vlan-ceph-clust
iface vlan-ceph-clust inet static
    pre-up ip link add link bond2 name vlan-ceph-clust type vlan id 105
    pre-up ip link set dev vlan-ceph-clust mtu 9000   # Jumbo Frames
    post-down ip link delete vlan-ceph-clust
    address ...
    netmask ...
    network ...
    broadcast ...
...
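
A quick way to confirm that LACP and the jumbo frames are actually in effect (a sketch; node02.ceph-cluster is an assumed peer hostname, not taken from the slides):

> cat /proc/net/bonding/bond2              # negotiated 802.3ad state, hash policy and slave links
> ping -M do -s 8972 node02.ceph-cluster   # 9000-byte MTU minus 28 bytes IP/ICMP header, do not fragment

If a ping of that size does not go through, jumbo frames are not enabled end to end.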

IEEE 802.3ad != IEEE 802.3ad

[node01] > iperf -s -B node01.ceph-cluster
[node02] > iperf -c node01.ceph-cluster -P 2
[node03] > iperf -c node01.ceph-cluster -P 2

------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address node01.ceph-cluster
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.102.5.11 port 5001 connected with 10.102.5.12 port 49412
[  5] local 10.102.5.11 port 5001 connected with 10.102.5.12 port 49413
[  6] local 10.102.5.11 port 5001 connected with 10.102.5.13 port 59947
[  7] local 10.102.5.11 port 5001 connected with 10.102.5.13 port 59946
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   342 MBytes   286 Mbits/sec
[  5]  0.0-10.0 sec   271 MBytes   227 Mbits/sec
[SUM]  0.0-10.0 sec   613 MBytes   513 Mbits/sec
[  6]  0.0-10.0 sec   293 MBytes   246 Mbits/sec
[  7]  0.0-10.0 sec   338 MBytes   283 Mbits/sec
[SUM]  0.0-10.0 sec   631 MBytes   529 Mbits/sec

IEEE 802.3ad != IEEE 802.3ad

Same setup, same measurement: the two clients together reach roughly the throughput of a single gigabit link instead of the expected two. ???
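
One way to make sense of such numbers is to check how the transmit hash actually spreads the flows across the slave links (a sketch, using the interface names from the bond configuration above):

> cat /proc/net/bonding/bond2   # hash policy and 802.3ad state as seen by the kernel
> ip -s link show p2p3          # per-slave byte counters, read before and after an iperf run
> ip -s link show p2p4          # if only one counter grows, all flows were hashed onto one link

The switch side applies its own hash policy; if it hashes only on MAC or IP addresses, traffic between two hosts can end up on a single link regardless of what the Linux side is configured to do.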

Measure! … and understand the results

CenterDevice

Overall Architecture

Diagram, built up over four slides: Ceph runs on four bare-metal nodes (Node 1 … Node 4) with 48 OSDs in total (OSD 1 … OSD 48) and a Rados Gateway on each node. On top of this sit several VMs (VM 1, VM …), each running HAProxy, CenterDevice and Swift.
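
The HAProxy instances in the VMs front the Rados Gateways on the storage nodes. A minimal sketch of what such a configuration could look like (hostnames, the port 7480 and the health check are assumptions, not the original setup):

frontend object_store
    mode http
    bind *:80
    default_backend radosgw

backend radosgw
    mode http
    balance roundrobin
    option httpchk GET /
    server node01 node01.ceph-cluster:7480 check
    server node02 node02.ceph-cluster:7480 check
    server node03 node03.ceph-cluster:7480 check
    server node04 node04.ceph-cluster:7480 check

With something like this in place, the application talks to a single local endpoint while any individual gateway may fail without interrupting uploads or downloads.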

Advantages

Disadvantages

Caveats

CephFS: not recommended for production data.

Scrubbing: integrity comes at a price. But there are ways to deal with it!
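
Scrubbing does not have to be switched off; it can be confined and throttled. A sketch of the available knobs (values are illustrative assumptions, and option availability depends on the Ceph release):

> ceph osd set noscrub       # temporarily suppress new scrubs, e.g. during peak hours
> ceph osd set nodeep-scrub
> ceph osd unset noscrub     # re-enable them afterwards
> ceph osd unset nodeep-scrub

# ceph.conf, [osd] section: confine scrubbing to a quiet window and limit its impact
osd scrub begin hour = 1
osd scrub end hour = 6
osd max scrubs = 1
osd scrub load threshold = 0.5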

Outlook

Rados Gateway

Ceph Caching Tier

SSD-based Journaling

10GBit/s Networking

In Closing

Slides on SlideShare: http://www.slideshare.net/dschneller

Handout on CenterDevice: https://public.centerdevice.de/399612bf-ce31-489f-bd58-04e8d030be52

@drivebytesting @dschneller

The End

Daniel Schneller daniel.schneller@centerdevice.de @dschneller
