33
2012 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN SQL Server 2012 Parallel Data Warehouse Meinrad Weiss Principal Consultant Trivadis AG

SQL Server 2012 Parallel Data Warehouse

  • Upload
    loman

  • View
    43

  • Download
    0

Embed Size (px)

DESCRIPTION

SQL Server 2012 Parallel Data Warehouse . Meinrad Weiss Principal Consultant Trivadis AG. Trivadis solution portfolio and competences. CUSTOMER. IT SOLUTIONS, SERVICES & PRODUCTS. Business departments. Business Integration Services. Business Intelligence. Infrastructure Engineering. - PowerPoint PPT Presentation

Citation preview

Page 1: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

SQL Server 2012 Parallel Data Warehouse

Meinrad WeissPrincipal ConsultantTrivadis AG

Page 2: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Trivadis solution portfolio and competences

Mai 2013SQL Server 2012 Parallel Data Warehouse

2

IT SOLUTIONS, SERVICES & PRODUCTS

TECHNOLOGIESOracle, Microsoft, IBM, Open Source

Integration, Application Performance Management, Security

InfrastructureEngineering

Application Development

ITdepartment

s

Businessdepartment

s

CUSTOMER

Business Intelligence

Business Integration Services

Training

Managed Services

Page 3: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Trivadis facts & figures

dateTrivadis – the company

3

11 Trivadis locations with more than 600 employees

Financially independent and sustainably profitable

Key figures 2011

Revenue CHF 104 / EUR 84 Mio.

Services for more than 800 clients in over 1,900 projects

200 Service Level Agreements

More than 4,000 training participants

Research and development budget: CHF 5.0 / EUR 4 Mio.

Hamburg

Dusseldorf

Frankfurt

Stuttgart

MunichFreiburg

Vienna

Basel

Bern

Zurich

Lausanne~380 employees

~200 employees

~30 employees

Mai 2013

3SQL Server 2012 Parallel Data Warehouse

Page 4: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Visit us, win a price and be prepared for your next adventure

SQL Server 2012 Parallel Data Warehouse Mai 2013

4

Page 5: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Agenda

1. Positioning Parallel Data Warehouse (PDW)2. Architecture3. Working with PDW4. PDW and Big Data

SQL Server 2012 Parallel Data Warehouse Mai 2013

5

Page 6: SQL Server 2012 Parallel Data Warehouse

2012 © TrivadisSQL Server 2012 Parallel Data Warehouse Mai 2013

6

Page 7: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Data Warehouse – Products Positioning

1

2

Balanced solution for mostly scan-centric workloads.Max HW tune-up for most DW scenarios.

3 Most flexible architecture for handling all DW scenarios.

ScaleComplexityHA by defaultSW-HW integration

SQL Server 2012Fast Track

SQL Server 2012

SQL Server 2012PDW

1

2

3

PDW with Distributed

Data Architecture

SQL Server 2012 Parallel Data Warehouse Mai 2013

7

Page 8: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Built For Big DataNext-generation Performance At Scale

Scale-Out Architecture on Industry Standard

Hardware

SQL Server 2012 Parallel Data Warehouse

SQL Server 2012 Parallel Data Warehouse Mai 2013

8

Page 9: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Agenda

1. Positioning Parallel Data Warehouse (PDW)2. Architecture3. Working with PDW4. PDW and Big Data

SQL Server 2012 Parallel Data Warehouse Mai 2013

9

Page 10: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Parallel Data Warehouse at a glance

Shared-nothing parallel database system Massively parallel processing (MPP) A “Control” server that accepts user queries,

generates a plan, and distributes operations in parallel to compute nodes

Multiple “Compute” servers running SQL ServerDelivered as an appliance Balanced and pre-configured hardware and software Scales from 2 to 56 Nodes Fastest Time to Market

SQL Server 2012 Parallel Data Warehouse Mai 2013

10

Page 11: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Base Unit (1/4 Rack)

All hosts and VM’s run Windows Server 2012 Standard All Fabric and workload

activity happens in Hyper-V virtual machines, with Fabric VM’s sharing 1 server

Failover is handled by Hyper-V

PDW Agent runs on all hosts and all VMs, collects appliance health data on fabric and workload

Windows Storage Spaces handles mirroring and spares

Host 1

Host 0

Host 2

Host 3

Storage Spaces

IB &Ethernet Direct attached

SAS

Control FAB AD

Compute 1

Compute 2

SQL Server 2012 Parallel Data Warehouse Mai 2013

11

VMM

Page 12: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Current Limitations: Performance and Scale Up today

Scale Up (and pay)

Old server will be obsolete if bigger system is required

$$$ $$$ $$$

Mai 2013

12SQL Server 2012 Parallel Data Warehouse

? ?

Page 13: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

SCALING FROM 2 TO 56 NODES

Appliance can grow with increasing workload

Provisioning consists of 3 phases:

Bare metal provisioning of new nodes

Provisioning of workload VMs and ‘hooking up’ to other workload VMs

Redistribution of data

Host 1

Host 0

Host 2

Host 3

Storage Spaces

IB &Ethernet Direct attached

SAS

Control FAB AD

Compute 1

Compute 2

Host 2

Host 3

Storage Spaces

Compute 1

Compute 2

SQL Server 2012 Parallel Data Warehouse Mai 2013

13

VMM

Page 14: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis4 8 12 16 20 24 28 32 36 40SQL Server Compute Nodes

Sys

tem

Thr

ough

put

Regular SQL Server

( 1 Node)

Seamless Scalability

5 Nodes

10 Nodes

20 Nodes

30 Nodes

30 Nodes

Mai 2013

14SQL Server 2012 Parallel Data Warehouse

Page 15: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

High Availability

Storage Spaces manages the physical disks in the disk enclosures

Failover: One cluster across the whole

appliance VMs are automatically

migrated on host failure Affinity and anti-affinity maps

enforce rules

Host 1

Host 0

Host 2

Host 3

Storage Spaces

IB &Ethernet Direct attached

SAS

Control FAB AD

Compute 1

Compute 2

SQL Server 2012 Parallel Data Warehouse Mai 2013

15

VMM

Page 16: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Agenda

1. Positioning Parallel Data Warehouse (PDW)2. Architecture3. Working with PDW4. Introduction to Big Data5. Microsoft and Big Data

SQL Server 2012 Parallel Data Warehouse Mai 2013

16

Page 17: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Distribution and Replication of Data: Replicate

Time DimDate Dim IDCalendar YearCalendar QtrCalendar MoCalendar Day

Store DimStore Dim ID

Store NameStore MgrStore Size

Product DimProd Dim ID

Prod CategoryProd Sub CatProd Desc

MktgCampaign DimMktg Camp IDCamp NameCamp MgrCamp StartCamp End

SQL

SQL

SQL

SQL

TD

PD

SD

MD

TD

PD

SD

MD

TD

PD

SD

MD

Smaller (<5GB ) Dimension Tables are Replicated on Every

Compute NodeTD

PD

SD

MD

Sales Facts

Date Dim IDStore Dim IDProd Dim IDMktg Camp IdQty SoldDollars Sold

Result: Fact -Dimension Joins can be performed

locally

SQL Server 2012 Parallel Data Warehouse Mai 2013

17

Page 18: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Create Replicated TableCREATE TABLE myTable   (    id int NOT NULL,    lastName varchar(20),    zipCode varchar(6)  )WITH (CLUSTERED COLUMNSTORE INDEX);

Creates tables on each of the individual compute nodes and assigns them to the REPLICATED file group.

Data Compression is automatically turned on

SQL Server 2012 Parallel Data Warehouse Mai 2013

18

Page 19: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

TD

PD

SD

MD

TD

PD

SD

MD

TD

PD

SD

MD

TD

PD

SD

MD

Distribution and Replication of Data: Distribute

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1

SF-1SQL

SQL

SQL

SQL

Larger (> 10 GB) Fact Table is Hash

Distributed Across All Compute Nodes

SF-1

SF-2

SF-3

SF-4

Time DimDate Dim IDCalendar YearCalendar QtrCalendar MoCalendar Day

Store DimStore Dim ID

Store NameStore MgrStore Size

Product DimProd Dim ID

Prod CategoryProd Sub CatProd Desc

MktgCampaign DimMktg Camp IDCamp NameCamp MgrCamp StartCamp End

Sales Facts

Date Dim IDStore Dim IDProd Dim IDMktg Camp IdQty SoldDollars Sold

SQL Server 2012 Parallel Data Warehouse Mai 2013

19

Page 20: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Distribution on a PDW

PDW Node 1Create Table <myTable GUID>_aCreate Table <myTable GUID>_b…Create Table <myTable GUID>_h

8 Tables per Node

PDW Node 2Create Table <myTable GUID>_aCreate Table <myTable GUID>_b…Create Table <myTable GUID>_h

PDW Node 8Create Table <myTable GUID>_aCreate Table <myTable GUID>_b…Create Table <myTable GUID>_h

PDW Node …

Final Result:64 individual tables across a 8 node (1 data rack HP) appliance

CREATE TABLE myTable (column Defs)WITH (DISTRIBUTION = HASH (id));

Mai 2013

20SQL Server 2012 Parallel Data Warehouse

Page 21: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

xVelocity gives next-gen performanceLightning Fast Data Query Processing

Custom

er

Sales

Country

Supplier

Products

Columnstore Provides Dramatic Performance• Updateable and clustered xVelocity

columnstore• Stores data in columnar format• Memory-optimized for next-generation

performance• Updateable to support bulk and/or trickle

loadingSave Timeand Costs

Real-TimeDW

Up to50X Faster

Up to 15x compression

Mai 2013

21SQL Server 2012 Parallel Data Warehouse

Page 22: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Web-Based Management Dashboard

SQL Server 2012 Parallel Data Warehouse Mai 2013

22

Page 23: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Web-Based Management Dashboard (2)

SQL Server 2012 Parallel Data Warehouse Mai 2013

23

Page 24: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Web-Based Management Dashboard (3)

SQL Server 2012 Parallel Data Warehouse Mai 2013

24

Page 25: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

PDW Querying 1 Petabyte of data in 1 second

Mai 2013

25SQL Server 2012 Parallel Data Warehouse

294‘000‘000‘000 rows

http://www.sqlpass.org/summit/2012/DayOneKeynote.aspx

Page 26: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Reference Case: Today’s process flow / Building blocks

DB_GSAPOP

DB_MasterTables

DB_ReportTables

FinanceCube

Baseline : Once data extracted from SAP:Time taken to create end-end Reports and Cubes insights 13+ hours (In production typical 20+ hours with multiple companies)

DW_FinanceTransactions

MasterFinance table population

6 hours 21min

6 hours 1 hour

Suspicious words

Reports

3hr21min

SAP

Mai 2013

26SQL Server 2012 Parallel Data Warehouse

Page 27: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Reference Case: Audit Process with PDW

DB_GSAPOP

DB_MasterTables

DB_ReportTables

FinanceCube

Once data is extracted from SAP:Creating 5 CM Reports & FSCP Finance Cube;Time taken: 30 Minutes

DW_FinanceTransactions

MasterFinance table population 8m50sec

load from FlatFile23min

10m10sec 11 min

All 5 Reportswithin 6min

(80) (80) (80)

(80)

SAP

Mai 2013

27SQL Server 2012 Parallel Data Warehouse

Page 28: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Agenda

1. Positioning Parallel Data Warehouse (PDW)2. Architecture3. Working with PDW4. PDW and Big Data

SQL Server 2012 Parallel Data Warehouse Mai 2013

28

Page 29: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Introducing PolyBaseSeamless integration of two worlds

Single Query; Structured and Unstructured• Query and join Hadoop tables

with Relational Tables• Use Standard SQL language • Select, From Where

Existing SQLSkillset

Save Timeand CostsDatabase HDFS

(Hadoop)

SQL Server 2012 PDW Powered by PolyBase

SQL

Analyze AllData Types

Mai 2013

29SQL Server 2012 Parallel Data Warehouse

Page 30: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Additional Resources

SQL Server Parallel Data Warehouse (PDW) Landing Page:

www.microsoft.com/PDWIntroduction to Polybase:

http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/polybase.aspx

Price/TB comparison:http://

www.valueprism.com/resources/resources/Resources/PDW%20Compete%20Pricing%20FINAL.pdf

HP QuickSpecshttp://

h18000.www1.hp.com/products/quickspecs/13830_div/13830_div.html

http://h18000.www1.hp.com/products/quickspecs/13830_div/13830_div.pdf

Mai 2013

30SQL Server 2012 Parallel Data Warehouse

Page 31: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Conclusion

SQL Server 2012 Parallel Data Warehouse is an MPP based appliance optimized for data warehouse workload Very Similar to a regular SQL Server,

but T-SQL not 100 % identical Support for development via SQL Server Data Tools

Scalable, balanced platform Not just storage (data + CPU power)

Central part of Microsoft BI stack Well integrated in Microsoft management tools

SQL Server 2012 Parallel Data Warehouse Mai 2013

31

Page 32: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

Visit us, win a price and be prepared for your next adventure

SQL Server 2012 Parallel Data Warehouse Mai 2013

32

Page 33: SQL Server 2012 Parallel Data Warehouse

2012 © Trivadis

BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

Let‘s go.Wettbewerb