Olivia Klose | Technical Evangelist, Microsoft | @oliviaklose | blogs.technet.com/oliviaklose

Olivia Klose | Technical Evangelist, Microsoft ...download.microsoft.com/.../1_Intro.pdf · 1 Intro & Big Data Buzzwords - Big Data, Hadoop, MapReduce, HDInsight 2 Big Data Szenario:

Page 1

Olivia Klose | Technical Evangelist, Microsoft

@oliviaklose

blogs.technet.com/oliviaklose

Page 2

Meet Olivia | @oliviaklose

• Microsoft Technical Evangelist
  – Focus: Big Data, Hadoop, Hive, etc.

• Machine Learning
  – Computer science with mathematics at the University of Cambridge, TU München and IIT Bombay
  – Medical imaging
  – Nuclear medicine clinic in Munich

• IT experience in large enterprises

Page 3

Agenda

Module  Content

1  Intro & Big Data Buzzwords
   - Big Data, Hadoop, MapReduce, HDInsight

2  Big Data Scenario: Twitter Analysis

3  Manage: Extracting and storing data
   - Windows Azure Blob Storage, Windows Azure SQL Database, VM

4  Analyze: Analyzing data
   - HDInsight, Hive

5  Insights: Gaining insights from data
   - ODBC driver, PowerPivot & PowerView

Page 4

Module 1

Intro & Big Data Buzzwords

• Big Data

• Hadoop

• MapReduce

• HDInsight

Page 5

What is Big Data?

Module 1 – Intro & Big Data Buzzwords

Page 6

Page 7

The Large Hadron Collider

(particle accelerator at CERN)

produces 15 PB/year

http://home.web.cern.ch/about/computing
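
To put the 15 PB/year figure in perspective, it can be converted into an average sustained data rate. The following is only a back-of-the-envelope sketch: it assumes decimal petabytes (10^15 bytes) and a 365-day year, and the function name `average_rate_mb_per_s` is illustrative, not taken from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s, ignoring leap years
BYTES_PER_PB = 10**15                  # assumption: decimal petabyte
BYTES_PER_MB = 10**6                   # assumption: decimal megabyte


def average_rate_mb_per_s(petabytes_per_year):
    """Convert a yearly data volume in PB into an average rate in MB/s."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second / BYTES_PER_MB


if __name__ == "__main__":
    # 15 PB/year is the figure quoted on the slide.
    print(round(average_rate_mb_per_s(15)))  # prints 476
```

Averaged over a full year, 15 PB corresponds to roughly 476 MB every second, around the clock.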

Page 8

But what if I don't own a

Large Hadron Collider…

Page 9

Perhaps data from…

• Large factory
• Vehicle fleet
• Smart Grids
• Green electricity
• Stock exchange
• Host Protocols
• Data centers
• Server farm
• Twitter
• Facebook
• Google Analytics

Page 10

“Big data is a term describing the storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.”

http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/

arxiv.org/abs/1309.5821

Page 11

“Big data is high-volume, high-velocity and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.”

Gartner's Definition of Big Data

Laney, Douglas. The Importance of “Big Data”: A Definition. Gartner. Retrieved 21 June 2012.

Page 12

The 3 Vs

• Volume: MB, GB, TB, PB
• Velocity: batch, periodic, real time
• Variety: table, database, unstructured, web

Big Data, Gesellschaft für Informatik, 2013, http://www.gi.de/service/informatiklexikon/detailansicht/article/big-data.html

Page 13

In my own words…

Big Data comprises large, unstructured volumes of data from diverse data sources that are generated and analyzed in a very short time.

Page 14

What is Hadoop?

Module 1 – Intro & Big Data Buzzwords

Page 16

History

2002 2004 2006

Nutch

Doug Cutting | New York Times, 16 March 2009,

http://www.nytimes.com/imagepages/2009/03/16/business/17cloud.2.inline.ready.html

Page 17

History

2002 2004 2006

Nutch

GFS → NDFS

Doug Cutting | New York Times, 16 March 2009,

http://www.nytimes.com/imagepages/2009/03/16/business/17cloud.2.inline.ready.html

Page 18

History

2002 2004 2006

Nutch

GFS → NDFS

MapReduce → Nutch MapReduce

Doug Cutting | New York Times, 16 March 2009,

http://www.nytimes.com/imagepages/2009/03/16/business/17cloud.2.inline.ready.html

Page 19

History

2002 2004 2006

Nutch

GFS → NDFS

MapReduce → Nutch MapReduce

Hadoop

Doug Cutting | New York Times, 16 March 2009,

http://www.nytimes.com/imagepages/2009/03/16/business/17cloud.2.inline.ready.html

Page 20

Hadoop Components

Page 21

MapReduce
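
The slide names MapReduce without spelling out the programming model. As a rough illustration, the sketch below runs the three conceptual phases (map, shuffle, reduce) as a word count inside a single Python process. It is not Hadoop's API: the function names and the sample input are hypothetical, and a real Hadoop job would distribute these phases across the nodes of a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the list of values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical sample input standing in for a large distributed data set.
    tweets = ["big data big insights", "hadoop stores big data"]
    print(word_count(tweets))
```

Because each call to `map_phase` only looks at one record and each reduction only looks at one key, both phases can in principle run in parallel on different machines, which is the property that distributed MapReduce frameworks exploit.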

Page 22

What is HDInsight?

Module 1 – Intro & Big Data Buzzwords

Page 23

HDInsight

Page 24

HDInsight / Hadoop architecture

Legend

• Red = Core Hadoop
• Blue = Data processing
• Green = Packages
• Dark blue = Microsoft integration points and value adds
• Orange = Data movement

Distributed Storage (HDFS)

Distributed Processing (MapReduce)

Page 25

Agenda

Module  Content

1  Intro & Big Data Buzzwords
   - Big Data, Hadoop, MapReduce, HDInsight

2  Big Data Scenario: Twitter Analysis

3  Manage: Extracting and storing data
   - Windows Azure Blob Storage, Windows Azure SQL Database, VM

4  Analyze: Analyzing data
   - HDInsight, Hive

5  Insights: Gaining insights from data
   - ODBC driver, PowerPivot & PowerView

Page 26

©2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.