Big Data Science in the Cloud from Big Data World Conference 2013

„Big Data Science in the Cloud“

Markus Schmidberger

Big Data Analyst & Cloud Engineer

@cloudHPC

markus@mongosoup.de

Big Data gets Political

● New coalition agreement in Germany:

– “We want to further develop the information and communication technology strategy (ICT strategy) for the digital economy. ...

– ... We will focus research and innovation funding for ‘Big Data’ on the development of methods and tools for data analysis ... ”

3. December 2013 - 3

Continuous Software Delivery

“We change the rules!”

Curious, playful, agile, experienced, goal-oriented, attentive to detail, thinking differently ...

Big data & polyglot persistence

Lean & agile

Customers and Partners

Big Data

Big Data Science

● Data science seeks to use all available and relevant data to effectively tell a story that can be easily understood by non-practitioners.

Cloud Computing

● Wikipedia: “... describes a variety of computing concepts that involve a large number of computers connected through a real-time communication network such as the Internet. ...”

1) Put Apps & Data in the Best Place

AWS Zones at the right Place

Example: R and RStudio Server

● R: open-source statistical software
– www.r-project.org

● RStudio IDE
– www.rstudio.org
– IDE + web / server version

2) Choose Cloud Resources carefully

● Instance type
● EBS optimized
● EBS provisioned IOPS
● Load Balancer
● Availability Zones

http://media.amazonwebservices.com/AWS_NoSQL_MongoDB.pdf
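The resource choices listed above can be made explicit in code. The following is a minimal sketch, not taken from the talk, that assembles keyword arguments in the shape expected by the `run_instances` call of the boto3 EC2 client. The function name `build_launch_request`, the AMI ID, the instance type, the zone, the device name, and the volume figures are all illustrative assumptions. The load balancer is not covered here.

```python
def build_launch_request(ami_id, instance_type, zone, volume_gb, iops):
    """Return keyword arguments shaped for boto3's EC2 `run_instances` call.

    Hypothetical helper for illustration; all values passed in are placeholders.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,            # instance type
        "MinCount": 1,
        "MaxCount": 1,
        "EbsOptimized": True,                     # EBS optimized
        "Placement": {"AvailabilityZone": zone},  # availability zone
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/sdf",         # assumed device name
                "Ebs": {
                    "VolumeSize": volume_gb,      # size in GiB
                    "VolumeType": "io1",          # provisioned IOPS volume
                    "Iops": iops,
                    "DeleteOnTermination": False,
                },
            }
        ],
    }


# Placeholder values; substitute a real AMI ID before sending the request.
request = build_launch_request("ami-00000000", "m3.xlarge", "eu-west-1a", 200, 2000)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2", region_name="eu-west-1").run_instances(**request)
```

Keeping the request as plain data makes it easy to review instance type, storage, and zone settings before anything is launched and billed.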

● MongoDB hosting on Amazon EC2 (eu-west-1) and in Munich
● 24x7 monitoring and support
● Dedicated instances and shared hosting available
● Replica Sets and Sharding available
● SSL-enabled MongoDB

MongoSoup is the first German-based MongoDB cloud hosting solution!

Supported by a team of experts from comSysto, the first German partner of MongoDB Inc. You can have a running MongoDB database in virtually no time.
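As an illustration of what connecting to an SSL-enabled replica set involves, the sketch below builds a MongoDB connection string using only the Python standard library. The host names, credentials, database name, and replica set name are hypothetical placeholders, and `mongodb_uri` is an illustrative helper, not part of any MongoSoup or MongoDB API.

```python
from urllib.parse import quote_plus, urlencode


def mongodb_uri(user, password, hosts, database, replica_set):
    """Build a MongoDB connection string for an SSL-enabled replica set."""
    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    host_list = ",".join(hosts)  # list every replica set member
    options = urlencode({"replicaSet": replica_set, "ssl": "true"})
    return f"mongodb://{credentials}@{host_list}/{database}?{options}"


# Placeholder values for illustration only.
uri = mongodb_uri(
    "analyst",
    "p@ss/word",
    ["db1.example.com:27017", "db2.example.com:27017"],
    "analytics",
    "rs0",
)
print(uri)
```

A driver such as PyMongo accepts a string of this form in its `MongoClient` constructor.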

Performance <-> Costs

● scale up & out
● scale down ?
● monitor your resources from the beginning
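One way to connect monitoring to scaling is a simple threshold rule on observed utilization. The sketch below is an assumption-laden illustration, not a method from the talk: the function name `scaling_decision` and the thresholds of 75% and 25% are hypothetical, and real deployments would usually look at more than average CPU.

```python
from statistics import mean


def scaling_decision(cpu_samples, high=0.75, low=0.25):
    """Suggest a scaling action from CPU utilization samples in the range 0..1.

    The thresholds are illustrative defaults, not recommendations.
    """
    average = mean(cpu_samples)
    if average > high:
        return "scale out"   # sustained load: add capacity
    if average < low:
        return "scale down"  # mostly idle: reduce capacity and cost
    return "hold"


print(scaling_decision([0.90, 0.80, 0.85]))
print(scaling_decision([0.10, 0.20, 0.15]))
print(scaling_decision([0.50, 0.40]))
```

Collecting the samples from the first day of operation is what makes the "scale down" branch usable at all: without a history there is no evidence that capacity can be safely removed.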

3) Use full Cloud Technology Stack

Example: AWS EMR with MapR

● Speed
● Compression
– reduces disk and network I/O and increases performance

● Snapshots
– data protection
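The compression point can be illustrated with a small, self-contained experiment. The sketch below uses Python's standard `gzip` module as a stand-in; it does not reproduce MapR's own compression, and the log-style record is made-up sample data. Highly repetitive data like this compresses far better than typical real-world data.

```python
import gzip

# Made-up, highly repetitive sample record standing in for log data.
record = b"2013-12-03,eu-west-1,page_view,user=42\n"
raw = record * 10_000

compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)

# Compression is lossless: decompressing restores the original bytes.
restored = gzip.decompress(compressed)

print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.0f}x")
```

Fewer bytes written to disk or sent over the network means less I/O per job, at the price of some CPU time for compressing and decompressing.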

4) Data Protection

● talk to the experts (e.g. Bitkom)

● use available mechanisms & services
– EMR in VPC
– Mongosoup.de

● be aware of the topic

More Big Data Events

● “Map-Reducing Everywhere”
– https://hadoopsummit.uservoice.com

● Forum “Big Data und Verantwortung” (Big Data and Responsibility), with Frank Schirrmacher among others
– Tue, 3 Dec, 19:00; Große Aula, LMU

„Big Data Science in the Cloud“

- Yes We Can -

@cloudHPC

markus@mongosoup.de

http://comsysto.com/events
