The Elephant in the Cloud: Putting Hadoop on Any Cloud @natishalom

Putting hadoop on any cloud big data spain


DESCRIPTION

The massive computing and storage resources that are needed to support big data applications make cloud environments an ideal fit. Now more than ever, there is a growing number of choices of cloud infrastructure providers, from Amazon AWS, OpenStack offered by the likes of HP, Rackspace and soon even Dell, VMware vCloud as well a...

Including:
- Effectively managing your Hadoop stack in any data center (on-premise, cloud, hybrid…)
- Maintaining the flexibility to choose the right cloud for the job in an ever-changing environment
- Consistently managing your Hadoop deployment with other elements of your Big Data system such as NoSQL DB, Web Tier etc.


Page 1: Putting hadoop on any cloud  big data spain

The Elephant in the Cloud

Putting Hadoop on Any Cloud

@natishalom

Page 2: Putting hadoop on any cloud  big data spain

Columbus & The Cloud

THE DISCOVERY OF AMERICA

THE THING THAT MADE IT POSSIBLE

Page 3: Putting hadoop on any cloud  big data spain

Why Cloud Portability Matters

Page 4: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #1

No one really needs cloud portability

Page 5: Putting hadoop on any cloud  big data spain

Cloud Portability

Facts

Zynga moved ~80% of their workload from Amazon to their private zCloud

“own the base, rent the spike”

http://code.zynga.com/2012/02/the-evolution-of-zcloud/
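The "own the base, rent the spike" idea can be illustrated with a little arithmetic. The sketch below is only an illustration: the demand figures, the `base_capacity` value and the `split_load` helper are hypothetical and are not taken from Zynga's post.

```python
def split_load(hourly_demand, base_capacity):
    """Split each hour's demand into (owned, rented) server counts."""
    plan = []
    for demand in hourly_demand:
        owned = min(demand, base_capacity)        # served by the private cloud
        rented = max(0, demand - base_capacity)   # overflow bursts to a public cloud
        plan.append((owned, rented))
    return plan


# Hypothetical demand in servers per hour, with a single spike.
demand = [40, 55, 60, 140, 70]
plan = split_load(demand, base_capacity=80)

owned_total = sum(owned for owned, _ in plan)
rented_total = sum(rented for _, rented in plan)
print(plan)
print(f"share served by owned capacity: {owned_total / (owned_total + rented_total):.0%}")
```

Sizing the base is the real design decision: a larger base means less rented capacity during spikes, but more idle hardware the rest of the time.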

Page 6: Putting hadoop on any cloud  big data spain

Cloud Portability

Facts

Mixpanel started with Linode, then moved to Rackspace, then to AWS

http://code.mixpanel.com/2010/11/08/amazon-vs-rackspace/

Page 7: Putting hadoop on any cloud  big data spain

Cloud Portability

Facts

• You want the flexibility to choose what’s right for you, when it’s right for you

• Based on pricing, features, availability, performance, etc.

Page 8: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #2

Cloud Portability == Cloud API Standardization

Page 9: Putting hadoop on any cloud  big data spain

Cloud APIs, Today

Standard APIs (?): OCCI, vCloud

OSS Frameworks: OpenStack, CloudStack, Eucalyptus

Abstraction Frameworks: jclouds, Deltacloud, Fog, Libvirt

Page 10: Putting hadoop on any cloud  big data spain

Cloud APIs, Today

Standard APIs: Not practical in the foreseeable future

OSS Projects: Need a couple more years to converge & mature

Abstraction Frameworks: Probably the only practical (near-term) option
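To make the abstraction-framework option concrete, here is a minimal sketch of the general idea: application code talks to one small interface, and a per-cloud driver translates it. Everything in it is hypothetical (`ComputeDriver`, `ProviderADriver`, `ProviderBDriver`, the flavor names); it does not reproduce the API of jclouds, Deltacloud, Fog or Libvirt.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Node:
    name: str
    provider: str
    flavor: str


class ComputeDriver(Protocol):
    """The only interface the application code depends on."""

    def create_node(self, name: str, size: str) -> Node: ...


class ProviderADriver:
    # Hypothetical mapping from a generic size to provider-specific flavors.
    FLAVORS = {"small": "a.small", "large": "a.xlarge"}

    def create_node(self, name: str, size: str) -> Node:
        return Node(name, "provider-a", self.FLAVORS[size])


class ProviderBDriver:
    FLAVORS = {"small": "b-1cpu", "large": "b-8cpu"}

    def create_node(self, name: str, size: str) -> Node:
        return Node(name, "provider-b", self.FLAVORS[size])


def launch_cluster(driver: ComputeDriver, count: int, size: str) -> list[Node]:
    """Cloud-neutral code: identical no matter which driver is passed in."""
    return [driver.create_node(f"hadoop-{i}", size) for i in range(1, count + 1)]


for driver in (ProviderADriver(), ProviderBDriver()):
    for node in launch_cluster(driver, count=2, size="large"):
        print(node)
```

Switching clouds then means swapping the driver, not rewriting `launch_cluster`.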

Page 11: Putting hadoop on any cloud  big data spain

Realization: What You Really Care about Is App Portability

OS is the same on any cloud

Most clouds have compute & storage

Elasticity & scaling have same effects on the app, regardless of the cloud

Page 12: Putting hadoop on any cloud  big data spain

Cloud Portability Myth #3

All infrastructure clouds were born equal

Page 13: Putting hadoop on any cloud  big data spain

Food for Thought

Offerings can vary quite a bit:

• Amazon guarantees only 99.5% uptime

• Rackspace will give you $$$ every time they crash

• Joyent claims to be significantly faster than both
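To put an uptime guarantee in perspective, it helps to convert the percentage into allowed downtime. The short calculation below does that for the 99.5% figure quoted above; the other two percentages are included only for comparison and are not claims about any provider's SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def max_downtime_hours(uptime_percent):
    """Hours per year a service may be down while still meeting the guarantee."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


# 99.5 comes from the slide; 99.9 and 99.95 are illustrative comparison points.
for sla in (99.5, 99.9, 99.95):
    hours = max_downtime_hours(sla)
    print(f"{sla}% uptime allows up to {hours:.1f} hours of downtime per year")
```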

Page 14: Putting hadoop on any cloud  big data spain

And Some Features Are Unique…

Amazon is the only major vendor to offer SSD storage. Netflix says it’s:

• ½ the price for the same throughput

• ⅕ the latency on avg.

• Even slowest requests are 6x faster

http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html

Page 15: Putting hadoop on any cloud  big data spain

Let’s Talk Big Data on the Cloud

Page 16: Putting hadoop on any cloud  big data spain

A Typical Big Data App…

Page 17: Putting hadoop on any cloud  big data spain

Managing Big Data on the Cloud

• Auto start VMs
• Install and configure app components
• Monitor
• Repair
• (Auto) Scale
• Burst…
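The monitor, repair and scale tasks above can be sketched as a single decision step that a management loop would run periodically. This is a simplified illustration only: the node-state format, the `cpu_high` threshold and the action names are assumptions, not part of any particular tool.

```python
def plan_actions(nodes, desired_count, cpu_high=0.8):
    """Decide what to do next from a snapshot of the cluster.

    nodes maps a node name to {"healthy": bool, "cpu": float in 0..1}.
    Returns a list of (action, argument) tuples.
    """
    actions = []

    # Repair: every unhealthy node gets a repair action.
    healthy = {name: s for name, s in nodes.items() if s["healthy"]}
    for name in nodes:
        if name not in healthy:
            actions.append(("repair", name))

    # (Auto) scale: ask for one extra node when the healthy nodes run hot.
    avg_cpu = sum(s["cpu"] for s in healthy.values()) / len(healthy) if healthy else 0.0
    target = desired_count + 1 if avg_cpu > cpu_high else desired_count

    # Auto start VMs: bring the cluster up to the target size.
    if len(nodes) < target:
        actions.append(("start_vms", target - len(nodes)))
    return actions


# Hypothetical snapshot: two busy nodes and one failed node.
snapshot = {
    "hadoop-1": {"healthy": True, "cpu": 0.90},
    "hadoop-2": {"healthy": True, "cpu": 0.85},
    "hadoop-3": {"healthy": False, "cpu": 0.0},
}
print(plan_actions(snapshot, desired_count=3))
```

Keeping the decision logic a pure function of the snapshot makes it easy to test separately from the code that actually talks to a cloud.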

Page 18: Putting hadoop on any cloud  big data spain

The Challenges ..

Consistent Management

Making deployment, installation, scaling, and fail-over look the same across the entire stack

Page 19: Putting hadoop on any cloud  big data spain

The Challenges (Cont)..

Cloud Portability

Choosing the Right Cloud for the Job

Running bare metal for high-I/O workloads, public cloud for sporadic workloads..
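The placement rule above can be written down as a tiny policy function. This is a sketch under assumptions: the `Workload` fields and the target labels are hypothetical, and the private-cloud default for steady workloads is an added assumption rather than something stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    io_intensive: bool
    sporadic: bool


def choose_target(workload):
    """Pick where a workload should run, following the slide's two rules."""
    if workload.io_intensive:
        return "bare-metal"      # high I/O: avoid virtualization overhead
    if workload.sporadic:
        return "public-cloud"    # sporadic: pay only while it runs
    return "private-cloud"       # assumed default for steady workloads


# Hypothetical workloads for illustration.
jobs = [
    Workload("hdfs-ingest", io_intensive=True, sporadic=False),
    Workload("monthly-report", io_intensive=False, sporadic=True),
    Workload("web-tier", io_intensive=False, sporadic=False),
]
for job in jobs:
    print(f"{job.name}: {choose_target(job)}")
```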

Page 20: Putting hadoop on any cloud  big data spain

Hadoop

• Available under different distributions:
  • Cloudera
  • IBM BigInsights
  • MapR
  • Hortonworks

Page 21: Putting hadoop on any cloud  big data spain

Big Data Apps, on Any Cloud, Your Way

Open source (Apache2)

Page 22: Putting hadoop on any cloud  big data spain

Putting Cloudify and Hadoop Together

• Run on Any Cloud
• Consistent Management
• Dynamic Scaling
• Auto Recovery
• Auto Scaling
• Role Assignments
• Monitoring
• Simple maintenance

Page 23: Putting hadoop on any cloud  big data spain

How it works..

1 Upload your recipe.

2 Cloudify creates VMs & installs agents

3 Agents install and manage your app

4 Cloudify automates the scaling
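The four steps can be mimicked with a toy in-memory model, which may help in seeing how the pieces relate. This is not Cloudify code: `Recipe`, `Agent`, `Orchestrator` and all their fields are hypothetical names, and real Cloudify recipes use their own format, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    service: str
    install_steps: list
    min_instances: int
    max_instances: int


@dataclass
class Agent:
    vm: str
    installed: list = field(default_factory=list)

    def apply(self, recipe):
        # Step 3: the agent installs and manages the app on its VM.
        for step in recipe.install_steps:
            self.installed.append(step)


class Orchestrator:
    def __init__(self):
        self.recipe = None
        self.agents = []

    def upload(self, recipe):
        # Step 1: the recipe is uploaded; start the minimum number of instances.
        self.recipe = recipe
        for _ in range(recipe.min_instances):
            self._add_instance()

    def _add_instance(self):
        # Step 2: create a VM (simulated by a name) and put an agent on it.
        agent = Agent(vm=f"{self.recipe.service}-vm-{len(self.agents) + 1}")
        agent.apply(self.recipe)
        self.agents.append(agent)

    def scale_to(self, wanted):
        # Step 4: scaling is automated, clamped to the recipe's bounds.
        wanted = max(self.recipe.min_instances, min(wanted, self.recipe.max_instances))
        while len(self.agents) < wanted:
            self._add_instance()
        del self.agents[wanted:]
        return len(self.agents)


recipe = Recipe("hadoop-worker", ["install-java", "install-hadoop", "start-datanode"], 2, 4)
orchestrator = Orchestrator()
orchestrator.upload(recipe)
print([agent.vm for agent in orchestrator.agents])
print(orchestrator.scale_to(10))  # a request above max_instances is clamped
```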

Page 24: Putting hadoop on any cloud  big data spain

A Few Snippets..