Big$Data$Processing$using$ Hadoop$ - prace.it4i.czprace.it4i.cz/sites/prace.it4i.cz/files/files/hadoop-10-2015... · Original Hadoop distributed grep Hadoop+BlobSeer sort Execution

Big Data Processing using Hadoop

Shadi Ibrahim Inria, Rennes -‐ Bretagne Atlantique Research Center

Apache Hadoop

2 Hadoop – INRIA S.IBRAHIM 2

Hadoop •  Hadoop is a top-‐level Apache project » Open source implementation of MapReduce » Developed in Java

•  It was advocated by industry’s premier Web players—Google, Yahoo!, Microsoft, and Facebook—as the engine to power the cloud.

Hadoop – INRIA S.IBRAHIM 3

Hadoop History


Who Uses Hadoop? » Facebook » IBM : Blue Cloud » Joost » Last.fm » New York Times » Microsoft » Cloudera » Yahoo!

“The mapreduce programming model and implementations”, H Jin, S Ibrahim, L Qi, H Cao, S Wu, X Shi. Cloud computing: principles and paradigms. Wiley,2011


Hadoop: Where? Batch data processing, not real-‐time

web search

Log processing

Document analysis and indexing

Web graphs and crawling

Highly parallel data intensive distributed applications


Public Hadoop Clouds Hadoop Map-‐reduce on Amazon EC2 http://wiki.apache.org/hadoop/AmazonEC2

IBM Blue Cloud » Partnering with Google to offer web-‐scale infrastructure

Global Cloud Computing Testbed » Joint effort by Yahoo, HP and Intel – http://www.opencloudconsortium.org/testbed.html

7 Hadoop – INRIA S.IBRAHIM

8 Big Data Processing in the Cloud– INRIA S.IBRAHIM


Source: Wikibon 2013

Batch Big data Processing

Adapted from Presentations from: http://wiki.apache.org/hadoop/HadoopPresentations


Components Distributed File System » Name: HDFS »  Inspired by GFS

Distributed Processing Framework » Modelled on the MapReduce abstraction

http://wiki.apache.org/hadoop/HadoopPresentations


HDFS Distributed storage system »  Files are divided into large blocks (128 MB) »  Blocks are distributed across the cluster »  Blocks are replicated to help against hardware failure »  Data placement is exposed so that computation can be migrated to data

Notable differences from mainstream DFS work »  Single ‘storage + compute’ cluster vs. separate clusters »  Simple I/O centric API



HDFS Architecture: NameNode (1) Master-‐Slave Architecture

HDFS Master “NameNode” »  Manages all file system metadata in memory

•  List of files •  For each file name, a set of blocks •  For each block, a set of DataNodes •  File attributes (creation time, replication factor)

»  Controls read/write access to files »  Manages block replication »  Transaction log: register file creation, deletion, etc.



HDFS Architecture: DataNodes (2) HDFS Slaves “DataNodes”

A DataNode is a block server »  Stores data in the local file system (e.g. ext3) »  Stores meta-‐data of a block (e.g. CRC) »  Serves data and meta-‐data to Clients

Block Report »  Periodically sends a report of all existing blocks to the NameNode

Pipelining of Data »  Forwards data to other specified DataNodes

Perform replication tasks upon instruction by NameNode

Rack-‐aware http://wiki.apache.org/hadoop/HadoopPresentations


HDFS Architecture (3)



Fault Tolerance in HDFS DataNodes send heartbeats to the NameNode » Once every 3 seconds

NameNode uses heartbeats to detect DataNode failures » Chooses new DataNodes for new replicas » Balances disk usage » Balances communication traffic to DataNodes



NameNode Failures A single point of failure

Transaction Log stored in multiple directories » A directory on the local file system » A directory on a remote file system (NFS/CIFS)

Need to develop a real HA solution



Data Pipelining Client retrieves a list of DataNodes on which to place replicas of a block » Client writes block to the first DataNode » The first DataNode forwards the data to the next » The second DataNode forwards the data to the next

DataNode in the Pipeline » When all replicas are written, the client moves on to write the next block in file



User Interface Command for HDFS User: » hadoop dfs -‐mkdir /foodir » hadoop dfs -‐cat /foodir/myfile.txt » hadoop dfs -‐rm /foodir myfile.txt

Command for HDFS Administrator » hadoop dfsadmin -‐report » hadoop dfsadmin -‐decommission datanodename

Web Interface » http://host:port/dfshealth.jsp

http://wiki.apache.org/hadoop/HadoopPresentations Hadoop – INRIA S.IBRAHIM 19

Useful links HDFS Design: » http://hadoop.apache.org/core/docs/current/hdfs_design.html

Hadoop API: » http://hadoop.apache.org/core/docs/current/api/



Hadoop MapReduce Master-‐Slave architecture

• Map-‐Reduce Master “JobTracker”

» Accepts MR jobs submitted by users

» Assigns Map and Reduce tasks to TaskTrackers

» Monitors task and TaskTracker status, re-‐executes

tasks upon failure



Hadoop MapReduce Master-‐Slave architecture

• Map-‐Reduce Slaves “TaskTrackers” » Run Map and Reduce tasks upon instruction from the JobTracker

• Manage storage and transmission of intermediate output » Generic Reusable Framework supporting pluggable user code (file system, I/O format)



Deployment: HDFS + MR



Hadoop MapReduce Client Define Mapper and Reducer classes and a “launching” program

Language support » Java » C++ » Python



Zoom on Map Phase

25

“Handling partitioning skew in MapReduce using LEEN” S Ibrahim, H Jin, L Lu, B He, G Antoniu, S Wu -‐ Peer-‐to-‐Peer Networking and Applications, 2013


Zoom on Reduce Phase


Data Locality Data Locality is exposed in the Map Task scheduling

Data are Replicated: »  Fault tolerance »  Performance : divide the work among nodes

Job Tracker schedules map tasks considering: »  Node-‐aware »  Rack-‐aware »  non-‐local map Tasks


Fault-‐tolerance TaskTrackers send heartbeats to the Job Tracker »  Once every 3 seconds

TaskTracker uses heartbeats to detect »  Node is labled as failed If no heartbeat is recieved for a defined expiry time

(Defualt : 10 Minutes)

Re-‐execute all the ongoing and complete tasks

Need to develop a more efficient policy to prevent re-‐executing completed tasks (storing this data in HDFS)


Speculation in Hadoop •  Nodes slow (stragglers) à run backup tasks

Node 1

Node 2

Adopted from a presentation by Matei Zaharia “Improving MapReduce Performance in Heterogeneous Environments”, OSDI 2008, San Diego, CA, December 2008.


Life cycle of Map/Reduce Task


Hadoop ecosystem


Hadoop ecosystem •  Core: Hadoop MapReduce and HDFS

•  Data Access: Hbase, Pig, Hive

•  Algorithms: Mahout

•  Data Import: Flume, Sqoop and Nutch


Hadoop ecosystem

Hadoop – INRIA S.IBRAHIM

•  Database primitives

•  Hbase » Wide column data structure built on HDFS

•  Processing •  Pig » A high level data-‐flow language and execution framework for parallel computatuon

33

Hadoop ecosystem


•  Algorithm

•  Mahout » Scalable machine learning data mining » Runs on the tip of Hadoop

•  Processing •  Hive » Data Warehouse Infrastructure » Support for custom mappers and reducers for more sophisticated analysis

34

Hadoop ecosystem


•  Data Import

•  Sqoop » Structured Data » Import and Export from RDBMS to HDFS

•  Data Import

•  Flume » Import streams » Text files and systems logs

35

Hadoop ecosystem


•  Coordination •  ZooKeeper » A high-‐performance coordination service for distributed applications.

•  Scheduling •  Oozie » A workflow scheduler system to manage Apache Hadoop jobs

36

Job Scheduling in Hadoop


Why Jobs Scheduling in Hadoop? MapReduce / Hadoop originally designed for high throughput

batch processing

•  Today’s workload is far more diverse: » Many users want to share a cluster

•  Engineering, marketing, business intelligence, etc »  Vast majority of jobs are short

•  Ad-‐hoc queries, sampling, periodic reports »  Response time is critical

•  Interactive queries, deadline-‐driven reports

•  How can we efficiently share MapReduce clusters between users?


Job Scheduling in Hadoop • Assigning jobs to Hadoop Cluster

• Considerations » Job priority (e.g., deadline) » Capacity

•  Cluster resources available •  resources needed for job


Job Scheduling in Hadoop •  FIFO » The first job to arrive at a JobTracker is processed first

•  Capacity Scheduling » Organizes jobs into queues » Queue shares as %’s of cluster

•  Fair Scheduler » Group jobs into “pools” » Divide excess capacity evenly between pools » Delay scheduler to expose data locality


Hadoop Optimizations


Open Issues – Heavy access All the metadata are handled through one single Master in HDFS (Name node)

•  Perform bad when » Handling Many files » Heavy concurrency


The BlobSeer Approach to Concurrency-‐Optimized Data Management

BlobSeer: software platform for scalable, distributed BLOB management »  Decentralized data storage »  Decentralized metadata management »  Versioning-‐based concurrency control »  Lock-‐free concurrent writes (enabled by versioning)

A back-‐end for higher-‐level data management systems »  Short term: highly scalable distributed file systems »  Middle term: storage for cloud services »  Long term: extremely large distributed databases

http://blobseer.gforge.inria.fr/


Highlight: BlobSeer Does Better Than Hadoop! Original Hadoop

Hadoop+BlobSeer distributed grep sort

Execution time reduced by up to 38%

MapReduce: a natural application class for BlobSeer

• Journal of Parallel and Distributed Computing (2011), IPDPS 2010, MAPREDUCE 2010 (HPDC)


Open Issues – Data locality •  Data locality is crucial for Hadoop’s performance

•  How can we expose data-‐locality of Hadoop in the Cloud efficiently?

•  Hadoop in the Cloud » Unaware of network topology » Node-‐aware or non-‐local map tasks


Open Issues – Data locality Data locality in the Cloud

The simplicity of Map tasks Scheduling leads to

Non-‐local maps execution (25%)

Node1

Node2

Node3

Node5

Node4

Node6

3 1 10 5 1 3 9 7 6 4 5 6 7 10 8 2 2 4

9 12 12 8 11 11

Empty node Empty node Empty node


Side impacts: •  Increase the execution time

•  Increase the number of useless speculation

• Slot occupying » Imbalance in the Map execution among nodes


Maestro: Replica-‐Aware scheduling in Hadoop Schedule the map tasks in two waves:

First wave: fills the empty slots of each data node based on the number of hosted map tasks and on the replication scheme for their input data

Second wave: runtime scheduling takes into account the probability of scheduling a map task on a given machine depending on the replicas of the task’s input data.

Results: Maestro can achieve optimal data locality even if data are not uniformly distributed among nodes and improve the performance of Hadoop applications


Results •  Sort application

Local maps

Virtual Cluster

Native Hadoop

83%

Maestro 96%

Grid5000 Native Hadoop

84%

Maestro 97%

S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu, S. Wu, Maestro: Replica-‐aware map scheduling for mapreduce, in: CCGrid 2012 Hadoop – INRIA S.IBRAHIM 49

Open Issues – Data Skew Data Skew

•  The current Hadoop’s hash partitioning works well when the keys are equally appeared and uniformly stored in the data nodes

•  In the presence of Partitioning Skew: »  Variation in Intermediate Keys’ frequencies »  Variation in Intermediate Key’s distribution amongst different data node

•  Native blindly hash-‐partitioning is to be inadequate and will lead to: »  Network congestion »  Unfairness in reducers’ inputs ? Reduce computation Skew »  Performance degradation


Open Issues – Data Skew Data Node1

K1

hash (Hash code (Intermediate-‐key) Modulo ReduceID)

K1 K1 K2 K2 K2

K2 K2 K3 K3 K3 K3

K4 K4 K4 K4 K5 K6

Data Node2 K1 K1 K1 K1 K1 K1

K1 K1 K1 K2 K4 K4

K4 K5 K5 K6 K6 K6

Data Node3 K1 K1 K1 K2 K2

K2

K1

K4 K4 K4 K4 K4

K4 K5 K5 K5 K5 K5

K1 K2 K3 K4 K5 K6 K1 K2 K3 K4 K5 K6

K1 K1 K1 K2 K2 K2

K2 K2 K3 K3 K3 K3

K4 K4 K4 K4 K5 K6

K1 K1 K1 K1 K1 K1

K1 K1 K1 K2 K4 K4

K4 K5 K5 K6 K6 K6

K1 K1 K1 K2 K2

K2

K1

K4 K4 K4 K4 K4

K4 K5 K5 K5 K5 K5

Data Node1� Data Node2 � Data Node3�

Total Data Transfer � 11� 15 � 18� Total 44/54�

Reduce Input � 29� 17 � 8� cv 58%�


Example: Wordcount Example

01002003004005006007008009001000

Map Outpu

t

Red

uce Input

Map Outpu

t

Red

uce Input

Map Outpu

t

Reduce Input

Map Outpu

t

Red

uce Input

Map Outpu

t

Red

uce Input

Map Outpu

t

Reduce Input

DataNode01 DataNode02 DataNode03 DataNode04 DataNode05 DataNode06

Data Size (M

B)

Transferred Data Local Data Data During Failed Reduce

6-‐node, 2 GB data set!

Combine Function is disabled

Data Distribution

Max-‐Min Ratio

20%

cv 42%

Transferred Data is relatively Large

Data Distribution is Imbalanced

83% of the Maps output


LEEN Partitioning Algorithm •  Extend the Locality-‐aware concept to the Reduce Tasks

•  Consider fair distribution of reducers’ inputs

•  Results: »  Balanced distribution of reducers’ input » Minimize the data transfer during shuffle phase »  Improve the response time

Close To optimal tradeoff between Data Locality and reducers’ input Fairness

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He, L. Qi, Leen: Locality/fairness-‐aware key partitioning for mapreduce in the cloud, in: CLOUDCOM’10


For K1

If (to N2 ) à 15 25 14 à Fairness-Score = 4.9 (to N3) à 15 9 30 à Fairness-Score = 8.8

LEEN details (Example) K1 � K2 � k3 � k4 � k5 � k6 �

Node1 � 3 � 5 � 4 � 4 � 1 � 1 � 18 �

Node2 � 9 � 1 � 0 � 3 � 2 � 3 � 18 �

Node3 � 4 � 3 � 0 � 6 � 5 � 0 � 18 �

Total � 16 � 9 � 4 � 13 � 8 � 4 �

54


K2 K2 K3 K3 K3 K3

K4 K4 K4 K4 K5 K6


K1 K1 K1 K2 K4 K4

K4 K5 K5 K6 K6 K6


K2 K4 K4 K4 K4 K4

K4 K5 K5 K5 K5 K5

K1 � K2 � k3 � k4 � k5 � k6 �

Node1 � 3 � 5 � 4 � 4 � 1 � 1 � 18 �

Node2 � 9 � 1 � 0 � 3 � 2 � 3 � 18 �

Node3 � 4 � 3 � 0 � 6 � 5 � 0 � 18 �

Total � 16 � 9 � 4 � 13 � 8 � 4 �

FLK � 4.66 2.93 1.88 2.70 2.71 1.66

N1 N2 N3 Data Node1

K1 K1 K1 K2 K2 K2

K2 K2 K3 K3 K3 K3

K4 K4 K4 K4 K5 K6


K1 K1 K1 K2 K4 K4

K4 K5 K5 K6 K6 K6


K2 K4 K4 K4 K4 K4

K4 K5 K5 K5 K5 K5

Data Transfer = 24/54 cv = 14%


Results 3 � 4 � 5 � 6 �

Locality Range � 1-85%� 15-35%� 1-50%� 1-30%�

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He, L. Qi, Leen: Locality/fairness-‐aware key partitioning for mapreduce in the cloud, in: CLOUDCOM’10


Open Issues – Energy Efficency •  Data-‐centers consume large amount of energy » High monetary

•  Power bills become a substantial part of the total cost of ownership (TCO) of data-‐center (40% of the total cost)

» High carbon emission


Why Energy efficient Hadoop?

It is essential to explore and optimize the energy efficiency

when running Big Data application in Hadoop

THE engine for Big Data processing in data-‐centers

Power bills become a substantial part of the total cost of ownership (TCO) of data-‐centers


Energy-‐efficiency in Hadoop


DVFS in Hadoop?

59

There is a significant potential of energy saving by scaling down the CPU frequency when peak CPU is not

needed

Diversity of MapReduce applications

Multiple phases of MapReduce application

Disk I/O CPU Disk I/O Network

CPU load is high (98%)

during almost 75% of the job

running

CPU load is high(80%) during only

15% of the job running


DVFS governors Performance

Powersave OnDemand Conservative Userspace

Maximize Performance

Maximize power savings

Power efficiency with reasonable performance Support for user-‐defined frequencies

Static Dynamic Static

Highest available frequency

Lowest available frequency

Gradually degrades the CPU frequency when the load is low

Adjust to the highest available frequency when the load is high

Gradually upgrades the CPU frequency when the load is

high

High power consumption

Long response time

Low performance/power saving benefits

when the system switches between idle states and heavy load

often

Worse performance than Ondemand


Power Consumption

Pi application

40% Pi application

“Towards Efficient Power Management in MapReduce: Investigation of CPU-‐Frequencies Scaling on Power Efficiency in Hadoop”, S. Ibrahim et al, ARMS CC 2014.


Performance vs Energy Pi application

“Towards Efficient Power Management in MapReduce: Investigation of CPU-‐Frequencies Scaling on Power Efficiency in Hadoop”, S. Ibrahim et al, ARMS CC 2014.


Open Issue: Hadoop in Virtualized environment •  The applicability of MapReduce on Virtualized Cloud

……….

Virtual Machine

Security

………...

MapReduce

Virtual Machine

Various Implementations

Scalability

Load Balancing

Locality Optimization

Fault Tolerance

Simplicity

Check Pointing / Migration

Optimize Resource Utilization

Power Saving

Resource Abstraction

Performance Isolation

Simplicity

Green Data Center

Pay as you go with no lock-in

Reliable and Robustness

Abstraction layer for the hardware, software, and configuration of systems

Flexible Migration and Restart Capabilities

Dynamic Hardware and

Software Maintenance

Cloud Computing

Dynamic scaling


Physical vs Virtual Hadoop Cluster •  HDFS on Physical Machine is Better than VMS

•  Separate the permanent data-‐storage (DFS) from the virtual storage associated with VM.

“Evaluating mapreduce on virtual machines: The hadoop case”, S Ibrahim, H Jin, L Lu, L Qi, S Wu, X Shi -‐ Cloud Computing, 2009


Multiple VM Deployment •  Deploying More VMs

can improve the performance

•  VMs within the same physical node are competing for the node I/O, causing poor performance.

“Evaluating mapreduce on virtual machines: The hadoop case”, S Ibrahim, H Jin, L Lu, L Qi, S Wu, X Shi -‐ Cloud Computing, 2009


CLOUDLET

“CLOUDLET: towards mapreduce implementation on virtual machines”, S Ibrahim, H Jin, B Cheng, H Cao, S Wu, L Qi. HPDC 2009


Evaluation •  Three-‐Nodes Cluster

Achieved up to 9% on very small cluster (WC)

0

500

1000

1500

2000

2500

512MB 1GB 2GB

1VM

Elap

sed

Tim

e (S

ec) Native

Separate

Achieved up to 50% on very small cluster (Sort)


Meta Disk I/O Scheduler •  Different Disk Schedulers in Xen: »  CFQ, Anticipatory, Deadline, Noop

» Scheduling in two level (VMM, VM)

•  Selecting the appropriate pair schedulers can significantly affect the applications performance

•  MapReduce is used to run different applications

•  A typical MapReduce application consists of different interleaving stages, each requiring different I/O workloads and patterns

Is their an optimal pair schedulers for MapReduce application?


Motivation

Wordcount Wordcount w/o combiner Sort (A, C) à 4.5% (A,N) (D,N) à 6% (A,A) à 9% The disk pairs Schedulers are only sub-‐optimal for different MapReduce applications “Adaptive disk i/o scheduling for mapreduce in virtualized environment”, S Ibrahim, H Jin, L Lu, B He, S Wu. ICPP 2011


Motivation •  A typical Hadoop application consists of different interleaving stages, each

requiring different I/O workloads and patterns.

The pairs schedulers are also sub-‐optimal for different sub-‐phases of the whole job.


Solution-‐ Meta Scheduler •  A novel approach for adaptively tuning the disk scheduler

pairs in both the hypervisor and the virtual machines during the execution of a single MapReduce job. »  First, manually dividing the Hadoop program into phases based on

the characteristics of Hadoop applications. »  If there are phases in an application and a possible pair scheduler, there are unique solutions, where a solution is an assignment of disk pair schedulers to each phase.

»  there is a penalty of the switching overhead

•  Novel heuristic that searches the space of solutions. »  It executes a solution and evaluates the performance score including the switch cost. Based on the evaluation, it selects the next solution to evaluate.

“Adaptive disk i/o scheduling for mapreduce in virtualized environment”, S Ibrahim, H Jin, L Lu, B He, S Wu. ICPP 2011


Evaluation

“Adaptive disk i/o scheduling for mapreduce in virtualized environment”, S Ibrahim, H Jin, L Lu, B He, S Wu. ICPP 2011


Acknowledgments Dhruba Borthakur (Yahoo!)

Vinod Kumar Vavilapalli (Hortonworks)

Apache Hadoop Community


Thank you!

74

Shadi Ibrahim

[email protected]

YARN : Next Generation Hadoop Adapted from a Presentation by Vinod Kumar Vavilapalli(Hortonworks), “Hadoop Yarn”, SF Hadoop Users Meetup


Hadoop MapReduce Classic JobTracker

– Manages cluster resources and job scheduling

TaskTracker

– Per-‐node agent

– Manage tasks

“Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup


Current Limitations Scalability – Maximum Cluster size – 4,000 nodes – Maximum concurrent tasks – 40,000 – Coarse synchronization in JobTracker

Single point of failure – Failure kills all queued and running jobs – Jobs need to be re-‐submitted by users

Restart is very tricky due to complex state



Current Limitations -‐2 Hard partition of resources into map and reduce slots – Low resource utilization

Lacks support for alternate paradigms – Iterative applications implemented using MapReduce are 10x slower

– Hacks for the likes of MPI/Graph Processing

Lack of wire-‐compatible protocols – Client and cluster must be of same version – Applications and workflows cannot migrate to different clusters “Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup


Requirements Reliability Availability Utilization WireCompatibility Agility & Evolution –Abilityforcustomerstocontrol upgrades to the grid software

stack. Scalability-‐Clusters of 6,000-‐10,000 machines – Each machine with 16 cores, 48G/96G RAM, 24TB/36TB disks – 100,000+ concurrent tasks – 10,000 concurrent jobs



Design Centre Split up the two major functions of JobTracker

– Cluster resource management

– Application life-‐cycle management

MapReduce becomes user-‐land library



Architecture Application – Application is a job submitted to the framework

– Example – Map Reduce Job Container – Basic unit of allocation – Example – container A = 2GB, 1CPU – Replaces the fixed map/reduce slots



Architecture Resource Manager – Global resource scheduler – Hierarchical queues Node Manager – Per-‐machine agent – Manages the life-‐cycle of container – Container resource

monitoring Application Master – Per-‐application – Manages application scheduling and task execution – E.g. MapReduce Application Master



Architecture



Improvements vs classic MapReduce Utilization

– Generic resource model •  Data Locality (node, rack etc.) •  Memory •  CPU •  Disk b/q •  Network b/w

– Remove fixed partition of map and reduce slot

Scalability

– Application life-‐cycle management is very expensive

– Partition resource management and application life-‐cycle management – Application management is distributed

– Hardware trends -‐ Currently run clusters of 4,000 machines •  6,000 2012 machines > 12,000 2009 machines •  <16+ cores, 48/96G, 24TB> v/s <8 cores, 16G, 4TB>



Improvements vsclassic MapReduce Fault Tolerance and Availability – Resource Manager

•  No single point of failure – state saved in ZooKeeper (coming soon) •  Application Masters are restarted automatically on RM restart

– Application Master •  Optional failover via application-‐specific checkpoint •  MapReduce applications pick up where they left off via state saved in HDFS

Wire Compatibility – Protocols are wire-‐compatible – Old clients can talk to new servers – Rolling

upgrades “Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup Hadoop – INRIA S.IBRAHIM 85

Improvements vs classic MapReduce Innovation and Agility – MapReduce now becomes a user-‐land library – Multiple versions of MapReduce can run in the same cluster (a la Apache Pig)

•  Faster deployment cycles for improvements

– Customers upgrade MapReduce versions on their schedule

– Users can customize MapReduce “Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup


Improvements vs classic MapReduce Support for programming paradigms other than

MapReduce – MPI – Master-‐Worker – Machine Learning – Iterative processing – Enabled by allowing the use of paradigm-‐specific

application master – Run all on the same Hadoop cluster

“Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup Hadoop – INRIA S.IBRAHIM 87

Any Performance Gains? Significant gains across the board, 2x in some cases.

• MapReduce – Unlock lots of improvements from Terasort record (Owen/Arun, 2009)

– Shuffle 30%+ – Small Jobs – Uber AM – Re-‐use task slots (containers)

“Hadoop Yarn”, Vinod Kumar Vavilapalli ,SF Hadoop Users Meetup Hadoop – INRIA S.IBRAHIM 88

Thank you!

89

Shadi Ibrahim

[email protected]

Big Data Processing in the Cloud– INRIA S.IBRAHIM

Documents

Big$Data$Processing$using$ Hadoop$ - prace.it4i.czprace.it4i.cz/sites/prace.it4i.cz/files/files/hadoop-10-2015... · Original Hadoop distributed grep Hadoop+BlobSeer sort Execution