Taming the Elephant: Efficient and Effective Apache Hadoop Management

1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Taming the Elephant: Efficient and Effective Apache Hadoop Management
Paul Codding
2016 Hadoop Summit Dublin, Ireland

Presenters

Paul Codding
Senior Product Manager, Cloud & Operations
Apache Ambari, SmartSense

Agenda

Introduction

Observations & Recommendations

– Observations from analyzing ~1000 customer bundles

– Common operational mistakes

Agenda

Introduction (Like 2 minutes)

SmartSense Architecture

[Diagram labels: Landing Zone, Server, Gateway, Ambari, Agent (×6), Bundle, Worker Node (×6), SmartSense Analytics]

Agenda

Introduction

Obligatory Poll

Agenda

Introduction

Obligatory Poll

Observations & Recommendations

EVERY node counts…

Common difficult-to-diagnose issues

Operating System Configuration: Locale

/etc/localtime – Dictates which timezone your machine & the JDK think they are in

Hive functions affected:
– unix_timestamp(…)
– current_date()

SELECT sum(amount) FROM sales WHERE sale_date > unix_timestamp('2016-03-01 00:00:00')

unix_timestamp(…) converts the string using the “default timezone and the default locale”, so the same query can give different results on nodes configured differently.

Inconsistent Locale Configuration
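
To illustrate why the query above is sensitive to node configuration, here is a minimal Python sketch (not Hive itself) that converts the same timestamp string to epoch seconds under two fixed UTC offsets. The helper name `to_epoch` and the two offsets are assumptions for illustration; they stand in for two nodes whose /etc/localtime differs.

```python
from datetime import datetime, timedelta, timezone


def to_epoch(text, utc_offset_hours):
    """Interpret a 'YYYY-MM-DD HH:MM:SS' string in the given fixed UTC offset."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return int(aware.timestamp())


# Hypothetical nodes: one configured for UTC, one for UTC-5.
utc_node = to_epoch("2016-03-01 00:00:00", 0)
est_node = to_epoch("2016-03-01 00:00:00", -5)

# The same literal maps to two different instants, five hours (18000 s) apart.
print(utc_node, est_node, est_node - utc_node)
```

Any sale recorded inside that five-hour window would be counted on one node and skipped on the other.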

Operating System Configuration: Transparent Huge Pages (THP)

THP is an abstraction layer that automates creating, managing, and using huge pages
Pages == memory managed in blocks by the Linux Kernel
Huge pages are pages that come in larger sizes: 2MB-1GB
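
As a rough way to see what a node is currently doing with THP, the sketch below reads the kernel's sysfs switch and returns the active mode. It assumes the mainline path /sys/kernel/mm/transparent_hugepage/enabled, whose contents list the modes with the active one in brackets; some older RHEL kernels use a different path, and the function name `active_thp_mode` is hypothetical.

```python
import re
from pathlib import Path

# Assumption: mainline sysfs location; older RHEL kernels may differ.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs contents, or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


if THP_PATH.exists():
    print(active_thp_mode(THP_PATH.read_text()))
else:
    print("THP sysfs file not found on this system")
```

Running this on every node and comparing the answers is a quick way to spot hosts that were configured differently from the rest.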

Operating System Configuration: NSCD/SSSD

Name Service Cache Daemon
– getpwnam
– getpwuid
– getgrnam
– getgrgid
– gethostbyname

cp10005.xxxxxx.com:1
cp10006.xxxxxx.com:5
cp10007.xxxxxx.com:1
cp10008.xxxxxx.com:0
cp10009.xxxxxx.com:1
cp10010.xxxxxx.com:3
cp10011.xxxxxx.com:0
cp10012.xxxxxx.com:1
cp10013.xxxxxx.com:0
cp10014.xxxxxx.com:2
cp10015.xxxxxx.com:0

cp10005.xxxxxx.com:0
cp10006.xxxxxx.com:0
cp10007.xxxxxx.com:0
cp10008.xxxxxx.com:0
cp10009.xxxxxx.com:0
cp10010.xxxxxx.com:0
cp10011.xxxxxx.com:0
cp10012.xxxxxx.com:0
cp10013.xxxxxx.com:0
cp10014.xxxxxx.com:0
cp10015.xxxxxx.com:0

Operating System Configuration: NTPD

Network Time Protocol daemon

2016-03-31 18:40:28,585 FATAL [regionserver/ip-10-0-x-x.ec2.internal/10.0.x.x:16020] regionserver.HRegionServer: Master rejected startup because clock is out of sync
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server ip-10-0-x-x.ec2.internal,16020,1459449626477 has been rejected; Reported time is too far out of sync with master. Time difference of 74097ms > max allowed of 30000ms

$ kinit -kt /etc/security/keytabs/hdfs.headless.keytab [email protected]
kinit: Clock skew too great while getting initial credentials
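
Both failures above come from a clock offset exceeding a tolerance. The sketch below compares a measured offset against two limits: the 30000 ms limit quoted in the HBase log, and a 300 second Kerberos tolerance. The Kerberos figure is an assumption (a commonly used default; check the clockskew setting in your own krb5.conf), and `skew_violations` is a hypothetical helper.

```python
HBASE_MAX_SKEW_MS = 30_000      # "max allowed of 30000ms" from the log above
KERBEROS_MAX_SKEW_MS = 300_000  # assumption: 5-minute tolerance


def skew_violations(offset_ms):
    """Return the names of the services whose tolerance the offset exceeds."""
    limits = {"hbase": HBASE_MAX_SKEW_MS, "kerberos": KERBEROS_MAX_SKEW_MS}
    return [name for name, limit in limits.items() if abs(offset_ms) > limit]


# The 74097 ms difference from the log trips the HBase limit only.
print(skew_violations(74_097))
```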

Operating System: Legacy Kernel Issues

Specific NICs & Kernel Versions
– Broadcom bnx2x module prior to RHEL 5.7 (kernel earlier than 2.6.18-274.el5)
– QLogic NetXen netxen_nic module prior to RHEL 5.9 (kernel earlier than 2.6.18-348.el5)
– Intel 10Gbps ixgbe module prior to RHEL 6.4 (kernel earlier than 2.6.32-358.el6)
– Intel 10Gbps ixgbe module from RHEL 5.6 (kernel version 2.6.18-238.el5 and later)

Symptoms
– NFS transfers over 10Gbps links are only transferring at 100MiB/sec (i.e. 1Gbps)
– TCP connections never reach anywhere near wirespeed
– TCP window size reduced to 720 bytes

Workaround
– nic.large-receive-offload
– nic.generic-receive-offload

RHEL Knowledgebase Solution: 20278
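
To check where a node stands on the two offloads named in the workaround, the sketch below parses text in the format printed by `ethtool -k <interface>` and reports whether each one is on. The sample text is made up for illustration and `offload_state` is a hypothetical helper, not part of SmartSense.

```python
WATCHED = ("large-receive-offload", "generic-receive-offload")


def offload_state(ethtool_output):
    """Map each watched offload to True (on) or False (off)."""
    state = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name in WATCHED:
            # Values look like "on", "off", or "off [fixed]".
            tokens = value.split()
            state[name] = bool(tokens) and tokens[0] == "on"
    return state


# Illustrative sample only, not captured from a real host.
sample = """Features for eth0:
rx-checksumming: on
generic-receive-offload: on
large-receive-offload: off [fixed]
"""
print(offload_state(sample))
```

If either offload is on for an affected NIC and kernel, `ethtool -K <interface> gro off lro off` is the usual way to switch them off; treat that as an assumption and confirm it against the Red Hat solution first.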

The Core Counts
HDFS & YARN

HDFS
NameNode Configuration

HDFS: NameNode Group Mapping Lookup Implementations

org.apache.hadoop.security.ShellBasedUnixGroupsMapping
org.apache.hadoop.security.LdapGroupsMapping
org.apache.hadoop.security.CompositeGroupsMapping
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback

hadoop.security.group.mapping

HDFS: NameNode Metadata Directories

Multiple Entries – Each directory gets a replica of the fsimage data
Very common “second directory” is an NFS mount
Soft mount vs. hard mount

dfs.namenode.name.dir

HDFS: NameNode Handler Count

Math.log(${currentDataNodeCount}) * 20

10 node cluster – 46
100 node cluster – 92
1000 node cluster – 138

dfs.namenode.handler.count
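
The rule of thumb above can be checked with a few lines of Python. This sketch assumes that `Math.log` is the natural logarithm (as it is in Java) and that the result is truncated to an integer, which reproduces the three figures on the slide; the function name is hypothetical.

```python
import math


def namenode_handler_count(datanode_count):
    """Natural log of the DataNode count times 20, truncated to an integer."""
    return int(math.log(datanode_count) * 20)


for nodes in (10, 100, 1000):
    print(nodes, namenode_handler_count(nodes))
```

Because the formula is logarithmic, every tenfold increase in cluster size adds about 46 handlers rather than multiplying the count.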

HDFS: HA Retry Policy

When the primary NameNode is killed, clients can keep retrying it for up to 10 minutes instead of failing over

dfs.client.retry.policy.enabled = true

HDFS
DataNode Configuration

HDFS: DataNode Failed Volumes

dfs.datanode.failed.volumes.tolerated

dmesg
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)

smartctl
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 20% 717
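
A simple way to watch for the kind of evidence shown above is to scan collected dmesg and smartctl output for the failure strings. The sketch below does only that. The two patterns are taken from the sample output on this slide, real disk failures can look different, and both `failing_lines` and the last sample line are made up for illustration.

```python
import re

# Patterns copied from the sample output above; not an exhaustive list.
FAILURE_PATTERNS = (
    re.compile(r"failed to IDENTIFY"),
    re.compile(r"Completed: read failure"),
)


def failing_lines(log_text):
    """Return the lines that match any known failure pattern."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]


# First two lines come from the slide; the third is a hypothetical healthy entry.
sample = """ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
# 1 Short offline Completed: read failure 20% 717
# 2 Short offline Completed without error 00% 700
"""
for line in failing_lines(sample):
    print(line)
```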

HDFS: DataNode

Default: 4096
How far to increase it depends on the other services deployed in the cluster and the workload type

dfs.datanode.max.transfer.threads

YARN
ResourceManager Configuration

YARN: ResourceManager Min/Max Container Size Allocation

yarn.scheduler.minimum-allocation-mb & yarn.scheduler.maximum-allocation-mb

YARN: NodeManager Memory

yarn.nodemanager.resource.memory-mb

[Diagram: node RAM divided among the Operating System, DataNode, Region Server, and NodeManager]
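
The picture of RAM shared among the operating system, DataNode, Region Server, and NodeManager suggests a simple budget: whatever is left after the other consumers is what the NodeManager can offer to containers. The sketch below works through that arithmetic. All of the figures and the helper name `nodemanager_memory_mb` are hypothetical, not recommendations from the talk.

```python
def nodemanager_memory_mb(total_ram_mb, reserved_mb):
    """Return the RAM left for YARN containers after the other reservations."""
    available = total_ram_mb - sum(reserved_mb.values())
    if available <= 0:
        raise ValueError("reservations exceed physical RAM")
    return available


# Hypothetical 128 GB worker node.
reserved = {"operating_system": 8192, "datanode": 4096, "region_server": 16384}
print(nodemanager_memory_mb(131072, reserved))  # candidate value for yarn.nodemanager.resource.memory-mb
```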

YARN: NodeManager Local Directories

yarn.nodemanager.local-dirs

YARN ATS: Rolling LevelDB Timeline store

org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore
org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore

yarn.timeline-service.store-class

YARN ATS: TTL

yarn.timeline-service.ttl-enable & yarn.timeline-service.ttl-ms

Agenda

Introduction

Obligatory Poll

Observations & Recommendations

Summary

SmartSense Recommendations

We’ve covered 16 of ~250 rules
Built into Support Case close/Sev1 postmortem process
Onramp into core products and Apache Ambari
– Stack Advisor
– New Defaults
– New Alerts

hbase_tcp_nodelay
hdfs_check_point_period
hdfs_dn_suboptimal_mounts
hdfs_dn_volume_tolerance
hdfs_enable_security_check
hdfs_mount_options
hdfs_nn_checkpoint_txns
hdfs_nn_handler_count
hdfs_nn_protect_imp_dirs
hdfs_nn_soft_mount
hdfs_nn_super_user_group
hdfs_short_circuit
hive_enable_cbo
hive_vectorized_exec
jvm_opts
mr_min_split_size
mr_reduce_parallel_copies
mr_slow_start
os_cpu_scaling
os_ssd_tuning
tez_enable_reuse
tez_session_release_delay
tez_shuffle_buffer
yarn_ats_security
yarn_nm_black_listed_mount_logdir

Bundle Security

All bundles are:
• Encrypted and anonymized by default

Configurable options to:
• Exclude properties within specific Hadoop configuration files
• Apply global REGEX replacements across all configuration, metrics, and logs

By default:
• Ambari clear text passwords are not collected
• Hive and Oozie database properties are not collected
• All IP addresses and host names are anonymized
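
To give a feel for what regex-based anonymization of IP addresses can look like, here is a minimal Python sketch. It is not SmartSense's implementation: the pattern, the `ip-N` placeholder format, and the function name `anonymize_ips` are all assumptions chosen for illustration.

```python
import re

# Simple IPv4 shape; does not validate that each octet is 0-255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize_ips(text):
    """Replace each distinct IPv4 address with a stable placeholder."""
    mapping = {}

    def replace(match):
        ip = match.group(0)
        # The same address always maps to the same placeholder.
        mapping.setdefault(ip, f"ip-{len(mapping) + 1}")
        return mapping[ip]

    return IPV4.sub(replace, text)


print(anonymize_ips("10.0.1.5 -> 10.0.1.9, retry from 10.0.1.5"))
```

Keeping the mapping stable within a bundle means relationships between hosts remain visible to an analyst even though the real addresses do not.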

SmartSense Stack Support

HDP 2.4 HDP 2.3 HDP 2.2 HDP 2.1 HDP 2.0

SmartSense 1.x

Ambari 2.2: Built-In!

Ambari 2.1: Plug-In

Ambari 2.0: Plug-In

Ambari 1.7 Ambari 1.6

SmartSense 1.x

Questions?