Grid Operations
©2013 LinkedIn Corporation. All Rights Reserved.
Hadoop Operations at LinkedIn
Allen Wittenauer
Grid Computing Architect
Wednesday, March 20, 13
“Hadoop is not a developer problem; it’s an operations problem.”
-- Hadoop vendor ex-employee
§ August 2009
– 20 Nodes in 1 grid
– Apache Hadoop 0.20.0
– No configuration management
– No monitoring
– No security
– Free for all, including random mafia hits on running jobs
– FIFO Scheduling
– ~20 users
– 20 tasks per node
– Solaris
– No operational support
How We Fixed This (In Chronological Order)
Year One
§ Dropped task count
– 10 mappers => 7 mappers
– 10 reducers => 5 reducers
§ Reworked ETL
– hourlies => dailies
– Re-ordered to take advantage of compression
  § 10x storage improvement
– Sample impact on one job (not workflow!):
  § 80,000 map tasks => 2,000 map tasks
  § Run time cut in half
§ Optimize work flows/culture shift
  § More task time, fewer tasks
  § Production review to reinforce good behavio(u)r
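The task-count numbers above can be checked with a little arithmetic. This is a minimal sketch that uses only figures from the slides; the variable names are illustrative. The per-node totals it computes line up with the "20 tasks per node" (August 2009) and "12 tasks per node" (March 2013) figures quoted elsewhere in the deck.

```python
# Per-node task slots before and after the change (figures from the slide).
old_slots = {"mappers": 10, "reducers": 10}
new_slots = {"mappers": 7, "reducers": 5}

old_total = sum(old_slots.values())
new_total = sum(new_slots.values())

# Map-task reduction for the single sample job on the slide.
map_tasks_before = 80_000
map_tasks_after = 2_000
map_task_reduction = map_tasks_before / map_tasks_after

print(f"slots per node: {old_total} -> {new_total}")
print(f"map tasks reduced by a factor of {map_task_reduction:.0f}")
```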
§ Switched to Capacity Scheduler
– FIFO is terrible
– Fair Share only viable for small tasks
– Enforced SLAs via custom patch
§ Submitted Jar Size Limit
– Encourage distributed cache usage
– Enforced limit via custom patch
15% Fast Queue:
- Task Time < 15 Minutes
- Job Time < 1 Hour
- Slot stealing from "Slow" Queue
80% Slow Queue:
- Job Time < 24 Hours
- Up to 80% of slots
5% ETL Tasks
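The queue limits above can be read as a simple admission rule. The sketch below is an illustration only and is not the custom patch mentioned on the slide; the `QueueLimits` type, the `violations` function, the queue keys, and the choice to express limits in minutes are all assumptions made for the example. The slide lists no time limits for the ETL queue, so none are modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QueueLimits:
    """Limits for one queue as listed on the slide; None means no limit was listed."""
    capacity_pct: int
    max_task_minutes: Optional[int]
    max_job_minutes: Optional[int]


# Values from the slide. The keys "fast", "slow", and "etl" are illustrative names.
QUEUES = {
    "fast": QueueLimits(capacity_pct=15, max_task_minutes=15, max_job_minutes=60),
    "slow": QueueLimits(capacity_pct=80, max_task_minutes=None, max_job_minutes=24 * 60),
    "etl": QueueLimits(capacity_pct=5, max_task_minutes=None, max_job_minutes=None),
}


def violations(queue: str, longest_task_minutes: float, job_minutes: float) -> List[str]:
    """Return which of the queue's limits a job has reached or exceeded."""
    limits = QUEUES[queue]
    found = []
    # The slide states strict "<" limits, so reaching the limit counts as a violation.
    if limits.max_task_minutes is not None and longest_task_minutes >= limits.max_task_minutes:
        found.append("task time")
    if limits.max_job_minutes is not None and job_minutes >= limits.max_job_minutes:
        found.append("job time")
    return found


print(violations("fast", longest_task_minutes=20, job_minutes=30))
```

A job with a 20-minute task breaks the fast queue's task limit but fits the slow queue, which only bounds total job time.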
§ Benchmarking
– Use production code not TeraSort!
§ Cut cost per unit in half
§ 2x nodes per rack
§ Extra RAM
– buffering
– bus speed
Old Node:
- 2 Rack Units
- 2 CPUs
- 16 GB
- 8 x 1 TB SATA
- 1 x 2 Gb NIC
New Node:
- 1 Rack Unit
- 2 CPUs
- 24 or 32 GB
- 6 x 2 TB SATA
- 1 x 1 Gb NIC
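The two node specs above can be compared on raw disk capacity and rack density. This is a rough sketch using only the slide's numbers; the `Node` type and its field names are made up for the example, it counts raw disk capacity rather than usable HDFS capacity, and it says nothing about cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    rack_units: int
    disks: int
    tb_per_disk: int
    ram_gb: int

    @property
    def raw_tb(self) -> int:
        """Raw disk capacity of one node, in TB."""
        return self.disks * self.tb_per_disk

    @property
    def raw_tb_per_rack_unit(self) -> float:
        """Raw disk capacity per rack unit occupied, in TB."""
        return self.raw_tb / self.rack_units


OLD = Node(rack_units=2, disks=8, tb_per_disk=1, ram_gb=16)
# The slide lists 24 or 32 GB for the new node; 24 is used here as the smaller option.
NEW = Node(rack_units=1, disks=6, tb_per_disk=2, ram_gb=24)

nodes_per_rack_ratio = OLD.rack_units / NEW.rack_units

print(f"raw TB per node: {OLD.raw_tb} -> {NEW.raw_tb}")
print(f"raw TB per rack unit: {OLD.raw_tb_per_rack_unit} -> {NEW.raw_tb_per_rack_unit}")
print(f"nodes per rack: {nodes_per_rack_ratio:.0f}x")
```

Halving the rack units per node is what gives the "2x nodes per rack" figure on the slide.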
Year Two
§ DataNode disk partitioning
– Separate file systems for different purposes
– Mount options: noatime, commit=30, data=writeback
§ NN, JT, etc
– No “special hardware” == use SW RAID
Disk layout (from diagram):
20 GB /, ... | 200 GB MR | HDFS
5 GB Swap | 200 GB MR | HDFS
...
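The mount options listed above could be applied through `/etc/fstab`. The sketch below only formats such entries. The slides do not name the file system, so `ext4` is an assumption (these options exist for ext3/ext4), and the device names, mount points, and the `fstab_line` helper are hypothetical.

```python
MOUNT_OPTIONS = ("noatime", "commit=30", "data=writeback")  # from the slide


def fstab_line(device: str, mount_point: str, fs_type: str = "ext4") -> str:
    """Format one /etc/fstab entry using the slide's mount options.

    fs_type defaults to ext4, which is an assumption; the slides do not say.
    """
    options = ",".join(MOUNT_OPTIONS)
    # Fields: device, mount point, type, options, dump flag, fsck pass number.
    return f"{device} {mount_point} {fs_type} {options} 0 0"


# Hypothetical devices and mount points, for illustration only.
for device, mount_point in [("/dev/sdb2", "/grid/0/mapred"), ("/dev/sdb3", "/grid/0/hdfs")]:
    print(fstab_line(device, mount_point))
```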
Diagram: directory and authentication services
– LDAP Master + KDC Master and LDAP Master + KDC, linked by multi-master replication
– LDAP/KDC Slaves
– Client Node (nscd): username, uid; group name, gid; netgroup, sudoers
§ Service Bundle
– RPMs, config files, etc
– Conflict resolution
Diagram: bcfg2 configuration flow
– Host reports to the bcfg2 Server: Group1, Group2, ...
– Group1 -> Svc1, Svc2, ...
– Group2 -> Svc1, Svc3, ...
– Group3 -> Svc4, Svc5, ...
– Svc1 + Svc2 + Svc3 content goes to the bcfg2 client
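The group-to-service mapping above amounts to taking the union of service bundles across a host's groups. The sketch below illustrates that idea only; it is not bcfg2's API or configuration format, the `services_for` function is hypothetical, and the elided "..." entries from the slide are left out.

```python
from typing import Dict, List

# Group membership from the slide; entries elided with "..." there are omitted.
GROUP_SERVICES: Dict[str, List[str]] = {
    "Group1": ["Svc1", "Svc2"],
    "Group2": ["Svc1", "Svc3"],
    "Group3": ["Svc4", "Svc5"],
}


def services_for(groups: List[str]) -> List[str]:
    """Return the services for a host: the union across its groups, first-seen order."""
    seen: List[str] = []
    for group in groups:
        for svc in GROUP_SERVICES[group]:
            if svc not in seen:  # Svc1 appears in both Group1 and Group2
                seen.append(svc)
    return seen


print(services_for(["Group1", "Group2"]))
```

A host in Group1 and Group2 ends up with Svc1, Svc2, and Svc3, matching the "Svc1 + Svc2 + Svc3" content in the diagram.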
§ Different RPM names + different install locations = pre-deploy-ability:

Object                      RPM Name                              File Path
Hadoop 1.0.4-p3 Binaries    hadoop-1043-bin-1.0.4-3               /dir/hadoop-1.0.4-p3
Grid Config for 1.0.4-p3    gridname-1043-hadoopconf-1.0.4.3-1    /dir/grid-conf-1.0.4-p3
Hadoop 1.1.2-p1 Binaries    hadoop-1121-bin-1.1.2.1-1             /dir/hadoop-1.1.2-p1
Grid Config for 1.1.2-p1    gridname-1043-hadoopconf-1.0.4.3-1    /dir/grid-conf-1.1.2-p1
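The point of the naming scheme above is that two versions never share a package name or an install path, so the next version can be installed alongside the running one. The sketch below is a simplified illustration of that check and does not reproduce how RPM itself detects conflicts; `can_coexist` and the unversioned counter-example are hypothetical.

```python
from typing import List, Tuple

# (RPM name, install path) for the two Hadoop binary packages in the table.
VERSIONED: List[Tuple[str, str]] = [
    ("hadoop-1043-bin-1.0.4-3", "/dir/hadoop-1.0.4-p3"),
    ("hadoop-1121-bin-1.1.2.1-1", "/dir/hadoop-1.1.2-p1"),
]

# Hypothetical counter-example: one package name and one path for every version.
UNVERSIONED: List[Tuple[str, str]] = [
    ("hadoop", "/dir/hadoop"),
    ("hadoop", "/dir/hadoop"),
]


def can_coexist(packages: List[Tuple[str, str]]) -> bool:
    """True if no two packages share a name or an install path."""
    names = [name for name, _ in packages]
    paths = [path for _, path in packages]
    return len(set(names)) == len(names) and len(set(paths)) == len(paths)


print(can_coexist(VERSIONED))    # distinct names and paths
print(can_coexist(UNVERSIONED))  # same name and path collide
```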
Year Three+
Diagram: Kerberos cross-realm setup
– Corp IT Active Directory: @CORP
– Grid Realm: @GRID
– krbtgt/GRID@CORP
– Hadoop Services
– krbtgt/user@CORP, krbtgt/GRID@CORP
– krbtgt/host@GRID, krbtgt/service@GRID
– Password
Many months moving to secure Apache Hadoop...
§ March 2013
– 5000 Nodes in ~10 grids
– Apache Hadoop 1.0.4 + custom patches
– Full configuration management
– Full monitoring
– Security
– Capacity scheduler with SLA
– ~700 users
– 12 tasks per node
– Linux
– Five dedicated operations staff members
Future Work
Is ‘pure Hadoop’ the right tool for all of our workloads?
Diagram labels: CEPH, HDFS, YARN, PBS
§ More on LinkedIn Hadoop Performance:
– http://www.slideshare.net/allenwittenauer/2012-lihadoopperf
§ LinkedIn Data Analytics:
– http://data.linkedin.com/