History of CDH and Overview of New Features in CDH5 #at_tokuben
DESCRIPTION
These are the slides on CDH5 that I presented at @Tokuben (@特勉, the study session linked to @IT feature articles). http://atnd.org/events/46924
- 1. CDH5 2014/01/23 Cloudera
- 2. ( ) April 2011, Cloudera
- 3. Cloudera Impala PDF Cloudera John Russell Hadoop HBase Hadoop
Hive Cloudera Cloudera World Tokyo
- 4. CDH CDH5 HDFS YARN MapReduce Cloudera Impala Cloudera
Search Spark
- 5. CDH
- 6. Apache Hadoop +
- 7. HDFS [figure: a file split into blocks 1 to 5, each block stored
as three replicas across HDFS nodes]
- 8. HDFS [figure: HDFS blocks 1 to 5 and their replicas across nodes]
- 9. MapReduce [figure: MapReduce (MR) over HDFS blocks 1 to 5 and
their replicas]
- 10. CDH Cloudera's Distribution including Apache Hadoop 100%
- 11. CDH MapReduce Cloudera Impala Cloudera Search etc. MAPREDUCE,
HIVE, PIG SQL CLOUDERA IMPALA CLOUDERA SEARCH MAHOUT, DATAFU
- 12. CDH [timeline figure, 2009 to 2013; 2013: YARN, HDFS NFS,
Impala, Search, Spark, etc.]
- 13. CDH2 (2010) Hadoop, Hive, Pig
- 14. CDH3 (April 2011) Hadoop HBase Flume RDBMS Sqoop
- 15. CDH4 (June 2012) HDFS (HA) MapReduce HA Mahout HBase, Flume, Hue
- 16. CDH5 YARN HDFS NFS: Impala, Search (Solr), Sentry, Accumulo,
Spark
- 17. CDH5
- 18. HDFS Hadoop SPOF! CDH4 mmap HDFS REST API NFSv3 CDH4
CDH5 CDH4 CDH5 CDH5
- 19. CDH5 HDFS /path/to/dir/.snapshot: Cloudera Manager GUI
- 20. HBase HDFS HDFS CDH4 CDH4 CDH4 CDH5 HBase CDH5
- 21. HBase CDH CDH3 CDH4 CDH5
- 22. YARN CDH5 Yet-Another-Resource-Negotiator JobTracker
MapReduce YARN Impala Spark YARN
- 23. MapReduce 1.0 [figure: a Job Client submits a job to the
JobTracker, which manages TaskTrackers, each with Map Slots and
Reduce Slots]
- 24. CDH5 YARN [figure: Clients submit applications to the
ResourceManager; NodeManagers host Containers, some of which run an
AppMaster]
- 25. CDH5 MapReduce Cloudera Impala Spark Cloudera Search
- 26. MapReduce MapReduce 2.0 (MRv2) CDH5 YARN
ResourceManager (RM) + ApplicationMaster (AM) JobTracker
NodeManager (NM) TaskTracker MRv1
- 27. Cloudera Impala SQL HiveQL Hive MapReduce HDFS HBase x1030
x23 CDH5 CDH CDH5 Llama Impala YARN
- 28. Cloudera Search CDH Apache Solr CDH5 CDH HDFS MapReduce
Flume (NRT) HBase
- 29. 1:
- 30. 2: Twitter
- 31. CDH5 Spark Scala Scala, Java, Python val file =
sc.textFile("hdfs://.../pagecounts-*.gz") val counts =
file.flatMap(line => line.split(" ")) .map(word => (word, 1))
.reduceByKey(_ + _) counts.saveAsTextFile("hdfs://.../word-count")
Scala
- 32.
- 33. Hadoop
- 34. Hadoop API API BI + JDBC/ODBC Web SQL Hadoop RDBMS DWH
- 35. CDH5 HDFS MapReduce CDH5 http://tiny.cloudera.com/cdh5doc
- 36. Cloudera Manager 5 CDH YARN GUI Standard
- 37. Cloudera Manager Hadoop 1001 YARN Cloudera Manager + CDH5
http://cloudera.com/content/support/en/downloads.html
- 38. CDH ML CDH cdh-user-jp@cloudera.org CDH Cloudera
http://www.cloudera.co.jp/newsletter Cloudera CDH/CM
- 39. We are Hiring! Cloudera Hadoop Hadoop
info-jp@cloudera.com
- 40.
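
The Spark word count on slide 31 needs a SparkContext (`sc`) and an HDFS path, both of which are elided in the slide. As a minimal sketch for trying the same flatMap / map / reduce-by-key shape without a cluster, the following uses plain Scala collections in place of Spark RDDs. The object name `WordCountSketch`, the `wordCount` helper, and the sample input are assumptions made for illustration and are not part of the original deck. Plain collections have no `reduceByKey`, so `groupBy` followed by a sum stands in for it here.

```scala
// Local, Spark-free sketch of the word count shown on slide 31.
// Assumption: plain Scala collections stand in for Spark RDDs.
object WordCountSketch {

  // Counts how often each space-separated word occurs in the given lines.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))  // same split as on the slide
      .filter(_.nonEmpty)                // drop empty tokens from blank lines
      .map(word => (word, 1))            // pair each word with a count of 1
      .groupBy(_._1)                     // stand-in for reduceByKey's grouping
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input; the slide reads from HDFS instead.
    val sample = Seq("hdfs yarn hdfs", "yarn spark")
    wordCount(sample).toSeq.sortBy(_._1).foreach {
      case (word, n) => println(s"$word\t$n")
    }
  }
}
```

On a real cluster the slide's version is the one to use, because `reduceByKey` combines counts on each partition before shuffling, which this local sketch does not model.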