Webdb2011 hadoop

Transcript

  • 1. Ameba Hadoop - WebDB Forum 2011 - Ameba Technology Laboratory

  • 2. Ameba Technology Laboratory 4 Ameba Pigg Patriot Ameba Technology Laboratory Twitter toutou ID id:ICHIRO
  • 3.
  • 4. 289.0 PV Ameba 1823 ID 935 ID 20119
  • 5.
  • 6.
  • 7. 918
  • 8. Ameba Technology Laboratory
  • 9. 20114 1110 8
  • 10. 20104 2008 20102 20114 Ameba Technology Laboratory
  • 11. profile () pigg mkt Ameba
  • 12. Hadoop
  • 13. HDFS 0.13.1 0.20.1 pico Amazon EMR Pig Hadoop
  • 14. Patriot
  • 15. HDFS Hive DSL Map/Reduce Hive WebUI CIC HUE
  • 16. NameNode SecondaryNN NN JobTracker 23 DataNode, TaskTracker 10 DataNode, TaskTracker BA Batch MySQL Hinemos Web, API, Ganglia, Nagios, Hudson, Puppet MySQL, HUE Hadoop Ameba MySQL Hive Hive Adhoc HUE CDH3u0
  • 17. Hive HUE
  • 18. Apache Facebook SQL HiveQL Hive Cloudera Hadoop Web UI Hive HUE
  • 19. Hive Hive Driver Compiler, Optimizer, Executor Thrift Server Web Interface JDBC ODBC Metastore CLI HiveQL
  • 20. HiveQL
        CREATE TABLE pigg_login (
          time STRING,
          ameba_id STRING,
          ip STRING)
        PARTITIONED BY (dt STRING)
        ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
        STORED AS SEQUENCEFILE;

        SELECT p.age, count(distinct l.ameba_id)
        FROM pigg_login l
        JOIN profile p ON (l.ameba_id = p.ameba_id)
        WHERE l.dt = '2011-11-04'
        GROUP BY p.age;
    SQL Map/Reduce
  • 21. HUE
  • 22. Patriot WebUI
  • 23. Flume HBase Stinger
  • 24. Flume Cloudera Agent Agent Agent Agent Collector Collector Master HDFS
  • 25. Flume HDFS Logical Node Source Sink Agent Tail Source Sink LogFile Processor Source Sink Matcher Collector Source HDFS Sink HDFS
  • 26. HBase Google BigTable HDFS Read/Write Facebook Messages Table RowKey ColumnFamily Column Timestamp SortedMap>> Table RowKey Row Column
  • 27. Zookeeper Client HMaster Region Server Region Server Region Server Read/Write Region
  • 28. Hadoop Map/Reduce incrementColumnValue checkAndPut
  • 29. Stinger node + socket.io HBase polling push flume agent log flume collector increment flume master websocket ID PV PV
  • 30. HBase Hornet
  • 31. MySQL etc HBase
  • 32. () Name Fukuda Age 28 Name Suzuki Age 28 Name Yasuda Age 26 date 2011/01/02 Follow Friend Friend
  • 33. HDFS Client Gateway HBase Gateway Gateway Hornet RPC
  • 34. RowKey
        Row Key              Column Key   Value
        1,FOLLOW,OUTGOING    g:2          date=2011/1/12
                             g:3          date=2011/4/20
        2,FOLLOW,INCOMING    g:1          date=2011/1/12
        2,FOLLOW,OUTGOING    g:3          date=2011/2/9
        3,FOLLOW,INCOMING    g:1          date=2011/4/20
                             g:2          date=2011/2/9
    date 2011/01/12 Follow Follow Follow
  • 35.
  • 36. Hadoop Hive HBase Zookeeper Flume Flume NG Hadoop 0.23 Map/Reduce BSP MPI NameNode JobTracker Hadoop CDH Hadoop
  • 37. akb-lab@cyberagent.co.jp fukuda_ichiro@cyberagent.co.jp 14
  • 38.
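
The row-key table on slide 34 can be read as a graph stored edge by edge: each FOLLOW edge appears once under the source node's OUTGOING row and once under the target node's INCOMING row. The following is a minimal in-memory Python sketch of that layout, not Hornet's actual implementation. The helper names (`row_key`, `add_edge`, `neighbors`) are hypothetical, a plain dictionary stands in for the HBase table, and reading `g:2` as "column family `g`, qualifier = neighbor node ID" is an assumption based on the slide.

```python
from collections import defaultdict


def row_key(node_id, label, direction):
    """Build a row key in the slide's "<node>,<label>,<direction>" form."""
    return f"{node_id},{label},{direction}"


def add_edge(table, src, dst, label, date):
    """Store one edge twice, so it can be found from either endpoint.

    Assumption: the column key is "g:<neighbor id>" and the value holds
    the edge property (here only the date), as on slide 34.
    """
    table[row_key(src, label, "OUTGOING")][f"g:{dst}"] = {"date": date}
    table[row_key(dst, label, "INCOMING")][f"g:{src}"] = {"date": date}


def neighbors(table, node_id, label, direction):
    """Return neighbor IDs by reading the columns of a single row."""
    row = table.get(row_key(node_id, label, direction), {})
    return sorted(int(column.split(":", 1)[1]) for column in row)


# A dict of dicts stands in for the HBase table (row key -> column -> value).
table = defaultdict(dict)

# The three FOLLOW edges implied by the slide's rows.
add_edge(table, 1, 2, "FOLLOW", "2011/1/12")
add_edge(table, 1, 3, "FOLLOW", "2011/4/20")
add_edge(table, 2, 3, "FOLLOW", "2011/2/9")

followed_by_1 = neighbors(table, 1, "FOLLOW", "OUTGOING")
followers_of_3 = neighbors(table, 3, "FOLLOW", "INCOMING")
```

With this layout, listing who a node follows or who follows it is a read of one row, at the cost of writing every edge twice.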