Apache Sqoop
陳威宇
Sqoop: the bridge between RDBs and Hadoop
• Apache Sqoop is a "tool" designed to transfer data between Hadoop and structured datastores.
• Reads data from:
  – RDBMS
  – Data warehouses
  – NoSQL
• Writes data to:
  – Hive
  – HBase
• Uses the MapReduce framework to transfer data in parallel
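The parallel transfer can be pictured as each map task reading one slice of the table's key range. The following is a minimal shell sketch of that even split, assuming an integer split column. The function name `split_ranges` is hypothetical, and the arithmetic only mirrors the worked example in the Sqoop user guide (minimum 0, maximum 1000, 4 tasks); it is not Sqoop's actual code.

```shell
#!/bin/sh
# Hypothetical helper: print the [lo, hi) id range each map task would read
# when the key range min..max is divided evenly across n tasks.
split_ranges() {
    min=$1; max=$2; n=$3
    step=$(( (max - min) / n ))
    i=0
    while [ "$i" -lt "$n" ]; do
        lo=$(( min + i * step ))
        if [ "$i" -eq $(( n - 1 )) ]; then
            hi=$(( max + 1 ))      # last slice is extended to include max
        else
            hi=$(( lo + step ))
        fi
        echo "task $i: id >= $lo AND id < $hi"
        i=$(( i + 1 ))
    done
}

split_ranges 0 1000 4
```

The exercises in this deck pass `-m 1`, so a single map task reads the whole table and no split is needed.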
Figure source: http://bigdataanalyticsnews.com/data-transfer-mysql-cassandra-using-sqoop/
How to use Sqoop
Figure source: http://hive.3du.me/slide.html
Connecting Sqoop to the elephant (setup)
• Unpack http://archive.cloudera.com/cdh5/cdh/5/sqoop-1.4.5-cdh5.3.2.tar.gz
• Edit ~/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/home/hadoop/hive
export SQOOP_HOME=/home/hadoop/sqoop
export HCAT_HOME=${HIVE_HOME}/hcatalog/
export PATH=$PATH:$SQOOP_HOME/bin
• Edit conf/sqoop-env.sh:
export HADOOP_COMMON_HOME=/home/hadoop/hadoop
export HBASE_HOME=/home/hadoop/hbase
export HIVE_HOME=/home/hadoop/hive
• Start sqoop:
$ sqoop
Try 'sqoop help' for usage.
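Before starting sqoop, it can help to confirm that the directories named in the setup actually exist. The following is a minimal sketch, not part of Sqoop: the helper name `check_dirs` is hypothetical, and the variable list is simply the one exported in this setup.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named variables are unset or
# do not point at an existing directory. Returns non-zero if any fail.
check_dirs() {
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name"
            missing=1
        elif [ ! -d "$value" ]; then
            echo "not a directory: $name=$value"
            missing=1
        fi
    done
    return "$missing"
}

# Variables taken from the setup in this deck.
check_dirs JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HIVE_HOME SQOOP_HOME \
    || echo "fix ~/.bashrc or conf/sqoop-env.sh before running sqoop"
```

Running it in a fresh shell after `source ~/.bashrc` shows whether the exports took effect.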
Exercise 1: import into Hive
cd ~
git clone https://github.com/waue0920/hadoop_example.git
cd hadoop_example/sqoop/ex1
mysql -u root -phadoop < ./exc1.sql
hadoop fs -rmr /user/hadoop/authors
sqoop import --connect jdbc:mysql://localhost/books \
    --username root --table authors --password hadoop --hive-import -m 1
Exercise: use a Hive query to check that the data was imported
hive> select * from authors;
Exercise 1: create a job
hadoop fs -rmr /user/hadoop/authors
sqoop job --create myjob1 -- import --connect jdbc:mysql://localhost/books \
    --username root --table authors -P --hive-import -m 1
sqoop job --list
sqoop job --show myjob1
sqoop job --exec myjob1
Exercise: use a Hive query to check that the data was imported
hive> select * from authors;
Exercise 2: export to MySQL
cd ~/hadoop_example/sqoop/ex2
mysql -u root -phadoop < ./create.sql
./update_hdfs_data.sh
sqoop export --connect jdbc:mysql://localhost/db \
    --username root --password hadoop --table employee \
    --export-dir /user/hadoop/sqoop_input/emp_data
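By default, sqoop export parses its input as comma-separated fields with one record per line, so a row with the wrong number of fields tends to fail the job. The following is a minimal pre-flight sketch that checks a local sample; the helper name `check_fields` and the sample rows are invented for illustration, and the naive comma split assumes no quoted or escaped commas in the data.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if every line of a
# comma-delimited file has exactly the expected number of fields.
check_fields() {
    file=$1; expected=$2
    awk -F',' -v n="$expected" '
        NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
        END { exit bad }
    ' "$file"
}

# Invented sample rows; the real emp_data layout is defined by the exercise.
sample=$(mktemp)
printf '1,alice,1000\n2,bob,2000\n' > "$sample"
check_fields "$sample" 3 && echo "sample looks consistent"
rm -f "$sample"
```

For the real data, the same check could be run on a copy fetched from HDFS with `hadoop fs -get`, using the column count of the target table.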
Reference
• Sqoop examples explained: http://www.tutorialspoint.com/sqoop/sqoop_quick_guide.htm
• Official Sqoop user guide: https://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html
• Sqoop exercises: http://hive.3du.me/