Hadoop 2.2.0 (multi-node) Installation on Ubuntu
I. Preface
II. Installation Environment
III. Installation Steps
    1. Environment Overview
    2. Settings
    3. Map the IPs and Hostnames of the Three Machines
    4. Set Up Passwordless SSH Login from cloud001 to cloud002 and cloud003
    5. Install the JDK
    6. Disable the Firewall
    7. Install Hadoop 2.2
    8. Start Hadoop 2.2
V. References
II. Installation Environment
CPU            Intel Core i7-4470 3.40GHz
RAM            8 GB * 2
HD             128 SSD + 1TB HD
Network        100M/1000M bps Ethernet
OS             Windows 7_64-bit
VM Platform    VMware® Workstation 10.0.0 build-1295980
VM Guest OS    ubuntu-12.04.3-desktop-amd64
VM RAM         2.0 GB
VM HD          40 GB
III. Installation Steps
1. Environment Overview
Here we build a cluster made up of three machines.
Hostname   User/Password   Cluster role                                     OS
cloud001   hduser/adm123   NameNode, Secondary NameNode, ResourceManager    ubuntu-12.04.3 64-bit
cloud002   hduser/adm123   DataNode, NodeManager                            ubuntu-12.04.3 64-bit
cloud003   hduser/adm123   DataNode, NodeManager                            ubuntu-12.04.3 64-bit
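2. Settings
(1) Change the hostname to cloud001:
vim /etc/hostname
(2) Give hduser sudo privileges:
vim /etc/sudoers
(3) Upgrade the system to the latest packages:
sudo apt-get update
sudo apt-get upgrade
In practice, finish setting up cloud001 first, clone it to cloud002 and cloud003, and then change the hostname on each clone.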
3. Map the IPs and Hostnames of the Three Machines
Add one "IP hostname" line for each of the three machines to /etc/hosts:
hduser@cloud001:~$ vim /etc/hosts
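The file contents are not reproduced in the text, so the following is only a minimal sketch of what the three mappings might look like. The 192.168.1.x addresses and the HOSTS_SNIPPET variable are assumptions made for illustration; substitute the real IP of each VM. The sketch writes to a scratch file so it can be reviewed before anything is appended to /etc/hosts.

```shell
#!/bin/sh
# Sketch: build the three hostname mappings in a scratch file first.
# The 192.168.1.x addresses are placeholders (assumptions), not values
# taken from this guide; replace them with the real IP of each VM.
HOSTS_SNIPPET="$(mktemp)"

cat > "$HOSTS_SNIPPET" <<'EOF'
192.168.1.101 cloud001
192.168.1.102 cloud002
192.168.1.103 cloud003
EOF

# Review the result, then append it to /etc/hosts on every node, e.g.:
#   sudo sh -c "cat '$HOSTS_SNIPPET' >> /etc/hosts"
cat "$HOSTS_SNIPPET"
```

The same three lines need to be present on all three machines so that each node can resolve the others by name.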
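4. Set Up Passwordless SSH Login from cloud001 to cloud002 and cloud003
(1) Install SSH:
sudo apt-get install ssh
(2) Set up passwordless login locally. In the login (home) directory, create the .ssh directory and enter it:
hduser@ubuntu:~$ mkdir .ssh
hduser@ubuntu:~$ cd .ssh
Generate the key pair (just press Enter at every prompt):
hduser@ubuntu:~/.ssh$ ssh-keygen -t rsa
Append id_rsa.pub to the authorized keys:
hduser@ubuntu:~/.ssh$ cat id_rsa.pub >> authorized_keys
Restart the SSH service:
hduser@ubuntu:~/.ssh$ service ssh restart
Test by logging in over SSH; no password prompt should appear.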
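The steps above only cover the local machine. If cloud002 and cloud003 are cloned from cloud001 after this step, they already carry the same key pair and authorized_keys file. If the workers are built separately, the public key still has to reach them. The following is a hedged sketch of one way to do that with ssh-copy-id; the DRY_RUN switch and the helper names run and push_keys are made up for this example. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) the commands that copy cloud001's public key to the
# worker nodes. DRY_RUN, run and push_keys are hypothetical names introduced
# for illustration only.
DRY_RUN="${DRY_RUN:-1}"
WORKERS="cloud002 cloud003"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # only show the command
    else
        "$@"               # actually execute it
    fi
}

push_keys() {
    for host in $WORKERS; do
        # ssh-copy-id appends the public key to the remote authorized_keys
        run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "hduser@$host"
        # afterwards this should print the remote hostname with no password prompt
        run ssh "hduser@$host" hostname
    done
}

push_keys
```

Running it with DRY_RUN=0 on cloud001 executes the commands instead of printing them; ssh-copy-id will ask for the hduser password once per worker.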
5. Install the JDK
Download jdk-7u45-linux-x64.tar.gz, copy it to /usr/lib/jvm, and run chmod:
hduser@ubuntu:/usr/lib/jvm$ chmod 755 jdk-7u45-linux-x64.tar.gz
Extract it:
hduser@ubuntu:/usr/lib/jvm$ sudo tar zxvf ./jdk-7u45-linux-x64.tar.gz -C /usr/lib/jvm
Set the environment variables:
hduser@ubuntu:/usr/lib/jvm$ vim ~/.bashrc
Append the following at the end of the file:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Run the following command to make the change take effect:
hduser@ubuntu:/usr/lib/jvm$ source ~/.bashrc
Test:
hduser@ubuntu:/usr/lib/jvm$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
6. Disable the Firewall
hduser@ubuntu:/usr/lib/jvm$ sudo ufw disable
Firewall stopped and disabled on system startup
Reboot for the change to take effect.
7. Install Hadoop 2.2
(1) Download hadoop-2.2.0.tar.gz and extract it under /home/hduser:
hduser@ubuntu:~$ chmod 755 hadoop-2.2.0.tar.gz
hduser@ubuntu:~$ tar zxvf hadoop-2.2.0.tar.gz
(2) Configure Hadoop
Before configuring, create the following directories on cloud001:
/home/hduser/dfs/name
/home/hduser/dfs/data
/home/hduser/temp
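The three directories can be created in one go. This is a small sketch, not a command from the original guide; make_hadoop_dirs and the HADOOP_BASE override are hypothetical names, and the base defaults to the current user's home directory (/home/hduser in this guide).

```shell
#!/bin/sh
# Sketch: create the directories that hdfs-site.xml and core-site.xml point at.
# make_hadoop_dirs and HADOOP_BASE are names invented for this example.
make_hadoop_dirs() {
    base="$1"
    # -p creates missing parents and does not fail if a directory exists
    mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/temp"
}

# Defaults to the home directory of the user running the script.
make_hadoop_dirs "${HADOOP_BASE:-$HOME}"
```

Because cloud002 and cloud003 are cloned from cloud001 later, the directories only need to be created once before cloning.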
Edit the following configuration files:
~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
~/hadoop-2.2.0/etc/hadoop/slaves
~/hadoop-2.2.0/etc/hadoop/core-site.xml
~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
~/hadoop-2.2.0/etc/hadoop/mapred-site.xml (does not exist by default; rename mapred-site.xml.template)
~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
Edit hadoop-env.sh: set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)
Edit yarn-env.sh: set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)
Edit slaves (this file lists every slave node, one hostname per line).
Write the following content:
cloud002
cloud003
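As an alternative to editing the file by hand, the slaves file can be generated. The sketch below assumes nothing beyond the two hostnames above; write_slaves and SCRATCH_CONF are names made up for this example, and the demonstration runs against a scratch directory rather than the real configuration directory.

```shell
#!/bin/sh
# Sketch: write one worker hostname per line into a slaves file.
# write_slaves is a helper name invented for this example.
write_slaves() {
    conf_dir="$1"; shift
    # printf repeats the format for every remaining argument
    printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Try it against a scratch directory first ...
SCRATCH_CONF="$(mktemp -d)"
write_slaves "$SCRATCH_CONF" cloud002 cloud003
cat "$SCRATCH_CONF/slaves"

# ... then, for the real cluster:
#   write_slaves "$HOME/hadoop-2.2.0/etc/hadoop" cloud002 cloud003
```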
Edit core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cloud001:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hduser/temp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>
Edit hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>cloud001:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
Edit mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cloud001:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cloud001:19888</value>
</property>
</configuration>
Edit yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cloud001:8040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cloud001:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cloud001:8025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cloud001:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cloud001:8088</value>
</property>
</configuration>
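After editing four XML files by hand it is easy to leave a typo in a property name or value. The sketch below is one possible sanity check and not part of the original procedure: get_prop is a made-up helper that assumes each <name> and <value> sits on a single line, as in the listings above, and SAMPLE_XML is a scratch file used only for the demonstration.

```shell
#!/bin/sh
# Sketch: look up the <value> that follows a given <name> in a Hadoop
# *-site.xml file. get_prop is a helper name invented for this example and
# assumes each <name> and <value> element fits on one line.
get_prop() {
    # $1 = XML file, $2 = property name
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Demonstration on a scratch copy of two core-site.xml properties.
SAMPLE_XML="$(mktemp)"
cat > "$SAMPLE_XML" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cloud001:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
EOF

get_prop "$SAMPLE_XML" fs.defaultFS

# On the real cluster, for example:
#   get_prop ~/hadoop-2.2.0/etc/hadoop/yarn-site.xml yarn.resourcemanager.address
```

Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more authoritative check.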
Set the environment variables:
hduser@cloud001:~$ vim ~/.bashrc
Paste the following at the end of the file:
export HADOOP_HOME=/home/hduser/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
(3) Clone the cloud001 image to cloud002 and cloud003, then change the hostname on each clone.
8. Start Hadoop 2.2
(1) Go to the installation directory and format the NameNode:
cd ~/hadoop-2.2.0/
./bin/hdfs namenode -format
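(2) Start HDFS:
./sbin/start-dfs.sh
At this point the processes running on cloud001 are: NameNode, SecondaryNameNode
The processes running on cloud002 and cloud003 are: DataNode
(3) Start YARN:
./sbin/start-yarn.sh
At this point the processes running on cloud001 are: NameNode, SecondaryNameNode, ResourceManager
The processes running on cloud002 and cloud003 are: DataNode, NodeManager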
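The process lists above can be checked with the JDK's jps tool, which prints one "<pid> <ClassName>" line per running Java process. The following sketch compares that output against the expected daemon names; check_daemons is a helper name made up for this example, and the jps call is skipped when jps is not on the PATH.

```shell
#!/bin/sh
# check_daemons: read `jps` output on stdin and report which of the expected
# daemon names (given as arguments) are missing. The helper name is invented
# for this example.
check_daemons() {
    running="$(awk '{ print $2 }')"     # second column is the class name
    missing=""
    for d in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        echo "$running" | grep -qx "$d" || missing="$missing $d"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all expected daemons are running"
}

# On cloud001 (skipped automatically when jps is not on the PATH):
if command -v jps >/dev/null 2>&1; then
    jps | check_daemons NameNode SecondaryNameNode ResourceManager
fi

# On cloud002 / cloud003:
#   jps | check_daemons DataNode NodeManager
```

If a daemon is reported missing, the log files under ~/hadoop-2.2.0/logs on that node are the first place to look.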
(4) Check the cluster status:
./bin/hdfs dfsadmin -report
(5) Check how files are laid out in blocks:
./bin/hdfs fsck / -files -blocks
V. References
1. http://blog.csdn.net/licongcong_0224/article/details/12972889
2. http://blog.csdn.net/focusheart/article/details/14005893 (single-node version)
3. http://dawndiy.com/archives/155/ (installing and configuring JDK 7 on Linux)
4. http://www.ithome.com.tw/itadm/article.php?c=73978&s=1 (introduction to Hadoop)
5. http://www.runpc.com.tw/content/cloud_content.aspx?id=105318 (introduction to Hadoop)