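The previous post set up four Linux virtual machines; these four VMs are enough to build a fully distributed Hadoop cluster.
1. Virtual machine planning
The cluster has four nodes: one Master and three Slaves. The nodes are connected over a LAN and can ping one another. Their IP addresses are as follows: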
Machine name | IP address |
Master | 219.244.84.92 |
Slave1 | 219.244.84.93 |
Slave2 | 219.244.84.94 |
Slave3 | 219.244.84.95 |
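All four nodes run CentOS 6.0, and each has an identical non-root user account, for example grid (so that passwordless SSH can be configured between the nodes).
The Master machine takes the NameNode and JobTracker roles: it manages the distributed data and splits jobs into tasks. The three Slave machines take the DataNode and TaskTracker roles: they store the distributed data and execute the tasks.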
2. Install the Oracle JDK
Installation method:
If a JDK is already installed, skip this step.
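One way to decide whether this step can be skipped is to look for an executable `bin/java` under the expected JDK directory. The following is a minimal sketch: the helper name `jdk_present` and the `JDK_DIR` variable are illustrative, and the default path is simply the `JAVA_HOME` this article uses later in hadoop-env.sh.

```shell
# Hypothetical helper: succeeds when DIR looks like a JDK install,
# i.e. it contains an executable bin/java.
jdk_present() {
    [ -x "$1/bin/java" ]
}

# /usr/java/jdk1.7.0_55 is the JAVA_HOME set later in hadoop-env.sh;
# adjust JDK_DIR to wherever your JDK actually lives.
JDK_DIR="${JDK_DIR:-/usr/java/jdk1.7.0_55}"

if jdk_present "$JDK_DIR"; then
    echo "JDK found in $JDK_DIR, this step can be skipped"
else
    echo "no JDK in $JDK_DIR, install one first"
fi
```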
3. Configure the hosts file
This step is essential: /etc/hosts must list every host in the cluster together with its IP address.
[root@Master ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost
::1 localhost6.localdomain6 localhost6
219.244.84.92 Master
219.244.84.93 Slave1
219.244.84.94 Slave2
219.244.84.95 Slave3

Do this as root. Once the file is configured, ping the other nodes by host name from each node; every ping must succeed.
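Editing four hosts files by hand invites typos, so the entries can also be added with a small script. This is only a sketch under stated assumptions: `HOSTS_FILE`, `add_host` and `check_hosts` are illustrative names, the addresses are the ones from the planning table in section 1, and by default the script writes to a scratch file rather than to /etc/hosts.

```shell
# Work on a scratch file by default; set HOSTS_FILE=/etc/hosts (as root)
# once the result looks right. HOSTS_FILE and add_host are illustrative names.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Append "IP NAME" unless NAME already has an entry, so reruns are harmless.
add_host() {
    if ! grep -qE "[[:space:]]$2([[:space:]]|\$)" "$HOSTS_FILE"; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host 219.244.84.92 Master
add_host 219.244.84.93 Slave1
add_host 219.244.84.94 Slave2
add_host 219.244.84.95 Slave3

# Call this on every node after the real /etc/hosts has been updated:
# it sends one ping per host name and reports any name that does not answer.
check_hosts() {
    for h in Master Slave1 Slave2 Slave3; do
        ping -c 1 "$h" > /dev/null 2>&1 || echo "cannot reach $h"
    done
}
```

Because `add_host` skips names that are already present, the script can be rerun on a node without duplicating lines.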
4. Configure SSH mutual trust
All four virtual machines must already be running and able to ping one another.
Log in to each machine as the grid user and generate a key pair as follows (run this on every node):
[grid@Master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/grid/.ssh/id_rsa):
Created directory '/home/grid/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/grid/.ssh/id_rsa.
Your public key has been saved in /home/grid/.ssh/id_rsa.pub.
The key fingerprint is:
ed:58:a8:00:ea:c1:9a:71:a5:4b:ea:f0:67:9d:39:17 grid@Master
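Since the same prompts have to be answered on all four nodes, the key can also be generated without interaction. The sketch below assumes an empty passphrase and the default key path, as in the transcript above; `make_key` and `KEY_FILE` are illustrative names, not part of OpenSSH.

```shell
# Hypothetical helper: create an RSA key pair at the given path without
# prompting. An existing key is left alone and never overwritten.
make_key() {
    if [ -f "$1" ]; then
        echo "key already exists: $1"
        return 0
    fi
    mkdir -p "$(dirname "$1")"
    chmod 700 "$(dirname "$1")"
    # -q: quiet, -N '': empty passphrase, -f: output file (so nothing is asked)
    ssh-keygen -q -t rsa -N '' -f "$1"
}

# Default path as in the transcript above; run as grid on every node.
make_key "${KEY_FILE:-$HOME/.ssh/id_rsa}"
```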
Copy the contents of the id_rsa.pub generated on each node into a single authorized_keys file, then distribute that file to every node; the nodes can then connect to one another without a password.
[grid@Master ~]$ cd .ssh
[grid@Master .ssh]$ ls -l
total 8
-rw------- 1 grid grid 1671 Oct 9 22:43 id_rsa
-rw-r--r-- 1 grid grid 393 Oct 9 22:43 id_rsa.pub    # public key
[grid@Master .ssh]$ cp id_rsa.pub authorized_keys
[grid@Master .ssh]$ ssh Slave1 cat ~/.ssh/id_rsa.pub && ssh Slave2 cat ~/.ssh/id_rsa.pub    # print each node's public key (repeat for Slave3)
The authenticity of host 'slave1 (219.244.84.93)' can't be established.
RSA key fingerprint is c1:b8:84:4d:06:74:50:d9:97:c3:ff:10:ca:26:94:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,219.244.84.93' (RSA) to the list of known hosts.
grid@slave1's password:
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtZ9eSe3ZjWIcAesLyrXwjhwTnfTC6Fh+49kvCK4UbA6zy4ra4dT4hsu2KfqErIBgBDaEvPxrKnuGkFJpS7X48ums3U2cM54RaQ/ZGjHF+iNDuiu6t5Dn6Etfi03qiqwSFQKm/d2aJu1glK+aNGgYAAaRNrH9usx91PXnn3naqdlKvW9CKNzxlTF84C7pdqI+NOBPxJEtX0XWNdnF22T6RBEwEagv/oHqP3OsozGJXGpQMHT99qPs+R+Zj58VeAVzmuEW9LF/uGl0Vjdoc79uSThgSo3JYbiGJ3fsz/7i2LI4lWrq2azeFKEnBm5n6EvWCMgQNJZ17ANt4qtCWQkgKw== grid@Slave1
The authenticity of host 'slave2 (219.244.84.94)' can't be established.
RSA key fingerprint is c1:b8:84:4d:06:74:50:d9:97:c3:ff:10:ca:26:94:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,219.244.84.94' (RSA) to the list of known hosts.
grid@slave2's password:
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA0J2iEsi94oTZVMM7GMBUmv9Obfz65khrk7rRAfObnsGNEYiqwv5JCFZhkN3I4uL7be65vMd8XXpLCVOrwwY4LYaA8NcGLeEWq0bXXFE6V0xeHh/iBWECsopXkmKUEgX7euccXYH/GhFgQCvJ8WJREPUj3aRwfamPL8+V5Tj1USULY7k1/0lFUQzCs2DxjAfDdf+GN/ikXqjbUC5wkwzmJxxVfUGe1R/H9YKGRbgt7XoZ3AvJ7zJBshtm48sIS2MUWt0C0qJkQSJEu6NgQyFb0HGWUp9AkM8p0aGm4vftoA01xCcSUfJM06j2JHL+kxnEMy3V3g3VzxpVa/ER1eSsow== grid@Slave2
Next, append the public keys of all the other nodes to the authorized_keys file on Master, then use a remote copy to send authorized_keys to the other nodes:
[grid@Master .ssh]$ scp authorized_keys grid@Slave1:~/.ssh/
grid@slave1's password:
authorized_keys 100% 1185 1.2KB/s 00:00
[grid@Master .ssh]$ scp authorized_keys grid@Slave2:~/.ssh/
grid@slave2's password:
authorized_keys 100% 1185 1.2KB/s 00:00
[grid@Master .ssh]$ ssh Slave1    # ssh login test
[grid@Slave1 ~]$ ssh Slave2
Copy the file to Slave3 in the same way, then confirm that every node can ssh to every other node without a password. This completes the SSH trust setup.
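The transcript above only prints each public key; pasting the keys into authorized_keys is left as a manual step. A possible way to script the merge is sketched below. `merge_keys` is a hypothetical helper, and the commented commands assume the grid account and the Slave1 to Slave3 host names used in this article.

```shell
# Hypothetical helper: merge public-key files into one authorized_keys file,
# dropping duplicate lines and restricting the file to its owner.
merge_keys() {
    target="$1"
    shift
    touch "$target"
    cat "$target" "$@" | sort -u > "$target.tmp"
    mv "$target.tmp" "$target"
    chmod 600 "$target"
}

# Intended use on Master, as grid, once every node has a key pair
# (each ssh/scp still asks for a password until the file is in place):
#   for h in Slave1 Slave2 Slave3; do ssh "$h" cat .ssh/id_rsa.pub > "/tmp/$h.pub"; done
#   merge_keys ~/.ssh/authorized_keys ~/.ssh/id_rsa.pub /tmp/Slave1.pub /tmp/Slave2.pub /tmp/Slave3.pub
#   for h in Slave1 Slave2 Slave3; do scp ~/.ssh/authorized_keys "$h":.ssh/; done
```

Using `sort -u` keeps the file free of repeated keys if the merge is run more than once; the order of lines in authorized_keys does not matter to sshd.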
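5. Configure the Hadoop cluster
Place the Hadoop installation archive on one of the virtual machines (Master in this walkthrough) and configure it with the steps below. When that is done, copy the whole configured hadoop directory to the other nodes with a remote copy; the cluster is then fully configured.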
5.1 Extract the archive
[root@Master ~]# mv hadoop-0.20.2.tar.gz /home/grid
[root@Master ~]# su - grid
[grid@Master ~]$ ls
hadoop-0.20.2.tar.gz
[grid@Master ~]$ tar zxvf hadoop-0.20.2.tar.gz    # extracting is the whole installation
[grid@Master ~]$ ls
hadoop-0.20.2 hadoop-0.20.2.tar.gz
[grid@Master ~]$ cd hadoop-0.20.2
5.2 Configure Hadoop (in each file, the property entries are the content to add)
[grid@Master ~]$ cd hadoop-0.20.2
[grid@Master hadoop-0.20.2]$ cd conf

core-site.xml
[grid@Master conf]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/grid/hadoop/tmp</value>  <!-- create this directory manually -->
  </property>
</configuration>

hdfs-site.xml
[grid@Master conf]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>  <!-- number of replicas kept for each file block -->
  </property>
</configuration>

mapred-site.xml
[grid@Master conf]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>Master:9001</value>
  </property>
</configuration>

hadoop-env.sh
[grid@Master conf]$ vi hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.7.0_55

masters (in Hadoop 0.20 this file names the host that runs the SecondaryNameNode)
[grid@Master conf]$ vi masters
Master

slaves
[grid@Master conf]$ vi slaves
Slave1
Slave2
Slave3
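The edits above can also be written from a script, which additionally takes care of creating the hadoop.tmp.dir directory. The sketch below covers core-site.xml, masters and slaves only. `BASE` is an illustrative variable that stands in for /home/grid; it defaults to a scratch directory so that the script can be tried without touching a real installation.

```shell
# BASE stands in for /home/grid; it defaults to a scratch directory.
# Set BASE=/home/grid to write into the layout used in this article.
BASE="${BASE:-$(mktemp -d)}"
CONF_DIR="$BASE/hadoop-0.20.2/conf"
TMP_DIR="$BASE/hadoop/tmp"        # the value of hadoop.tmp.dir

# Create the conf directory (already present in a real install) and the
# temporary directory that the article creates by hand.
mkdir -p "$CONF_DIR" "$TMP_DIR"

# core-site.xml: NameNode address and temporary directory.
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$TMP_DIR</value>
  </property>
</configuration>
EOF

# masters and slaves: one host name per line.
echo Master > "$CONF_DIR/masters"
printf '%s\n' Slave1 Slave2 Slave3 > "$CONF_DIR/slaves"
```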
5.3 Copy the hadoop directory to each node
[grid@Master ~]$ scp -r hadoop-0.20.2 Slave1:~/
SerialUtils.hh 100% 4525 4.4KB/s 00:00
StringUtils.hh 100% 2441 2.4KB/s 00:00
[grid@Master ~]$ scp -r hadoop-0.20.2 Slave2:~/
[grid@Master ~]$ scp -r hadoop-0.20.2 Slave3:~/
The configuration of the whole Hadoop cluster is now complete, and the cluster can be started.
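With more than a handful of slaves, typing one scp per host gets tedious. As a sketch, the copy can be driven by the conf/slaves file instead. `push_hadoop`, `DRY_RUN` and `HADOOP_DIR` are illustrative names; with `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
# With DRY_RUN=1 the scp commands are printed instead of being run.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: copy the Hadoop directory to the home directory of
# every host listed in its conf/slaves file (one host name per line).
push_hadoop() {
    src="$1"
    while read -r host; do
        [ -n "$host" ] || continue        # skip blank lines
        if [ "$DRY_RUN" = 1 ]; then
            echo "scp -r $src $host:"
        else
            # stdin is redirected so scp cannot consume the host list
            scp -r "$src" "$host:" < /dev/null
        fi
    done < "$src/conf/slaves"
}

# Assumed location of the configured install, as in this article.
HADOOP_DIR="${HADOOP_DIR:-$HOME/hadoop-0.20.2}"
if [ -f "$HADOOP_DIR/conf/slaves" ]; then
    push_hadoop "$HADOOP_DIR"
fi
```

A destination of `host:` with nothing after the colon means the remote user's home directory, which matches the `Slave1:~/` form used above.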
6. Start the Hadoop cluster
Before the cluster is started for the first time, the distributed file system must be formatted; this is not needed on later starts.
[grid@Master ~]$ cd hadoop-0.20.2
[grid@Master hadoop-0.20.2]$ bin/hadoop namenode -format
Once the format succeeds, start the cluster:
[grid@Master hadoop-0.20.2]$ bin/start-all.sh
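Formatting again on a cluster that already holds data discards the existing HDFS metadata, so it can be worth checking first whether the NameNode directory has been formatted. The sketch below relies on two assumptions: dfs.name.dir is left at its default of ${hadoop.tmp.dir}/dfs/name, and a formatted directory contains current/VERSION. `namenode_formatted` and `NAME_DIR` are illustrative names.

```shell
# Hypothetical guard: succeeds when the given NameNode directory already
# holds a formatted filesystem (the format step writes current/VERSION).
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Assumption: dfs.name.dir keeps its default, ${hadoop.tmp.dir}/dfs/name,
# with hadoop.tmp.dir=/home/grid/hadoop/tmp as set in core-site.xml.
NAME_DIR="${NAME_DIR:-/home/grid/hadoop/tmp/dfs/name}"

if namenode_formatted "$NAME_DIR"; then
    echo "already formatted, skipping: $NAME_DIR"
else
    echo "not formatted yet, run: bin/hadoop namenode -format"
fi
```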
Check that the daemons have started.
On the Master node:
[grid@Master conf]$ jps
19648 Jps
5736 NameNode
5952 JobTracker
5888 SecondaryNameNode
On the Slave1 node:
[grid@Slave1 ~]$ jps
10732 Jps
5811 TaskTracker
5716 DataNode
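Comparing jps listings by eye works for four nodes; a small filter makes the same check repeatable. In this sketch `missing_daemons` is a hypothetical helper that reads jps output on standard input and prints any expected daemon that is absent. The sample input is the Master listing shown above.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    running=$(awk '{print $2}')           # second column is the class name
    for d in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
    done
}

# Sample input: the Master listing from this article (prints nothing,
# because all three daemons are present).
printf '%s\n' '19648 Jps' '5736 NameNode' '5952 JobTracker' '5888 SecondaryNameNode' |
    missing_daemons NameNode SecondaryNameNode JobTracker

# On a live slave node: jps | missing_daemons DataNode TaskTracker
```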
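At this point the fully distributed Hadoop installation is complete. If something goes wrong, read the log files carefully; by default they are in the logs directory under hadoop-0.20.2.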
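When a daemon fails to start, a quick first pass is to find out which log files contain ERROR or FATAL entries. The following is a sketch only: `logs_with_errors` and `LOG_DIR` are illustrative names, and the default path assumes the install location used in this article.

```shell
# Hypothetical helper: list the *.log files under the given directory that
# contain ERROR or FATAL entries, so you know which log to read first.
logs_with_errors() {
    [ -d "$1" ] || { echo "no such log directory: $1" >&2; return 2; }
    grep -lE 'ERROR|FATAL' "$1"/*.log 2>/dev/null
}

# Assumed location: the logs directory of the install from this article.
LOG_DIR="${LOG_DIR:-/home/grid/hadoop-0.20.2/logs}"
logs_with_errors "$LOG_DIR"
```

Run it on the node whose daemon is missing from the jps listing, then open the reported files to read the full error.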