In the previous post we set up four Linux virtual machines. With these four machines we can now build a fully distributed Hadoop cluster.

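1. Virtual machine plan

The cluster consists of 4 nodes: 1 Master and 3 Slaves. The nodes are connected over a LAN and can ping one another. The node IP addresses are assigned as follows: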

Machine name    IP address
Master          219.244.84.92
Slave1          219.244.84.93
Slave2          219.244.84.94
Slave3          219.244.84.95

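All four nodes run CentOS 6.0, and each has the same non-root user, for example grid, so that passwordless SSH between the nodes can be set up for that account.

The Master machine takes the NameNode and JobTracker roles: it manages the distributed data and splits jobs into tasks and schedules them. The 3 Slave machines take the DataNode and TaskTracker roles: they store the distributed data and execute the tasks.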

2. Install the Oracle JDK

Installation method:

If the JDK is already installed, skip this step.
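To decide whether this step can be skipped, a quick check along the following lines may help. This is a minimal sketch: `jdk_present` is a hypothetical helper name, not part of the JDK or of Hadoop, and it only tests that a `java` executable exists, not which vendor or version it is.

```shell
# Sketch only: jdk_present is a hypothetical helper name.
# Succeeds if JAVA_HOME points at a directory with an executable bin/java,
# or if a java executable can be found on PATH.
jdk_present() {
    if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        return 0
    fi
    command -v java >/dev/null 2>&1
}
```

For example, `jdk_present && java -version` prints the version banner when a JDK is found, which also shows whether it is the Oracle JDK or another implementation.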

3. Configure the hosts file

This step is very important: list every host in the cluster together with its IP address.

<pre name="code" class="plain">[root@Master ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost
::1 localhost6.localdomain6 localhost6
219.244.84.92 Master
219.244.84.93 Slave1
219.244.84.94 Slave2
219.244.84.95 Slave3
</pre>

Do this step as root. When it is done, ping the other nodes by hostname from each node; every ping must succeed.
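The hosts file can also be checked mechanically before moving on. The following is a minimal sketch, assuming a POSIX shell with awk; `check_hosts` is a hypothetical helper name introduced here for illustration.

```shell
# Sketch only: check_hosts is a hypothetical helper name.
# Usage: check_hosts HOSTS_FILE NAME...
# Prints "missing: NAME" for every name without an entry and returns non-zero
# if any name is missing.
check_hosts() {
    hosts_file=$1
    shift
    missing=0
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the IP.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be `check_hosts /etc/hosts Master Slave1 Slave2 Slave3`, followed by `ping -c 1` against each hostname to confirm that the names also resolve to reachable machines.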

4. Configure SSH trust

Prerequisite: all 4 virtual machines are running and can ping one another.

Log in to each machine as the grid user and run the key generation command as shown below (this must be done on every node):

<pre name="code" class="plain">[grid@Master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/grid/.ssh/id_rsa):
Created directory '/home/grid/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/grid/.ssh/id_rsa.
Your public key has been saved in /home/grid/.ssh/id_rsa.pub.
The key fingerprint is:
ed:58:a8:00:ea:c1:9a:71:a5:4b:ea:f0:67:9d:39:17 grid@Master
</pre>
Collect the contents of the id_rsa.pub file generated on each node into a single authorized_keys file, then distribute that file to every node. The nodes can then connect to one another without a password.

<pre name="code" class="plain">[grid@Master ~]$ cd .ssh
[grid@Master .ssh]$ ls -l
total 8
-rw------- 1 grid grid 1671 Oct 9 22:43 id_rsa
-rw-r--r-- 1 grid grid 393 Oct 9 22:43 id_rsa.pub   # public key
[grid@Master .ssh]$ cp id_rsa.pub authorized_keys
[grid@Master .ssh]$ ssh Slave1 cat ~/.ssh/id_rsa.pub && ssh Slave2 cat ~/.ssh/id_rsa.pub   # print each node's public key
The authenticity of host 'slave1 (219.244.84.93)' can't be established.
RSA key fingerprint is c1:b8:84:4d:06:74:50:d9:97:c3:ff:10:ca:26:94:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,219.244.84.93' (RSA) to the list of known hosts.
grid@slave1's password:
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtZ9eSe3ZjWIcAesLyrXwjhwTnfTC6Fh+49kvCK4UbA6zy4ra4dT4hsu2KfqErIBgBDaEvPxrKnuGkFJpS7X48ums3U2cM54RaQ/ZGjHF+iNDuiu6t5Dn6Etfi03qiqwSFQKm/d2aJu1glK+aNGgYAAaRNrH9usx91PXnn3naqdlKvW9CKNzxlTF84C7pdqI+NOBPxJEtX0XWNdnF22T6RBEwEagv/oHqP3OsozGJXGpQMHT99qPs+R+Zj58VeAVzmuEW9LF/uGl0Vjdoc79uSThgSo3JYbiGJ3fsz/7i2LI4lWrq2azeFKEnBm5n6EvWCMgQNJZ17ANt4qtCWQkgKw== grid@Slave1
The authenticity of host 'slave2 (219.244.84.94)' can't be established.
RSA key fingerprint is c1:b8:84:4d:06:74:50:d9:97:c3:ff:10:ca:26:94:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,219.244.84.94' (RSA) to the list of known hosts.
grid@slave2's password:
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA0J2iEsi94oTZVMM7GMBUmv9Obfz65khrk7rRAfObnsGNEYiqwv5JCFZhkN3I4uL7be65vMd8XXpLCVOrwwY4LYaA8NcGLeEWq0bXXFE6V0xeHh/iBWECsopXkmKUEgX7euccXYH/GhFgQCvJ8WJREPUj3aRwfamPL8+V5Tj1USULY7k1/0lFUQzCs2DxjAfDdf+GN/ikXqjbUC5wkwzmJxxVfUGe1R/H9YKGRbgt7XoZ3AvJ7zJBshtm48sIS2MUWt0C0qJkQSJEu6NgQyFb0HGWUp9AkM8p0aGm4vftoA01xCcSUfJM06j2JHL+kxnEMy3V3g3VzxpVa/ER1eSsow== grid@Slave2
</pre>



Next, append the public keys of all the other nodes to the authorized_keys file (the key from Slave3 has to be fetched and added in the same way), then use remote copy to send authorized_keys to the other nodes:

<pre name="code" class="plain">[grid@Master .ssh]$ scp authorized_keys grid@Slave1:~/.ssh/
grid@slave1's password:
authorized_keys 100% 1185 1.2KB/s 00:00
[grid@Master .ssh]$ scp authorized_keys grid@Slave2:~/.ssh/
grid@slave2's password:
authorized_keys 100% 1185 1.2KB/s 00:00

[grid@Master .ssh]$ ssh Slave1   # ssh login test

[grid@Slave1 ~]$ ssh Slave2
</pre>

Copy the file to Slave3 in the same way, then confirm that every node can ssh to every other node without a password. This completes the SSH trust setup.
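The merge step described above (collecting every node's public key into one authorized_keys file) can be sketched as follows. This is a minimal sketch, not the exact procedure used in this post: `build_authorized_keys` is a hypothetical helper name, and the key files are assumed to have been fetched to local paths first.

```shell
# Sketch only: build_authorized_keys is a hypothetical helper name.
# Usage: build_authorized_keys OUT_FILE PUBKEY_FILE...
# Appends each public key to OUT_FILE unless an identical line is already
# there, then restricts the file to its owner.
build_authorized_keys() {
    out=$1
    shift
    touch "$out"
    for pub in "$@"; do
        while IFS= read -r key || [ -n "$key" ]; do
            [ -n "$key" ] || continue
            # -x: whole-line match, -F: fixed string, -q: no output
            grep -qxF -- "$key" "$out" || printf '%s\n' "$key" >> "$out"
        done < "$pub"
    done
    chmod 600 "$out"
}

# Possible use on Master as grid (each ssh still asks for a password here):
#   for h in Slave1 Slave2 Slave3; do ssh "$h" cat .ssh/id_rsa.pub > "/tmp/$h.pub"; done
#   build_authorized_keys ~/.ssh/authorized_keys ~/.ssh/id_rsa.pub /tmp/Slave*.pub
```

Skipping duplicate lines makes the helper safe to run again after a node regenerates its key or a new node is added.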

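5. Configure the Hadoop cluster

Put the Hadoop installation archive on one of the virtual machines and configure it with the steps below. When the configuration is finished, copy the whole configured hadoop directory to the other nodes with remote copy; the entire cluster is then configured.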

5.1 Extract the archive

<pre name="code" class="plain">[root@Master ~]# mv hadoop-0.20.2.tar.gz /home/grid
[root@Master ~]# su - grid
[grid@Master ~]$ ls
hadoop-0.20.2.tar.gz
[grid@Master ~]$ tar zxvf hadoop-0.20.2.tar.gz   # extracting is the installation
[grid@Master ~]$ ls
hadoop-0.20.2 hadoop-0.20.2.tar.gz
[grid@Master ~]$ cd hadoop-0.20.2
</pre>
5.2 Configure Hadoop (the entries shown for each file are the content to add)

core-site.xml
<pre name="code" class="plain">[grid@Master ~]$ cd hadoop-0.20.2
[grid@Master hadoop-0.20.2]$ cd conf
[grid@Master conf]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- create this directory manually -->
    <value>/home/grid/hadoop/tmp</value>
  </property>
</configuration>
</pre>

hdfs-site.xml
<pre name="code" class="plain">[grid@Master conf]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- number of replicas kept for each file block -->
    <value>3</value>
  </property>
</configuration>
</pre>

mapred-site.xml (the value is host:port, without an http:// prefix)
<pre name="code" class="plain">[grid@Master conf]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>Master:9001</value>
  </property>
</configuration>
</pre>

hadoop-env.sh
<pre name="code" class="plain">[grid@Master conf]$ vi hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.7.0_55
</pre>

masters
<pre name="code" class="plain">[grid@Master conf]$ vi masters
Master
</pre>

slaves
<pre name="code" class="plain">[grid@Master conf]$ vi slaves
Slave1
Slave2
Slave3
</pre>
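If the cluster has to be rebuilt more than once, the three site files can be written by a script instead of edited by hand. The following is a minimal sketch under that assumption: `write_site_conf` and its four arguments are hypothetical names, and the property names and port numbers are the ones used in this section.

```shell
# Sketch only: write_site_conf and its arguments are hypothetical names.
# Usage: write_site_conf CONF_DIR MASTER_HOST TMP_DIR REPLICATION
# Overwrites core-site.xml, hdfs-site.xml and mapred-site.xml in CONF_DIR.
write_site_conf() {
    conf_dir=$1      # e.g. ~/hadoop-0.20.2/conf
    master=$2        # host that runs NameNode and JobTracker
    tmp_dir=$3       # value for hadoop.tmp.dir
    replication=$4   # value for dfs.replication

    cat > "$conf_dir/core-site.xml" <<EOF
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
</configuration>
EOF

    cat > "$conf_dir/hdfs-site.xml" <<EOF
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>$replication</value>
  </property>
</configuration>
EOF

    cat > "$conf_dir/mapred-site.xml" <<EOF
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>$master:9001</value>
  </property>
</configuration>
EOF
}
```

With the values from this post the call would be `write_site_conf ~/hadoop-0.20.2/conf Master /home/grid/hadoop/tmp 3`. The helper replaces the existing files completely, so it suits a fresh installation rather than a configuration that has already been customized.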
5.3 Copy the hadoop directory to each node

<pre name="code" class="plain" style="font-size: 14px;">[grid@Master ~]$ scp -r hadoop-0.20.2 Slave1:~/
SerialUtils.hh 100% 4525 4.4KB/s 00:00
StringUtils.hh 100% 2441 2.4KB/s 00:00
[grid@Master ~]$ scp -r hadoop-0.20.2 Slave2:~/
[grid@Master ~]$ scp -r hadoop-0.20.2 Slave3:~/
</pre>

The configuration of the whole Hadoop cluster is now complete, and the cluster can be started.
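With more than a few slaves, the per-host scp commands can be driven from the conf/slaves file. This is a minimal sketch: `copy_to_slaves` and the `DRY_RUN` variable are hypothetical names introduced here, and it assumes the passwordless SSH setup from step 4 is already working.

```shell
# Sketch only: copy_to_slaves and DRY_RUN are hypothetical names.
# Usage: copy_to_slaves SRC_DIR SLAVES_FILE
# Copies SRC_DIR into the remote home directory of every host in SLAVES_FILE.
# With DRY_RUN=1 the scp commands are printed instead of executed.
copy_to_slaves() {
    src_dir=$1
    slaves_file=$2
    while IFS= read -r host || [ -n "$host" ]; do
        # Ignore blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp -r $src_dir $host:"
        else
            # "host:" with no path means the remote user's home directory.
            # stdin is redirected so scp cannot consume the host list.
            scp -r "$src_dir" "$host:" < /dev/null
        fi
    done < "$slaves_file"
}
```

Running it once with `DRY_RUN=1` from the grid home directory, for example `DRY_RUN=1 copy_to_slaves hadoop-0.20.2 hadoop-0.20.2/conf/slaves`, shows the commands that a real run would execute.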





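6. Start the Hadoop cluster

Before starting the cluster for the first time, format the distributed file system. This is not needed on later starts.

<pre name="code" class="html">[grid@Master ~]$ cd hadoop-0.20.2
[grid@Master hadoop-0.20.2]$ bin/hadoop namenode -format
</pre>

After formatting succeeds, start the cluster:

<pre name="code" class="html">[grid@Master hadoop-0.20.2]$ bin/start-all.sh
</pre>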
Check that the daemons have started.

On the Master node:

<pre name="code" class="html">[grid@Master conf]$ jps
19648 Jps
5736 NameNode
5952 JobTracker
5888 SecondaryNameNode
</pre>

On the Slave1 node:

<pre name="code" class="html">[grid@Slave1 ~]$ jps
10732 Jps
5811 TaskTracker
5716 DataNode
</pre>
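The jps check can be automated so that a missing daemon is reported by name. The following is a minimal sketch: `check_daemons` is a hypothetical helper name, and it relies only on jps printing one "pid name" pair per line, as in the listings above.

```shell
# Sketch only: check_daemons is a hypothetical helper name.
# Usage: jps | check_daemons NAME...
# Reads jps output from stdin, prints "not running: NAME" for each expected
# daemon that is absent, and returns non-zero if any is absent.
check_daemons() {
    listing=$(cat)
    status=0
    for daemon in "$@"; do
        # Match the second column exactly so NameNode does not also
        # match SecondaryNameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "not running: $daemon"
            status=1
        fi
    done
    return "$status"
}
```

On Master the call would be `jps | check_daemons NameNode SecondaryNameNode JobTracker`, and on each slave `jps | check_daemons DataNode TaskTracker`.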

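At this point the fully distributed Hadoop installation is complete. If something goes wrong, read the log files carefully (by default they are written to the logs directory inside the Hadoop installation).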







