System requirements: 64-bit CentOS.

Contents:
1. Software preparation
2. Server preparation
3. Server environment preparation (run on every server)
   1) Sync the time on every server
   2) Stop iptables and disable SELinux on every server
   3) Edit /etc/hosts on every server
   4) Configure the Java environment on every server (jdk-1.8.0_45 is used here; JDK 7 or later is required)
   5) Configure the Hadoop environment variables on every server (Hadoop 2.7.1 is used here; Hadoop does not need to be installed yet)
4. Configure passwordless SSH from the NameNode to the other nodes
5. Configure Hadoop (on the NameNode server)
   1) Install Hadoop
   2) Configure Hadoop
6. Copy Hadoop to the other nodes
7. Format HDFS (run on the NameNode server)
8. Start Hadoop (run on the NameNode server)
9. Verify
1. Software preparation
a) hadoop-2.7.1_3jia5.tar.gz
b) jdk-8u45-linux-x64.rpm
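Optionally, recording checksums of the two archives makes it easy to confirm that every node ends up with identical files; a small sketch (compare the output against the checksums published for your actual downloads):
sha256sum hadoop-2.7.1_3jia5.tar.gz
sha256sum jdk-8u45-linux-x64.rpm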
2. Server preparation
a) Prepare 4 servers:
Role        IP          Hostname
NameNode    10.0.2.11   hdfs1
DataNode    10.0.2.12   hdfs2
DataNode    10.0.2.13   hdfs3
DataNode    10.0.2.14   hdfs4
3. Server environment preparation (run on every server)
1) Sync the time on every server:
/usr/sbin/ntpdate pool.ntp.org
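To keep the clocks from drifting after this one-off sync, a cron entry can rerun ntpdate periodically; a minimal sketch, assuming an hourly interval is acceptable:
(crontab -l 2>/dev/null; echo '0 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1') | crontab -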
2) Stop iptables and disable SELinux on every server:
service iptables stop
chkconfig iptables off
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
setenforce 0
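A quick way to confirm that both changes took effect (CentOS 6 style commands, matching the ones above):
service iptables status    (should report that the firewall is not running)
getenforce    (should print Permissive now, and Disabled after a reboot)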
3) Edit /etc/hosts on every server:
10.0.2.11 hdfs1
10.0.2.12 hdfs2
10.0.2.13 hdfs3
10.0.2.14 hdfs4

4) Configure the Java environment on every server:
# rpm -ivh jdk-8u45-linux-x64.rpm
# vim /etc/profile.d/java.sh
JAVA_HOME=/usr/java/jdk1.8.0_45
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME PATH
Run: source /etc/profile.d/java.sh
Run: java -version and check that the reported version is the newly installed one.

5) Configure the Hadoop environment variables on every server (Hadoop 2.7.1 is used here; Hadoop does not need to be installed yet):
vim /etc/profile.d/hadoop.sh
HADOOP_HOME=/usr/local/hadoop
PATH=$HADOOP_HOME/bin:$PATH
PATH=$HADOOP_HOME/sbin:$PATH
export HADOOP_HOME PATH
Run: source /etc/profile.d/hadoop.sh

4. Configure passwordless SSH from the NameNode to the other nodes
# ssh-keygen -t rsa   (press Enter at every prompt; do not set a passphrase)
# cd /root/.ssh
Private key file: id_rsa
Public key file: id_rsa.pub
Append the id_rsa.pub contents from all four machines (including the local one) to the authorized_keys file on each machine, so that the file holds all four keys:
[root@hdfs1 ~]# cat .ssh/authorized_keys
##Begin hdfs1
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArCBSUy5B7dO8T1ay34z6Nu6QUF5d3pT15EJwhdsJAIZjzuDDHq7oPwKVXSJySQYd+A6g9T1kC66G42ymqchJE8xp2W9tE3NIQKRS/hc+X5YgY4gcPm7oMym1o/k8tqs3uRNAlVhSDEgQnu0Rl4ZkwFOWRoPTTyIweXcYuG2j/UkbyDfNSEspvs3fCoVy0zeqj6FeAkzpX2WDZQ== root@hdfs1
##end hdfs1
##Begin hdfs2
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA5B6ZIu2of83lNGQYmuV8C/1x0/8+cvzsZwqqmKlwXU5IbNUD2PrHju10wI9SPgBDdHvCHGiVWwGeaOt17GsTvGx1NKl5rgfBm98leGPgVhinkiH+uzhuR7Q/r9Y/Si/uqrzpPz+he/MVFYzGsfdckdMrOgu2+KldoWtDHQwPz+mBxe5J8ktHaOY5gDZFhUasne3aVf+Z+bNcEw== root@hdfs2
##end hdfs2
##Begin hdfs3
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtaPRIgqZG3bi7UTsialWUhtqXxIbau68iGHMqm8wi7Q1ZKuJKXjoJAwXynZ9LY+sV38wxw8bm7YSd6Xgq964KTF2wriL8P1rpGrlAv5N0We4XYp5hp+T6HZeyzWicHXh/t+x/5wEOwFinNEvPkUv4nAboNxyrTWXZwpuSstZogqvWGwySH0HfeliiOwXQWi6Pl0AI8uZlIudVw== root@hdfs3
##end hdfs3
##Begin hdfs4
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAuTJgx8m2Sm35ef3BESyZhMG5hdU+g+bs17BK77Lvw4HDy2vYRMtXxQAcraWrU+xvYJIU4yTjM+g3dUpjYVaSUbMyfqxjO0H2x3reUlp5Jgi0h8K1OwJ6nUsUB93TT6yGVncHLYzrp1qkHyxYFEVmlvDO7Av1fT1YG/rQ6P7zPYnDi22FFl8PJ14N6FmV3hpr6mz7hZsQy51qzw== root@hdfs4
##end hdfs4
Test that every node can SSH to every other node without being asked for a password.
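As an alternative to pasting the public keys into authorized_keys by hand, ssh-copy-id can distribute them; a minimal sketch, run on each of the four nodes (it prompts once for the root password of each target, and assumes the key pair generated by ssh-keygen above):
for h in hdfs1 hdfs2 hdfs3 hdfs4; do ssh-copy-id root@$h; done
Afterwards, a quick loop from the NameNode confirms passwordless login; each command should print the remote hostname without a password prompt:
for h in hdfs1 hdfs2 hdfs3 hdfs4; do ssh root@$h hostname; done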
5. Configure Hadoop (on the NameNode server)
1) Install Hadoop:
# tar -xvzf hadoop-2.7.1_3jia5.tar.gz -C /usr/local/
# ln -sv /usr/local/hadoop-2.7.1 /usr/local/hadoop
2) Configure Hadoop.
Create the working directories:
cd /usr/local/hadoop
mkdir tmp && mkdir -p hdfs/data && mkdir -p hdfs/name
Edit the configuration files:
cd /usr/local/hadoop
vim etc/hadoop/core-site.xml
Insert between the <configuration> tags:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdfs1:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>file:///usr/local/hadoop/tmp</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131702</value>
</property>
vim etc/hadoop/hdfs-site.xml
Insert between the <configuration> tags:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///usr/local/hadoop/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hadoop/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
vim etc/hadoop/mapred-site.xml
Insert between the <configuration> tags:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.jobtracker.http.address</name>
  <value>hdfs1:50030</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hdfs1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hdfs1:19888</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>http://hdfs1:9001</value>
</property>
vim etc/hadoop/yarn-site.xml
Insert between the <configuration> tags:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hdfs1</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hdfs1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>hdfs1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>hdfs1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>hdfs1:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>hdfs1:8088</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>1</value>
</property>
Edit etc/hadoop/slaves:
Remove localhost and add every DataNode to this file.
[root@hdfs1 hadoop]# cat etc/hadoop/slaves
hdfs2
hdfs3
hdfs4
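Depending on how the shell environment is picked up by the Hadoop scripts, it is also common to set JAVA_HOME explicitly in etc/hadoop/hadoop-env.sh; a minimal sketch, using the JDK path installed earlier:
vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45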
6. Copy Hadoop to the other nodes
Copy the whole hadoop directory from the NameNode server to the other 3 nodes (run on the NameNode):
cd /usr/local
scp -r hadoop-2.7.1 root@hdfs2:/usr/local/
scp -r hadoop-2.7.1 root@hdfs3:/usr/local/
scp -r hadoop-2.7.1 root@hdfs4:/usr/local/
On each of the other nodes, run:
ln -sv /usr/local/hadoop-2.7.1 /usr/local/hadoop

7. Format HDFS (run on the NameNode)
# cd /usr/local/hadoop
# hdfs namenode -format    (formats the NameNode)

8. Start Hadoop (run on the NameNode)
Log in to the NameNode:
cd /usr/local/hadoop/etc/hadoop
Run: start-all.sh (the Hadoop environment variables were configured earlier, so the script can be called directly)
To stop: stop-all.sh

9. Verify
Run on all nodes:
# hadoop dfsadmin -report
The output should look similar to the following:

Live datanodes (3):

Name: 10.0.2.13:50010 (hdfs3)
Hostname: hdfs3
Decommission Status : Normal
Configured Capacity: 21001699328 (19.56 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3104268288 (2.89 GB)
DFS Remaining: 17897406464 (16.67 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.22%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 03 16:37:47 CST 2016

Name: 10.0.2.14:50010 (hdfs4)
Hostname: hdfs4
Decommission Status : Normal
Configured Capacity: 21001699328 (19.56 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3104006144 (2.89 GB)
DFS Remaining: 17897668608 (16.67 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.22%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 03 16:37:47 CST 2016

Name: 10.0.2.12:50010 (hdfs2)
Hostname: hdfs2
Decommission Status : Normal
Configured Capacity: 21001699328 (19.56 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3104260096 (2.89 GB)
DFS Remaining: 17897414656 (16.67 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.22%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 03 16:37:47 CST 2016
Run jps on the NameNode:
[root@NameNode ~]# jps
7187 Jps
3493 NameNode
3991 SecondaryNameNode
4136 ResourceManager

Run jps on each DataNode:
9616 DataNode
9713 NodeManager
9812 Jps
Access the Hadoop web pages (replace localhost with the NameNode's address, e.g. hdfs1, when browsing from another machine):
Cluster management (YARN) UI: http://localhost:8088
NameNode UI: http://localhost:50070
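For a final check, a small read/write smoke test against HDFS can be run on the NameNode; the /test path and the file used here are arbitrary examples, not part of the original setup:
hdfs dfs -mkdir /test
hdfs dfs -put /etc/hosts /test/
hdfs dfs -ls /test
hdfs dfs -cat /test/hosts
If the file can be written, listed, and read back, the cluster is working end to end.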