While searching online for setup tutorials, I found the following blog post. The author's write-up is excellent and can be followed as-is, so I copied the content here, with some annotations of my own, for future reference. http://www.micmiu.com/bigdata/hadoop/hadoop2x-single-node-setup/
This article records in detail the steps to install, configure, and start a single-node Hadoop 2.2.0 on Mac OS X, and then demonstrates running a simple job. The outline is:
- Basic environment setup
- Hadoop installation and configuration
- Startup and demo
[I]. Basic Environment Setup
1. OS: Mac OS X 10.9.1
2. JDK 1.6.0_65
Installing from the official package or building from source both work; there are plenty of articles covering JDK installation, so I won't repeat them here. Just make sure the environment variables are set correctly. My JAVA_HOME is configured as follows:
micmiu-mbp:~ micmiu$ echo $JAVA_HOME
/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
micmiu-mbp:~ micmiu$ java -version
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
micmiu-mbp:~ micmiu$
(Note: for the JDK installation here, I simply installed Eclipse 4.3 from its official site. If no JDK is present, Eclipse prompts you to download and install one on first launch; just click through the prompts and it installs version 1.6.0_65 by default.
My .bash_profile contents:
export CLICOLOR=1
export LSCOLORS=GxFxDxBxegedabagaced
export PS1="\[\e[0;31m\]\u@\h\[\e[0;33m\]:\[\e[1;34m\]\w \[\e[1;37m\]$ \[\e[m\]"
export HADOOP_HOME=~/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=`/usr/libexec/java_home`
Setting HADOOP_HOME is mainly a convenience for typing commands, so it is worth configuring. The more detailed variable settings below can also be used as a reference.
)
3. Passwordless SSH login
Since this is a single-node setup, you only need passwordless ssh to localhost, which is straightforward:
micmiu-mbp:~ micmiu$ cd ~
micmiu-mbp:~ micmiu$ ssh-keygen -t rsa -P ''
micmiu-mbp:~ micmiu$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Verify that it works:
micmiu-mbp:~ micmiu$ ssh localhost
Last login: Sat Jan 18 10:17:19 2014
micmiu-mbp:~ micmiu$
If you can log in without being prompted for a password, passwordless SSH is set up correctly.
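If ssh localhost still asks for a password, two common culprits (a hedged checklist of my own, not from the original post) are SSH file permissions and macOS's Remote Login setting:
# sshd ignores authorized_keys files with overly open permissions:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
# On Mac OS X, also make sure Remote Login is enabled under
# System Preferences > Sharing, or sshd will not be running at all.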
For a more detailed introduction to passwordless SSH login, see: Linux (CentOS) OpenSSH passwordless login setup
[II]. Hadoop Installation and Configuration
1. Download the release
Open the official download page http://hadoop.apache.org/releases.html#Download, download the 2.2.0 release, and extract it to a path of your choice:
micmiu$ tar -zxf hadoop-2.2.0.tar.gz -C /usr/local/share
So in this article, HADOOP_HOME = /usr/local/share/hadoop-2.2.0/.
2. Configure the system environment variables: vi ~/.profile and add the following:
# Hadoop settings by Michael@micmiu.com
export HADOOP_HOME="/usr/local/share/hadoop-2.2.0"
export HADOOP_PREFIX=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop/"
export YARN_CONF_DIR=${HADOOP_CONF_DIR}
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
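After saving, reload the profile so the new variables take effect in the current shell (opening a new terminal window works too):
source ~/.profile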
3. Edit <HADOOP_HOME>/etc/hadoop/hadoop-env.sh
On Mac OS X, configure it as follows:
# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=$(/usr/libexec/java_home -d 64 -v 1.6)
# Find the HADOOP_OPTS setting and append the following parameters
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
More details can be found in: Setting the $JAVA_HOME environment variable on Mac OS X
On Linux/Unix, configure it as follows:
# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=<the actual JDK path on your system>
4. Edit <HADOOP_HOME>/etc/hadoop/core-site.xml
Add or update the following settings under the <configuration> node:
<!-- The new property fs.defaultFS replaces the deprecated fs.default.name | micmiu.com -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/Users/micmiu/tmp/hadoop</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.native.lib.available</name>
<value>false</value>
<description>Default value is true: should native hadoop libraries, if present, be used.</description>
</property>
5. Edit <HADOOP_HOME>/etc/hadoop/hdfs-site.xml
Add or update the following settings under the <configuration> node:
<property>
<name>dfs.replication</name>
<value>1</value>
<!-- Set to 1 for a single node; for a cluster, set it according to the actual number of nodes | micmiu.com -->
</property>
6. Edit <HADOOP_HOME>/etc/hadoop/yarn-site.xml
Add or update the following settings under the <configuration> node:
<!-- micmiu.com -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
7. Edit <HADOOP_HOME>/etc/hadoop/mapred-site.xml
There is no mapred-site.xml by default; just copy mapred-site.xml.template to mapred-site.xml.
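For example, assuming HADOOP_HOME is set as above:
cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml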
Then add or update the following settings under the <configuration> node:
<!-- micmiu.com -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
[III]. Startup and Demo
1. Start Hadoop
First, run hdfs namenode -format:
micmiu-mbp:~ micmiu$ hdfs namenode -format
14/01/18 23:07:07 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = micmiu-mbp.local/192.168.1.103
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.2.0
.................
.................
.................
14/01/18 23:07:08 INFO util.GSet: VM type = 64-bit
14/01/18 23:07:08 INFO util.GSet: 0.029999999329447746% max memory = 991.7 MB
14/01/18 23:07:08 INFO util.GSet: capacity = 2^15 = 32768 entries
Re-format filesystem in Storage Directory /Users/micmiu/tmp/hadoop/dfs/name ? (Y or N) Y
14/01/18 23:07:26 INFO common.Storage: Storage directory /Users/micmiu/tmp/hadoop/dfs/name has been successfully formatted.
14/01/18 23:07:26 INFO namenode.FSImage: Saving image file /Users/micmiu/tmp/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
14/01/18 23:07:26 INFO namenode.FSImage: Image file /Users/micmiu/tmp/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 198 bytes saved in 0 seconds.
14/01/18 23:07:27 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/01/18 23:07:27 INFO util.ExitUtil: Exiting with status 0
14/01/18 23:07:27 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at micmiu-mbp.local/192.168.1.103
************************************************************/
Then run start-dfs.sh:
micmiu-mbp:~ micmiu$ start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-namenode-micmiu-mbp.local.out
localhost: starting datanode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-datanode-micmiu-mbp.local.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/share/hadoop-2.2.0/logs/hadoop-micmiu-secondarynamenode-micmiu-mbp.local.out
micmiu-mbp:~ micmiu$ jps
1522 NameNode
1651 DataNode
1794 SecondaryNameNode
1863 Jps
micmiu-mbp:~ micmiu$
Next, run start-yarn.sh:
micmiu-mbp:~ micmiu$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/share/hadoop-2.2.0/logs/yarn-micmiu-resourcemanager-micmiu-mbp.local.out
localhost: starting nodemanager, logging to /usr/local/share/hadoop-2.2.0/logs/yarn-micmiu-nodemanager-micmiu-mbp.local.out
micmiu-mbp:~ micmiu$ jps
2033 NodeManager
1900 ResourceManager
1522 NameNode
1651 DataNode
2058 Jps
1794 SecondaryNameNode
micmiu-mbp:~ micmiu$
If the startup logs contain no errors and all of the processes above are present, Hadoop has started successfully.
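You can also sanity-check the daemons through their built-in web interfaces (my own addition, assuming Hadoop 2.2.0's default ports; 8088 is the same port the job-tracking URL below uses):
# NameNode web UI
open http://localhost:50070
# YARN ResourceManager web UI
open http://localhost:8088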
2. Demo
First, a few common hdfs commands, to prepare for the wordcount demo:
micmiu-mbp:~ micmiu$ hdfs dfs -ls /
micmiu-mbp:~ micmiu$ hdfs dfs -mkdir /user
micmiu-mbp:~ micmiu$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - micmiu supergroup 0 2014-01-18 23:20 /user
micmiu-mbp:~ micmiu$ hdfs dfs -mkdir -p /user/micmiu/wordcount/in
micmiu-mbp:~ micmiu$ hdfs dfs -ls /user/micmiu/wordcount
Found 1 items
drwxr-xr-x - micmiu supergroup 0 2014-01-18 23:21 /user/micmiu/wordcount/in
Create a local file micmiu-word.txt with the following content:
Hi Michael welcome to Hadoop
Hi Michael welcome to BigData
Hi Michael welcome to Spark
more see micmiu.com
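One way to create this file from the shell (just a sketch using a heredoc; any text editor works equally well):
cat > micmiu-word.txt <<'EOF'
Hi Michael welcome to Hadoop
Hi Michael welcome to BigData
Hi Michael welcome to Spark
more see micmiu.com
EOF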
Upload micmiu-word.txt to HDFS: hdfs dfs -put micmiu-word.txt /user/micmiu/wordcount/in
Then cd into the Hadoop root directory and run:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/micmiu/wordcount/in /user/micmiu/wordcount/out
PS: the /user/micmiu/wordcount/out directory must not already exist, otherwise the job fails with an error.
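If you rerun the job, delete the previous output directory first (a note of my own; -rm -r is the Hadoop 2.x shell syntax):
hdfs dfs -rm -r /user/micmiu/wordcount/out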
You should see log output similar to the following:
micmiu-mbp:hadoop-2.2.0 micmiu$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/micmiu/wordcount/in /user/micmiu/wordcount/out
14/01/19 20:02:29 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/01/19 20:02:29 INFO input.FileInputFormat: Total input paths to process : 1
14/01/19 20:02:29 INFO mapreduce.JobSubmitter: number of splits:1
............
............
............
14/01/19 20:02:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1390131922557_0001
14/01/19 20:02:30 INFO impl.YarnClientImpl: Submitted application application_1390131922557_0001 to ResourceManager at /0.0.0.0:8032
14/01/19 20:02:30 INFO mapreduce.Job: The url to track the job: http://micmiu-mbp.local:8088/proxy/application_1390131922557_0001/
14/01/19 20:02:30 INFO mapreduce.Job: Running job: job_1390131922557_0001
14/01/19 20:02:38 INFO mapreduce.Job: Job job_1390131922557_0001 running in uber mode : false
14/01/19 20:02:38 INFO mapreduce.Job: map 0% reduce 0%
14/01/19 20:02:43 INFO mapreduce.Job: map 100% reduce 0%
14/01/19 20:02:50 INFO mapreduce.Job: map 100% reduce 100%
14/01/19 20:02:50 INFO mapreduce.Job: Job job_1390131922557_0001 completed successfully
14/01/19 20:02:51 INFO mapreduce.Job: Counters: 43
File System Counters
FILE: Number of bytes read=129
FILE: Number of bytes written=158647
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=228
HDFS: Number of bytes written=83
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3346
Total time spent by all reduces in occupied slots (ms)=3799
Map-Reduce Framework
Map input records=4
Map output records=18
Map output bytes=179
Map output materialized bytes=129
Input split bytes=120
Combine input records=18
Combine output records=10
Reduce input groups=10
Reduce shuffle bytes=129
Reduce input records=10
Reduce output records=10
Spilled Records=20
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=30
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=283127808
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=108
File Output Format Counters
Bytes Written=83
micmiu-mbp:hadoop-2.2.0 micmiu$
At this point the wordcount job has completed; run the following commands to inspect its results:
micmiu-mbp:hadoop-2.2.0 micmiu$ hdfs dfs -ls /user/micmiu/wordcount/out
Found 2 items
-rw-r--r-- 1 micmiu supergroup 0 2014-01-19 20:02 /user/micmiu/wordcount/out/_SUCCESS
-rw-r--r-- 1 micmiu supergroup 83 2014-01-19 20:02 /user/micmiu/wordcount/out/part-r-00000
micmiu-mbp:hadoop-2.2.0 micmiu$ hdfs dfs -cat /user/micmiu/wordcount/out/part-r-00000
BigData 1
Hadoop 1
Hi 3
Michael 3
Spark 1
micmiu.com 1
more 1
see 1
to 3
welcome 3
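To shut everything down later (a closing note of mine, not part of the original post), use the matching stop scripts shipped in <HADOOP_HOME>/sbin:
stop-yarn.sh
stop-dfs.sh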