Commands for testing a Hadoop installation (Windows build): how to check that Hadoop has been installed successfully

Reposted

mob64ca140eb362 2024-07-10 22:09:43


Operating system

The operating system is Ubuntu 11.04 Desktop. Set the root password: sudo passwd root

Download the JDK: jdk-6u27-linux-i586.bin

Download Hadoop: hadoop-0.20.204.0.tar.gz

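Install the JDK

Install Java 6: copy the JDK installer to /usr/local and run it there with the following command

sudo sh jdk-6u27-linux-i586.bin

Set the JDK environment variables

sudo gedit /etc/environment

Add the JDK bin directory to PATH, and add export JAVA_HOME and export CLASSPATH entries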
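The exact entries depend on where the JDK was unpacked. As a rough sketch, the helper below prints three candidate lines for review before they are pasted into /etc/environment. The function name jdk_env_lines, the install path /usr/local/jdk1.6.0_27, and the CLASSPATH value are assumptions for illustration, not values given in this article.

```shell
#!/bin/sh
# jdk_env_lines JAVA_HOME CURRENT_PATH
# Hypothetical helper: prints the PATH, JAVA_HOME and CLASSPATH entries
# with every value written out in full, ready to be reviewed and pasted.
jdk_env_lines() {
    java_home=$1
    current_path=$2
    echo "PATH=\"$current_path:$java_home/bin\""
    echo "export JAVA_HOME=$java_home"
    echo "export CLASSPATH=.:$java_home/lib"
}

# Assumed JDK location; adjust it to the directory the installer created.
jdk_env_lines /usr/local/jdk1.6.0_27 "$PATH"
```

After logging out and back in, java -version can be used to confirm that the JDK is found on PATH.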

Create the user group and user

Create a group named hadoop and a user named hadoop

sudo addgroup hadoop
 
sudo adduser --ingroup hadoop hadoop

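Configure SSH

Install the SSH server: sudo apt-get install openssh-server

Generate an SSH key:

1. Switch to the hadoop user: su hadoop

2. Generate an SSH key with an empty passphrase: ssh-keygen -t rsa -P ""

When prompted for a file name, just press Enter; the key is written to the .ssh directory

When that is done, test it: ssh localhost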
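If ssh localhost still asks for a password, the public key has probably not been authorized for the account yet. The following is a minimal sketch of the whole key setup, meant to be run as the hadoop user. It assumes the OpenSSH default key location ~/.ssh/id_rsa; the KEY_DIR variable is a name introduced here for illustration only.

```shell
#!/bin/sh
# KEY_DIR is a name introduced for this sketch; ~/.ssh is the OpenSSH default.
KEY_DIR="${KEY_DIR:-$HOME/.ssh}"
mkdir -p "$KEY_DIR"
chmod 700 "$KEY_DIR"

# Create an RSA key pair with an empty passphrase unless one already exists.
if [ ! -f "$KEY_DIR/id_rsa" ]; then
    ssh-keygen -t rsa -P "" -f "$KEY_DIR/id_rsa"
fi

# Authorize the public key for logins to this account, without duplicating it.
touch "$KEY_DIR/authorized_keys"
if ! grep -qxF "$(cat "$KEY_DIR/id_rsa.pub")" "$KEY_DIR/authorized_keys"; then
    cat "$KEY_DIR/id_rsa.pub" >> "$KEY_DIR/authorized_keys"
fi
chmod 600 "$KEY_DIR/authorized_keys"

# Afterwards, `ssh localhost` should log in without asking for a password.
```

The script is written so that running it a second time neither overwrites an existing key nor adds a duplicate entry to authorized_keys.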

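Install Hadoop

Copy the Hadoop archive to /usr/local

sudo cp hadoop-0.20.204.0.tar.gz /usr/local

Unpack the archive (run this and the following commands from /usr/local)

sudo tar xzf hadoop-0.20.204.0.tar.gz

Unpacking creates the directory hadoop-0.20.204.0; rename it to hadoop for convenience

sudo mv hadoop-0.20.204.0 hadoop

Make the hadoop user and group the owner of the hadoop directory

sudo chown -R hadoop:hadoop hadoop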

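Configure Hadoop

Open hadoop/conf/core-site.xml (from the hadoop/conf directory)

sudo gedit core-site.xml

Add the following:

1. Add a directory for temporary data. It is best created under the hadoop user's home directory; if it is placed under /usr/local, Hadoop will not have permission to create the temporary directory when it runs.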

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoop-datastore</value>
  <description>A base for other temporary directories.</description>
</property>

2. Add the NameNode address

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

Open mapred-site.xml and add the host and port that the MapReduce job tracker runs at

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

Open hdfs-site.xml and set the default block replication to 1, since this setup runs on a single machine

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
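The fragments above show only the individual property elements. In each Hadoop site file they belong inside a single configuration root element. As a rough sketch, the script below generates the three files with that wrapper into a scratch directory for review. The helper name write_site_file and the directory ./hadoop-conf-sketch are assumptions for illustration, the description elements are left out for brevity, and the NameNode address uses the hdfs:// URI form.

```shell
#!/bin/sh
# Scratch directory (an assumption); nothing under hadoop/conf is touched.
OUT_DIR="${OUT_DIR:-./hadoop-conf-sketch}"
mkdir -p "$OUT_DIR"

# write_site_file FILE NAME VALUE [NAME VALUE ...]
# Hypothetical helper: wraps each name/value pair in a <property> element
# inside the single <configuration> root that Hadoop site files use.
write_site_file() {
    file="$OUT_DIR/$1"
    shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            echo '  <property>'
            echo "    <name>$1</name>"
            echo "    <value>$2</value>"
            echo '  </property>'
            shift 2
        done
        echo '</configuration>'
    } > "$file"
}

# Values taken from the settings described in this article.
write_site_file core-site.xml \
    hadoop.tmp.dir /home/hadoop/hadoop-datastore \
    fs.default.name hdfs://localhost:54310
write_site_file mapred-site.xml mapred.job.tracker localhost:54311
write_site_file hdfs-site.xml dfs.replication 1
```

After reviewing the generated files, they could be copied over the ones in hadoop/conf, for example with sudo cp.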

Format the NameNode

bin/hadoop namenode -format

Start HDFS and MapReduce: bin/start-all.sh

Stop the services: bin/stop-all.sh

Use the jps command to list the running Hadoop processes

Check the cluster status: bin/hadoop dfsadmin -report
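On a single-machine setup like this one, start-all.sh normally brings up five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. A small sketch for checking the jps output against that list follows; the function name check_daemons is an assumption introduced for illustration.

```shell
#!/bin/sh
# check_daemons: reads `jps` output on stdin and prints the name of every
# expected Hadoop daemon that is absent. Prints nothing when all five are up.
check_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <main class>", so compare the second field exactly;
        # this keeps NameNode from matching SecondaryNameNode.
        echo "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$daemon"
    done
}

# Typical use on the Hadoop machine:
#   jps | check_daemons
```

If a daemon is reported as missing, its log file under the hadoop/logs directory is usually the first place to look.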

View the status through the web interfaces:

1. HDFS web page: http://localhost:50070

2. MapReduce web page: http://localhost:50030

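Test Hadoop

Use the word count example that ships with Hadoop to count how often each word occurs in a set of files

Create an input directory under /home/hadoop and put two text files in it, file01.txt and file02.txt, each containing a different set of words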
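One way the test could proceed is sketched below. The sample words, the HADOOP_HOME default of /usr/local/hadoop, the HDFS directory names input and output, and the examples jar pattern hadoop-examples-*.jar are assumptions, so check the actual jar name in the Hadoop directory before running it. The Hadoop steps are skipped when bin/hadoop is not found.

```shell
#!/bin/sh
# Local input directory; for the hadoop user this is /home/hadoop/input.
INPUT_DIR="${INPUT_DIR:-$HOME/input}"
mkdir -p "$INPUT_DIR"

# Two small files with made-up sample words.
printf 'hello world bye world\n' > "$INPUT_DIR/file01.txt"
printf 'hello hadoop goodbye hadoop\n' > "$INPUT_DIR/file02.txt"

# Local cross-check with standard tools, for comparison with the job output.
cat "$INPUT_DIR"/file0[12].txt | tr -s ' ' '\n' | sort | uniq -c

# Assumed install location from the steps in this article.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
if [ -x "$HADOOP_HOME/bin/hadoop" ]; then
    cd "$HADOOP_HOME" || exit 1
    # Copy the local files into HDFS, run the bundled example, show the result.
    bin/hadoop dfs -put "$INPUT_DIR" input
    bin/hadoop jar hadoop-examples-*.jar wordcount input output
    bin/hadoop dfs -cat 'output/part-*'
fi
```

If the counts printed by the job match the local cross-check, HDFS and MapReduce are both working.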