Download the Hadoop package

Version installed here: hadoop-2.7.7.tar.gz (this is the tarball used in the extraction step below)

Step 1: Before installing Hadoop, the JDK must be installed on the server

Installing the JDK:

Method 1:

Download the Linux RPM package from the official site: jdk-8u181-linux-x64.rpm

Upload the package to the server via Xshell (the rz command works in an Xshell session)

Install it with: rpm -ivh jdk-8u181-linux-x64.rpm

Method 2:

Install via yum: yum -y install java-1.8.0-openjdk-devel.i686

 

Configure the JDK install path (the rest of this section assumes Method 2)

With Method 2 the JDK lands under /usr/lib/jvm/, e.g. /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.i386 (the exact version suffix depends on which update yum installs)
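If you are unsure of the exact directory on your machine, a quick way to resolve it (a sketch; the output path will differ per system):

# Follow the symlink chain from the java binary yum installed
readlink -f /usr/bin/java
# The output ends in .../jre/bin/java; JAVA_HOME is the directory above jre/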

$ vi /etc/profile

#Append the following to /etc/profile

    JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.i386

    JRE_HOME=$JAVA_HOME/jre

    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

    PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

    export JAVA_HOME JRE_HOME CLASSPATH PATH

$ chmod +x /etc/profile

$ source /etc/profile

#Check that the JDK installed successfully:

[root@aliyun-002:/opt/package]$java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK Server VM (build 25.212-b04, mixed mode)

#If the version information prints, the JDK is installed successfully

 

 

Step 2: Set up passwordless SSH login

#Press Enter at every prompt

[root@localhost opt]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:NmlTPEeHk9I1XdHtJ5x5h/FmYP2jXcyJM1sWr+LZloo root@localhost.localdomain
The key's randomart image is:
+---[RSA 2048]----+
|           ..+++*|
|         ...=.+o=|
|          +..+.@=|
|         o o +*=#|
|        S     O*=|
|       o o  .o.. |
|           . + . |
|           .o +  |
|          E .o   |
+----[SHA256]-----+

[root@localhost opt]$ cd /root/.ssh/
[root@localhost .ssh]$ cp id_rsa.pub authorized_keys
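If ssh still prompts for a password after this, overly permissive file modes are the usual cause; sshd insists on these (a precaution worth applying):

chmod 700 /root/.ssh
chmod 600 /root/.ssh/authorized_keys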

#Verify the setup: if the following command logs you in without asking for a password, it worked
[root@localhost:~/.ssh]$ ssh localhost
Last login: Tue May 14 15:19:17 2019 from 127.0.0.1

Welcome to Alibaba Cloud Elastic Compute Service !

 

Step 3: Extract the package and edit the configuration files

#Extract into the current directory
tar -xvf /opt/package/hadoop-2.7.7.tar.gz   
mv hadoop-2.7.7/ hadoop

#Add the Hadoop install path to /etc/profile:
vi /etc/profile
 export HADOOP_HOME=/opt/hadoop
 export PATH=$HADOOP_HOME/bin:$PATH
#Run the following so the new PATH takes effect immediately
source /etc/profile
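To confirm the PATH change took effect (a quick check):

hadoop version   # should print the version banner, here Hadoop 2.7.7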

Edit the configuration files:

All of the files to edit live under /opt/hadoop/etc/hadoop/: hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml. (If your Hadoop is a 1.x release, the files are under <install dir>/conf/ instead; for 2.x and later they are under <install dir>/etc/hadoop/.)

Edit hadoop-env.sh (path: /opt/hadoop/etc/hadoop/hadoop-env.sh):

vi /opt/hadoop/etc/hadoop/hadoop-env.sh
#Find the JAVA_HOME line and replace ${JAVA_HOME} with your actual JDK install path, e.g.:
...
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.i386
...

#Edit core-site.xml

vi /opt/hadoop/etc/hadoop/core-site.xml
#Add the following (fs.default.name is the legacy key; on Hadoop 2.x the preferred name is fs.defaultFS, but both still work):
<configuration>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop</value>
</property>

</configuration>

#Edit hdfs-site.xml

vi /opt/hadoop/etc/hadoop/hdfs-site.xml
#Add the following (a replication factor of 1 suits a single-node setup):
<configuration>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

</configuration>

#Edit mapred-site.xml (2.x releases ship only mapred-site.xml.template; copy it first)

#If mapred-site.xml does not exist, copy it from the template
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
#Add the following
<configuration>

<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>

</configuration>
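Note: mapred.job.tracker is a Hadoop 1.x property and is ignored by 2.x. If you want MapReduce jobs to actually run on YARN, the standard 2.x settings are the following (a sketch of the usual properties, not part of the original steps). In mapred-site.xml:

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

And in yarn-site.xml:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>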

 

Step 4: Create Hadoop's data directory (the hadoop.tmp.dir configured in core-site.xml above):

mkdir /var/lib/hadoop

chmod 777 /var/lib/hadoop

Step 5: Format the HDFS filesystem

$ hadoop namenode -format

#Sample output follows (on 2.x the equivalent, preferred form is: hdfs namenode -format)

12/10/26 22:45:25 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG: host = vm193/10.0.0.193

STARTUP_MSG: args = [-format]

…

12/10/26 22:45:25 INFO namenode.FSNamesystem:  fsOwner=hadoop,hadoop

12/10/26 22:45:25 INFO namenode.FSNamesystem:  supergroup=supergroup

12/10/26 22:45:25 INFO namenode.FSNamesystem:  isPermissionEnabled=true

12/10/26 22:45:25 INFO common.Storage: Image file of size 96  saved in 0 seconds.

12/10/26 22:45:25 INFO common.Storage: Storage directory /var/lib/hadoop/dfs/name has been successfully formatted.

12/10/26 22:45:26 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at vm193/10.0.0.193

$

 

Step 6: Start Hadoop (in 2.x, start-all.sh is deprecated in favor of start-dfs.sh plus start-yarn.sh, but it still works):

$ ./sbin/start-all.sh

$ jps
23968 NodeManager
23553 DataNode
23874 ResourceManager
23715 SecondaryNameNode
23429 NameNode
24278 Jps

#Seeing all of these processes means the startup succeeded

Step 7: Basic Hadoop usage:

If the following commands run normally, the Hadoop setup is essentially complete.

$ hdfs dfs -mkdir /user
$ hdfs dfs -mkdir /user/hadoop
$ hdfs dfs -ls /user
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-10-26 23:09 /user/hadoop
$ echo "This is a test." >> test.txt
$ cat test.txt
This is a test.
$ hdfs dfs -copyFromLocal test.txt .
$ hdfs dfs -ls
Found 1 items
-rw-r--r-- 1 hadoop supergroup 16 2012-10-26 23:19 /user/hadoop/test.txt
$ hdfs dfs -cat test.txt
This is a test.
$ rm test.txt
$ hdfs dfs -cat test.txt
This is a test.
$ hdfs dfs -copyToLocal test.txt .
$ cat test.txt
This is a test.
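To exercise MapReduce end to end, you can run the bundled wordcount example on the file just uploaded (a sketch; adjust the jar version if yours differs from 2.7.7):

$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount test.txt wc-out
$ hdfs dfs -cat wc-out/part-r-00000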

Note / warning:

If the commands above print the warning shown below:

[screenshot omitted; it is typically the native-library warning "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"]

Fix:

Add the following two lines to hadoop-env.sh and yarn-env.sh:

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native  
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

#HADOOP_HOME is the Hadoop install path

 

Supplement: monitoring Hadoop:

Open http://localhost:50030 in a browser. (50030 is the Hadoop 1.x JobTracker UI; on the 2.x release installed here, the NameNode UI is at http://localhost:50070 and the YARN UI at http://localhost:8088. The monitoring machine is usually not the Hadoop server itself, so replace localhost with the IP address of the server running Hadoop.)

In the 3.x releases the ports changed:

The HDFS web UI defaults to port 9870; the YARN web UI port is 8088
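A quick way to check which UI port is actually listening (a sketch using plain curl):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070   # NameNode UI on 2.x
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088    # YARN ResourceManager UI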

Not original work; the steps come from the book Hadoop Data Processing and Modelling

 

Installing HBase

Step 1:

HBase releases can be downloaded from: http://mirror.bit.edu.cn/apache/hbase

Extract: tar -xvf hbase-2.0.2-bin.tar.gz

My HBase install path is /opt/hbase-2.0.2 (matching the HBASE_HOME set below)

Step 2:

Add the HBase install path to the startup file hadoop-env.sh (or add it to /etc/profile):

export HBASE_HOME=/opt/hbase-2.0.2

export PATH=$HBASE_HOME/bin:${PATH}

Source the script file:

source hadoop-env.sh
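To confirm the PATH change took effect (a quick check):

hbase version   # should print the HBase 2.0.2 version banner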

 

Step 3:

#Configure HBase (both files live in the conf/ directory):
  vi conf/hbase-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64/
export HBASE_CLASSPATH=/opt/hbase-2.0.2/conf
export HBASE_MANAGES_ZK=true

   vi conf/hbase-site.xml

<configuration>

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:9000/hbase</value>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>

<property>
  <name>hbase.tmp.dir</name>
  <value>/root/hbase/tmp</value>
</property>

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>


</configuration>

Additionally, copy hdfs-site.xml and core-site.xml from the Hadoop install directory (for Hadoop 2.7.7 they are under hadoop/etc/hadoop/) into HBase's conf/ directory, as shown below.
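Concretely, assuming the paths used in this article:

cp /opt/hadoop/etc/hadoop/hdfs-site.xml /opt/hbase-2.0.2/conf/
cp /opt/hadoop/etc/hadoop/core-site.xml /opt/hbase-2.0.2/conf/
mkdir -p /root/hbase/tmp    # create the hbase.tmp.dir configured above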


 

Step 4:

Start HBase:
./bin/start-hbase.sh


#All of these processes should be present

[root@localhost hbase-2.0.2]# jps
1764 NameNode
6837 HQuorumPeer
7045 Jps
2246 ResourceManager
3801 HRegionServer
6905 HMaster
2075 SecondaryNameNode
1868 DataNode
2349 NodeManager



#Enter the HBase shell:

hbase shell

[root@localhost hbase-2.0.2]# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.2, r1cfab033e779df840d5612a85277f42a6a4e8172, Tue Aug 28 20:50:40 PDT 2018
Took 0.0185 seconds                                                                          

hbase(main):001:0> list
TABLE
0 row(s)
Took 2.2076 seconds                                                                          
=> []
hbase(main):002:0> create 'member', 'm_id', 'address', 'info'
Created table member
Took 2.2952 seconds                                                                          
=> Hbase::Table - member
hbase(main):003:0> list 'member'
TABLE                                                                                    
member                                                                                   
1 row(s)
Took 0.0371 seconds                                                                          
=> ["member"]
hbase(main):004:0> list
TABLE                                                                                    
member                                                                                   
1 row(s)
Took 0.0324 seconds                                                                          
=> ["member"]

hbase(main):005:0> exit
#Exit the HBase shell

 

 

Summary (feel free to leave a comment if you run into problems)

Problems encountered:

1. error1

[root@localhost hadoop]# ./sbin/start-dfs.sh

Starting namenodes on [localhost]

ERROR: Attempting to operate on hdfs namenode as root

ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.

Starting datanodes

ERROR: Attempting to operate on hdfs datanode as root

ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.

Starting secondary namenodes [localhost.localdomain]

ERROR: Attempting to operate on hdfs secondarynamenode as root

ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

 

Fix 1: this is caused by missing user definitions, so edit both the start and stop scripts

$ vim sbin/start-dfs.sh

$ vim sbin/stop-dfs.sh

Add the following at the top of each file:

HDFS_DATANODE_USER=root

HADOOP_SECURE_DN_USER=hdfs

HDFS_NAMENODE_USER=root

HDFS_SECONDARYNAMENODE_USER=root

 

Alternatively,

add the following to hadoop-env.sh. (Note: here hadoop-env.sh refers to a script you write yourself that declares environment variables; source it before starting Hadoop.)

export HDFS_NAMENODE_USER="root"

export HDFS_DATANODE_USER="root"

export HDFS_SECONDARYNAMENODE_USER="root"

export YARN_RESOURCEMANAGER_USER="root"

export YARN_NODEMANAGER_USER="root"

 

2. error2 (the same class of problem as error1)

Starting resourcemanager

ERROR: Attempting to launch yarn resourcemanager as root

ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting launch.

Starting nodemanagers

ERROR: Attempting to launch yarn nodemanager as root

ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting launch.

Fix 2:

Again caused by missing user definitions; edit both the start and stop scripts

$ vim sbin/start-yarn.sh

$ vim sbin/stop-yarn.sh

Add:

YARN_RESOURCEMANAGER_USER=root

HADOOP_SECURE_DN_USER=yarn

YARN_NODEMANAGER_USER=root

 

Also add to the hadoop-env.sh file:

export JAVA_HOME=<your Java path>

 

3. error3

[root@localhost sbin]# ./start-dfs.sh

ls: Call From localhost/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Starting namenodes on [localhost]

Last login: Wed Oct 17 07:53:07 EDT 2018 from 172.16.7.1 on pts/1

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

Starting datanodes

Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

Starting secondary namenodes [localhost.localdomain]

Last login: Wed Oct 17 07:54:50 EDT 2018 on pts/0

/opt/hadoop/etc/hadoop/hadoop-env.sh: line 37: hdfs: command not found

ERROR: JAVA_HOME is not set and could not be found.

 

Fix:

Edit the configuration file hadoop-env.sh (this file ships with the tarball; it is not one you create yourself)

My install directory is /opt/hadoop/

For hadoop-1.x.x tarballs the file is at: <install dir>/conf/hadoop-env.sh

For 2.x and later it is at: <install dir>/etc/hadoop/hadoop-env.sh

 

vi /opt/hadoop/etc/hadoop/hadoop-env.sh

Add JAVA_HOME, replacing everything after the = with your own JDK install directory, e.g.:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.i386

 

Edit the environment variables:

sudo vi ~/.bashrc

Append the following to the end of the file:

 

#set oracle jdk environment

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_151 ## make sure to change this to the directory where you extracted your own JDK

export JRE_HOME=${JAVA_HOME}/jre

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export PATH=${JAVA_HOME}/bin:$PATH

Make the environment variables take effect immediately:

source ~/.bashrc
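Finally, verify that the variables are in effect (a quick check):

echo $JAVA_HOME
java -version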