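Prerequisites

ZooKeeper runs on the JVM, so a JDK must be installed first. For installation steps, see: Installing the JDK on Linux

For the Hadoop prerequisites, see: Hadoop Prerequisites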

1. Standalone Setup

1.1 Download

Download the ZooKeeper release you need; this guide uses version 3.5.7. Official download archive: https://archive.apache.org/dist/zookeeper/

1.2 Extract

[xiaokang@hadoop01 ~]$ tar -zxvf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/software/
# Rename the extracted directory
[xiaokang@hadoop01 ~]$ mv /opt/software/apache-zookeeper-3.5.7-bin/ /opt/software/zookeeper-3.5.7

1.3 Configure Environment Variables

Edit the profile file:

[xiaokang@hadoop01 ~]$ sudo vim /etc/profile

Extend the existing environment variables as follows:

export JAVA_HOME=/opt/moudle/jdk1.8.0_191
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/opt/software/hadoop-2.7.7
export ZOOKEEPER_HOME=/opt/software/zookeeper-3.5.7
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${ZOOKEEPER_HOME}/bin:$PATH

Apply the new environment variables:

[xiaokang@hadoop01 ~]$ source /etc/profile
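
Before moving on, it can help to confirm that the variables took effect. The following is a minimal sketch of such a check; the function name check_zk_env is hypothetical and not part of ZooKeeper:

```shell
# Hypothetical helper: verify that ZOOKEEPER_HOME is set and that
# zkServer.sh exists and is executable underneath it.
check_zk_env() {
  if [ -z "${ZOOKEEPER_HOME:-}" ]; then
    echo "ZOOKEEPER_HOME is not set"
    return 1
  fi
  if [ ! -x "$ZOOKEEPER_HOME/bin/zkServer.sh" ]; then
    echo "zkServer.sh not found or not executable under $ZOOKEEPER_HOME/bin"
    return 1
  fi
  echo "ok: $ZOOKEEPER_HOME"
}
```

Calling check_zk_env after source /etc/profile should print the installation path; a non-zero return suggests the export lines or the directory name need another look.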

1.4 Edit the Configuration

Go to the conf/ directory under the installation directory, copy the sample configuration, and edit the copy:

[xiaokang@hadoop01 ~]$ cd $ZOOKEEPER_HOME/conf
[xiaokang@hadoop01 conf]$ cp zoo_sample.cfg zoo.cfg
[xiaokang@hadoop01 conf]$ vim zoo.cfg

Set the data directory and the transaction log directory. The complete configuration after editing:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/software/zookeeper-3.5.7/zoo_data
dataLogDir=/opt/software/zookeeper-3.5.7/zoo_logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
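
If you would rather script the change than edit the file in vim, the two directory settings can be applied non-interactively. This is only a sketch: set_cfg is a hypothetical helper, and it assumes simple key=value lines such as those in zoo.cfg.

```shell
# Hypothetical helper: set key=value in a properties-style file.
# Replaces an existing uncommented "key=" line, otherwise appends one.
set_cfg() {
  key="$1"; value="$2"; file="$3"
  if grep -q "^${key}=" "$file"; then
    # Write to a temp file and move it back, to avoid relying on sed -i.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage (paths taken from this guide):
#   set_cfg dataDir    /opt/software/zookeeper-3.5.7/zoo_data "$ZOOKEEPER_HOME/conf/zoo.cfg"
#   set_cfg dataLogDir /opt/software/zookeeper-3.5.7/zoo_logs "$ZOOKEEPER_HOME/conf/zoo.cfg"
```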


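Configuration parameters:


  • tickTime: the basic time unit, in milliseconds, from which other timings are derived. For example, a session timeout is N*tickTime;
  • initLimit: cluster only. The time, as a multiple of tickTime, that a follower is allowed to take to connect to and synchronize with the leader at startup;
  • syncLimit: cluster only. The maximum time, as a multiple of tickTime, allowed between a request and its acknowledgement when the leader and a follower exchange messages (the heartbeat mechanism);
  • dataDir: where data snapshots are stored;
  • dataLogDir: where transaction logs are stored;
  • clientPort: the port clients connect to, 2181 in the sample configuration.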
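
To make the tickTime multiples concrete, here is a small worked calculation with the values from the configuration above. The function name zk_timeouts is hypothetical; the session bounds use ZooKeeper's documented defaults of 2 and 20 ticks for minSessionTimeout and maxSessionTimeout.

```shell
# Hypothetical helper: derive effective timings from tickTime (ms),
# initLimit (ticks) and syncLimit (ticks).
zk_timeouts() {
  tick="$1"; init="$2"; sync="$3"
  echo "init_ms=$((init * tick))"
  echo "sync_ms=$((sync * tick))"
  # Default session timeout bounds: 2 * tickTime and 20 * tickTime.
  echo "min_session_ms=$((2 * tick))"
  echo "max_session_ms=$((20 * tick))"
}

zk_timeouts 2000 10 5
```

With tickTime=2000, initLimit=10 and syncLimit=5 this gives 20000 ms for initial synchronization, 10000 ms for request acknowledgement, and a session timeout range of 4000 to 40000 ms.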


1.5 Start

Start the server with the following command:

[xiaokang@hadoop01 ~]$ zkServer.sh start

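1.6 Verify

Use zkServer.sh status or jps to check whether the process has started. A Mode: standalone line in the status output, or a QuorumPeerMain process in the jps output, means the server started successfully.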

[xiaokang@hadoop01 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper-3.5.7/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: standalone
[xiaokang@hadoop01 ~]$ jps
2179 QuorumPeerMain
2245 Jps
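
For scripted checks, the mode can be pulled out of the status output shown above. This is a minimal sketch; zk_mode is a hypothetical helper that only parses text and does not talk to ZooKeeper itself.

```shell
# Hypothetical helper: read zkServer.sh status output on stdin and
# print the value of the "Mode:" line (e.g. standalone, leader, follower).
zk_mode() {
  sed -n 's/^Mode: *//p'
}

# Example usage on a host with ZooKeeper on the PATH:
#   zkServer.sh status 2>&1 | zk_mode
```

An empty result means no Mode line was printed, which usually indicates the server is not running or not yet reachable.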

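2. Cluster Setup

For high availability, a ZooKeeper cluster should have an odd number of nodes, with a minimum of three. The cluster keeps serving only while a majority of its nodes are up, so an even-sized cluster tolerates no more failures than the odd-sized cluster one node smaller. This section therefore sets up a three-node cluster on three hosts named hadoop01, hadoop02, and hadoop03.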
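
The arithmetic behind the odd-number recommendation can be sketched as follows. The function name quorum_info is hypothetical; the formulas are the usual majority-quorum ones.

```shell
# Hypothetical helper: for a cluster of n nodes, print the majority size
# and how many node failures the cluster can survive.
quorum_info() {
  n="$1"
  echo "nodes=$n majority=$((n / 2 + 1)) tolerated_failures=$(( (n - 1) / 2 ))"
}

for n in 3 4 5; do
  quorum_info "$n"
done
```

Three and four nodes both tolerate a single failure, while five tolerate two, which is why adding a fourth node buys no extra fault tolerance.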

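2.1 Edit the Configuration

Extract a copy of the ZooKeeper package and edit its zoo.cfg configuration file with the content below. Then use scp to distribute the installation to the three servers: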

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/software/zookeeper-3.5.7/zoo_data
dataLogDir=/opt/software/zookeeper-3.5.7/zoo_logs
clientPort=2181

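# In server.1, the 1 is the server ID: a number unique to each node (ZooKeeper expects a value from 1 to 255) that must also be written to the myid file under dataDir
# Format: hostname:inter-node communication port:leader election port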
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
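
The scp distribution step is not spelled out above, so here is one way it might look. This is a sketch under assumptions: the configured copy lives on hadoop01, passwordless SSH is set up for the current user, and /opt/software/ already exists and is writable on the other hosts. The function name distribute_zk and the COPY variable are hypothetical.

```shell
# Directory holding the configured ZooKeeper installation (from this guide).
ZK_DIR="/opt/software/zookeeper-3.5.7"
# Copy command; overridable so the loop can be dry-run with COPY=echo.
COPY="${COPY:-scp -r}"

# Hypothetical helper: copy the installation to each host given as an argument.
distribute_zk() {
  for host in "$@"; do
    $COPY "$ZK_DIR" "${host}:/opt/software/"
  done
}

# Example usage from hadoop01:
#   distribute_zk hadoop02 hadoop03
```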

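2.2 Identify the Nodes

On each of the three hosts, create a myid file under the dataDir directory and write that node's ID into it. The ZooKeeper cluster identifies its nodes through the myid file, and the nodes communicate over the communication and election ports configured above to elect a Leader.

Create the storage directories: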

# Run these commands on all three hosts
mkdir /opt/software/zookeeper-3.5.7/zoo_data
mkdir /opt/software/zookeeper-3.5.7/zoo_logs

Create the myid file and write the node ID into it:

# On hadoop01
echo "1" > /opt/software/zookeeper-3.5.7/zoo_data/myid
# On hadoop02
echo "2" > /opt/software/zookeeper-3.5.7/zoo_data/myid
# On hadoop03
echo "3" > /opt/software/zookeeper-3.5.7/zoo_data/myid
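
Because the ID in myid must match the server.N entry for the same host, one option is to derive it from zoo.cfg instead of typing it by hand. The sketch below assumes the hostname appears in zoo.cfg exactly as the hostname command reports it; myid_for_host is a hypothetical helper.

```shell
# Hypothetical helper: print the N from the "server.N=<host>:..." line
# of the given zoo.cfg for the given host name.
myid_for_host() {
  host="$1"; cfg="$2"
  sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg"
}

# Example usage on each host:
#   myid_for_host "$(hostname)" "$ZOOKEEPER_HOME/conf/zoo.cfg" \
#     > /opt/software/zookeeper-3.5.7/zoo_data/myid
```

If the function prints nothing, the host has no matching server entry, and the resulting empty myid file should be fixed before starting the node.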

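2.3 Start the Cluster

On each of the three hosts (ideally with the ZooKeeper environment variables configured), run the following command to start the service: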

zkServer.sh start
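
Logging in to every host gets tedious, so the same command can be driven from a single machine. This is a sketch that assumes passwordless SSH to all three hosts and that /etc/profile on each host exports the ZooKeeper variables; zk_all and the REMOTE variable are hypothetical names.

```shell
# Hosts in the cluster (from this guide).
ZK_HOSTS="hadoop01 hadoop02 hadoop03"
# Remote runner; overridable so the loop can be dry-run with REMOTE=echo.
REMOTE="${REMOTE:-ssh}"

# Hypothetical helper: run "zkServer.sh <action>" on every host in ZK_HOSTS.
zk_all() {
  action="$1"
  for host in $ZK_HOSTS; do
    echo "== $host =="
    # Non-interactive SSH sessions do not read /etc/profile, so load it explicitly.
    $REMOTE "$host" ". /etc/profile && zkServer.sh $action"
  done
}

# Example usage:
#   zk_all start
#   zk_all status
```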

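2.4 Verify the Cluster

After startup, run zkServer.sh status on each node to check its state. In the screenshots, all three node processes started successfully, with hadoop02 as the leader and hadoop01 and hadoop03 as followers.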

[Figures: zkServer.sh status output on hadoop01, hadoop02, and hadoop03]