ZooKeeper must be deployed before Kafka.
1 Installing ZooKeeper
1.1 Package
zookeeper-3.4.10.tar.gz
1.2 Deployment
The steps below install a ZooKeeper pseudo-cluster (three instances on one machine); a standalone install works the same way. Adjust the install path to your environment; /install-path is used as a placeholder below.
Extract:
tar -zxf zookeeper-3.4.10.tar.gz
Create the directories:
cd /install-path
mkdir zookeeper-cluster
# three subdirectories, one per ZooKeeper server
mkdir zookeeper-cluster/server1
mkdir zookeeper-cluster/server2
mkdir zookeeper-cluster/server3
# three directories holding each server's data files
mkdir zookeeper-cluster/data
mkdir zookeeper-cluster/data/server1
mkdir zookeeper-cluster/data/server2
mkdir zookeeper-cluster/data/server3
# three directories holding each server's log files
mkdir zookeeper-cluster/log
mkdir zookeeper-cluster/log/server1
mkdir zookeeper-cluster/log/server2
mkdir zookeeper-cluster/log/server3
# in each data directory, create a myid file; it must contain a unique server ID, which is referenced in the configuration below
echo '1' > zookeeper-cluster/data/server1/myid
echo '2' > zookeeper-cluster/data/server2/myid
echo '3' > zookeeper-cluster/data/server3/myid
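The repeated mkdir and echo steps above can be collapsed into one loop; a minimal sketch, using a temporary directory as a stand-in for the actual install path:

```shell
# Stand-in for the real install path; replace with your own.
BASE=$(mktemp -d)/zookeeper-cluster

for i in 1 2 3; do
  # one tree per server: installation dir, data dir, log dir
  mkdir -p "$BASE/server$i" "$BASE/data/server$i" "$BASE/log/server$i"
  # myid holds this server's unique ID
  echo "$i" > "$BASE/data/server$i/myid"
done

cat "$BASE"/data/server*/myid
```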
# copy the extracted ZooKeeper three times
cp -rf zookeeper-3.4.10/* /install-path/zookeeper-cluster/server1
cp -rf zookeeper-3.4.10/* /install-path/zookeeper-cluster/server2
cp -rf zookeeper-3.4.10/* /install-path/zookeeper-cluster/server3
Create zoo.cfg from the sample:
cp zookeeper-cluster/server1/conf/zoo_sample.cfg zookeeper-cluster/server1/conf/zoo.cfg
cp zookeeper-cluster/server2/conf/zoo_sample.cfg zookeeper-cluster/server2/conf/zoo.cfg
cp zookeeper-cluster/server3/conf/zoo_sample.cfg zookeeper-cluster/server3/conf/zoo.cfg
Edit each zoo.cfg:
# point dataDir and dataLogDir at each server's own directories, and make sure clientPort does not clash with the other ZooKeeper instances
# data storage path
dataDir=/install-path/zookeeper-cluster/data/server1
# a separate log path makes log inspection easier
dataLogDir=/install-path/zookeeper-cluster/log/server1
# limit on client connections; the default of 60 is too low
maxClientCnxns=300
# with multiple ZooKeeper instances:
# server.X=IP:port1:port2
# X is the server ID set in that instance's myid file
# IP is the address the instance binds to (0.0.0.0 here, since this is a single-machine demo)
# port1 is the quorum port
# port2 is the leader election port
# since the three instances share one machine, each needs its own ports to avoid conflicts
server.1=0.0.0.0:2888:3888
server.2=0.0.0.0:12888:13888
server.3=0.0.0.0:22888:23888
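The port numbers used throughout this pseudo-cluster follow one pattern: the defaults (clientPort 2181, quorum 2888, election 3888) shifted by 10000 per additional server. A small sketch of that arithmetic:

```shell
# Each server on the same machine gets clientPort, quorum port and
# election port offset by another 10000 from the defaults.
layout=""
for i in 1 2 3; do
  offset=$(( (i - 1) * 10000 ))
  line="server$i: clientPort=$((2181 + offset)) quorum=$((2888 + offset)) election=$((3888 + offset))"
  echo "$line"
  layout="$layout$line "
done
```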
The resulting files:
/opt/zookeeper-cluster/server1/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/install-path/zookeeper-cluster/data/server1
dataLogDir=/install-path/zookeeper-cluster/log/server1
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
#http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=0.0.0.0:2888:3888
server.2=0.0.0.0:12888:13888
server.3=0.0.0.0:22888:23888
/opt/zookeeper-cluster/server2/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/install-path/zookeeper-cluster/data/server2
dataLogDir=/install-path/zookeeper-cluster/log/server2
# the port at which the clients will connect
clientPort=12181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
#http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=0.0.0.0:2888:3888
server.2=0.0.0.0:12888:13888
server.3=0.0.0.0:22888:23888
/opt/zookeeper-cluster/server3/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/install-path/zookeeper-cluster/data/server3
dataLogDir=/install-path/zookeeper-cluster/log/server3
# the port at which the clients will connect
clientPort=22181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
#http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=0.0.0.0:2888:3888
server.2=0.0.0.0:12888:13888
server.3=0.0.0.0:22888:23888
1.3 Startup
Start the three ZooKeeper servers:
$ /install-path/zookeeper-cluster/server1/bin/zkServer.sh start
$ /install-path/zookeeper-cluster/server2/bin/zkServer.sh start
$ /install-path/zookeeper-cluster/server3/bin/zkServer.sh start
After they are up, check each server's status:
$ /install-path/zookeeper-cluster/server1/bin/zkServer.sh status
$ /install-path/zookeeper-cluster/server2/bin/zkServer.sh status
$ /install-path/zookeeper-cluster/server3/bin/zkServer.sh status
2 Installing Kafka
2.1 Package
kafka_2.11-0.10.2.0.tgz (in the version string, 2.11 is the Scala version; 0.10.2.0 is the Kafka version)
2.2 Deployment
Extract:
tar zxvf kafka_2.11-0.10.2.0.tgz
Edit the configuration file:
vi /install-path/kafka_2.11-0.10.2.0/config/server.properties
# ZooKeeper connection. By default Kafka uses ZooKeeper's root path (/), which scatters Kafka's znodes across the root; if other applications share the ZooKeeper cluster, browsing its data becomes confusing, so specifying a chroot path directly in zookeeper.connect is strongly recommended
zookeeper.connect=localhost:2181,localhost:12181,localhost:22181/kafka
# every Kafka broker needs a unique ID
broker.id=0
# multiple brokers on one machine must listen on different ports
port=9092
# with multiple network interfaces, different brokers can also be bound to different interfaces
host.name=localhost
# multiple brokers on one machine also need separate log directories to avoid conflicts
log.dirs=/tmp/kafka-logs-1
# allow topic deletion; this line is commented out by default and must be uncommented
delete.topic.enable=true
# disable automatic topic creation; by default, a producer sending messages to a nonexistent topic creates it
auto.create.topics.enable=false
Copy the configuration for a second broker:
cp /install-path/kafka_2.11-0.10.2.0/config/server.properties /install-path/kafka_2.11-0.10.2.0/config/server-2.properties
Edit server-2.properties:
vi /install-path/kafka_2.11-0.10.2.0/config/server-2.properties
# ZooKeeper connection. By default Kafka uses ZooKeeper's root path (/), which scatters Kafka's znodes across the root; if other applications share the ZooKeeper cluster, browsing its data becomes confusing, so specifying a chroot path directly in zookeeper.connect is strongly recommended
zookeeper.connect=localhost:2181,localhost:12181,localhost:22181/kafka
# every Kafka broker needs a unique ID
broker.id=1
# multiple brokers on one machine must listen on different ports
port=19092
# with multiple network interfaces, different brokers can also be bound to different interfaces
host.name=localhost
# multiple brokers on one machine also need separate log directories to avoid conflicts
log.dirs=/tmp/kafka-logs-2
# allow topic deletion; this line is commented out by default and must be uncommented
delete.topic.enable=true
# disable automatic topic creation; by default, a producer sending messages to a nonexistent topic creates it
auto.create.topics.enable=false
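For quick reference, the only keys that differ between the two broker configurations above are:

```
# server.properties (broker.id=0)     # server-2.properties (broker.id=1)
broker.id=0                           broker.id=1
port=9092                             port=19092
log.dirs=/tmp/kafka-logs-1            log.dirs=/tmp/kafka-logs-2
```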
Note: since zookeeper.connect specifies /kafka as the chroot, every later ZooKeeper connection must include /kafka, e.g. to describe a topic:
bin/kafka-topics.sh --describe --zookeeper localhost:2181/kafka --topic test
To configure more brokers, just add further server.properties files.
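Additional broker configs can be stamped out by rewriting the three per-broker keys with sed; a sketch, using a dummy file in a temporary directory in place of the real config (the third broker's ID, port, and log directory here are hypothetical):

```shell
DIR=$(mktemp -d)
# dummy stand-in for config/server.properties, holding the per-broker keys
printf 'broker.id=0\nport=9092\nlog.dirs=/tmp/kafka-logs-1\n' > "$DIR/server.properties"

# derive a config for a hypothetical third broker (broker.id=2, port 29092)
sed -e 's/^broker\.id=.*/broker.id=2/' \
    -e 's/^port=.*/port=29092/' \
    -e 's|^log\.dirs=.*|log.dirs=/tmp/kafka-logs-3|' \
    "$DIR/server.properties" > "$DIR/server-3.properties"

cat "$DIR/server-3.properties"
```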
2.3 Startup
kafka_2.11-0.10.2.0/bin/kafka-server-start.sh kafka_2.11-0.10.2.0/config/server.properties &
kafka_2.11-0.10.2.0/bin/kafka-server-start.sh kafka_2.11-0.10.2.0/config/server-2.properties &
2.4 Shutdown
kafka_2.11-0.10.2.0/bin/kafka-server-stop.sh
2.5 Creating a topic
# create a new topic; replication-factor is how many brokers keep a copy of the topic (2 here, so it is stored on two brokers)
kafka_2.11-0.10.2.0/bin/kafka-topics.sh --create --topic test --partitions 1 --replication-factor 2 --zookeeper localhost:2181/kafka
2.6 Describing a topic
kafka_2.11-0.10.2.0/bin/kafka-topics.sh --describe --zookeeper localhost:2181/kafka --topic test
2.7 Listing topics
# list the topics that have been created
kafka_2.11-0.10.2.0/bin/kafka-topics.sh --list --zookeeper localhost:2181/kafka
2.8 Starting a producer
# open a new connection
kafka_2.11-0.10.2.0/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
2.9 Starting a consumer
# open another ssh session and run:
kafka_2.11-0.10.2.0/bin/kafka-console-consumer.sh --zookeeper localhost:2181/kafka --topic test --from-beginning
2.10 Testing
Type a message in the producer session:
test
Then check that the message arrives in the consumer session.
3 Common Kafka Commands
3.1 Altering a topic
./bin/kafka-topics.sh --zookeeper localhost:2181/kafka --alter --topic test --partitions 2
3.2 Deleting a topic
./bin/kafka-topics.sh --zookeeper localhost:2181/kafka --delete --topic test
4 Kafka Configuration Parameters
The configuration file is config/server.properties.
You will likely need to adjust the following settings:
broker.id                   integer; recommended to derive it from the host IP
log.dirs                    path where Kafka stores message files; default /tmp/kafka-logs
port                        port on which the broker accepts producer connections
zookeeper.connect           ZooKeeper connection string, in the form ip1:port,ip2:port,ip3:port
message.max.bytes           maximum size of a single message
num.network.threads         threads the broker uses to handle network requests; 3 if unset, 2 in the shipped server.properties
num.io.threads              I/O threads the broker uses to execute requests; 8 if unset, 2 in the shipped server.properties; can be increased
queued.max.requests         requests queued while waiting for I/O threads; default 500
host.name                   the broker's hostname; default null; recommended to set it to the host IP, otherwise consumers without a hosts entry run into trouble
num.partitions              default number of partitions per topic; default 1
log.retention.hours         hours a message is retained before deletion; default one week (168 hours)
auto.create.topics.enable   whether topics may be created automatically; default true, recommended false
default.replication.factor  number of message replicas; default 1 (no replication), recommended to increase
num.replica.fetchers        I/O threads replicating messages from leader to follower; default 1