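Dual-Master, Dual-Slave Cluster Mode

In production, a RocketMQ cluster should have no single point of failure, so for high availability it is built in dual-master, dual-slave mode. This deployment needs four machines: two each run a Broker master plus a NameServer, and the other two each run a Broker slave plus a NameServer.

Topology of the RocketMQ dual-master, dual-slave cluster: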


Environment Preparation

Machines

Because we are building a dual-master, dual-slave cluster, we first need four machines, as listed below:

| IP address | Hostname | Role | Memory | CPU |
| :-------- | :-------- | :------ | :------ | :------ |
| 192.168.243.169 | rocketmq01 | NameServer & Master1 | 4G | 2 cores |
| 192.168.243.170 | rocketmq02 | NameServer & Master2 | 4G | 2 cores |
| 192.168.243.171 | rocketmq03 | NameServer & Slave of Master1 | 4G | 2 cores |
| 192.168.243.172 | rocketmq04 | NameServer & Slave of Master2 | 4G | 2 cores |

Configure the hosts file on all four machines as follows:

$ vim /etc/hosts
192.168.243.169 rocketmq01 rocketmq-nameserver1 rocketmq-master1
192.168.243.170 rocketmq02 rocketmq-nameserver2 rocketmq-master2
192.168.243.171 rocketmq03 rocketmq-nameserver3 rocketmq-master1-slave1
192.168.243.172 rocketmq04 rocketmq-nameserver4 rocketmq-master2-slave1
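
Every later step refers to the machines by these aliases, so it may be worth confirming on each node that none of them is missing. The following is only a minimal sketch: check_hosts is a hypothetical helper name, and the alias list is copied from the entries above.

```shell
# check_hosts is a hypothetical helper, not part of RocketMQ.
# It prints every expected alias that is missing from the given hosts file
# and returns non-zero if at least one is missing.
check_hosts() {
    local hosts_file missing name
    hosts_file="${1:-/etc/hosts}"
    missing=0
    for name in rocketmq-nameserver1 rocketmq-nameserver2 \
                rocketmq-nameserver3 rocketmq-nameserver4 \
                rocketmq-master1 rocketmq-master2 \
                rocketmq-master1-slave1 rocketmq-master2-slave1; do
        # Match the alias as a whole word on a line that is not commented out.
        if ! grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$hosts_file"; then
            echo "missing: ${name}"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_hosts` with no argument reads /etc/hosts; a silent return means all eight aliases were found.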

Building and Installing RocketMQ

RocketMQ has to be installed on every machine; rocketmq01 is used here as the example. The first step is to set up Java and Maven on all machines:

[root@rocketmq01 ~]# java -version
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)
[root@rocketmq01 ~]# mvn -v
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/maven
Java version: 1.8.0_261, vendor: Oracle Corporation, runtime: /usr/local/jdk/1.8/jre
Default locale: zh_CN, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-1062.el7.x86_64", arch: "amd64", family: "unix"
[root@rocketmq01 ~]# 
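  • **Tips:** JDK 1.8 is the safest choice, because the startup scripts of the current RocketMQ release are written for 1.8; with a newer JDK you have to modify the startup scripts yourself, which is tedious.

Download the latest source package as described in the official documentation: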

  • http://rocketmq.apache.org/docs/quick-start/

Then upload it to the server:

[root@rocketmq01 /usr/local/src]# ls
rocketmq-all-4.7.1-source-release.zip
[root@rocketmq01 /usr/local/src]# 

Unpack the source package:

[root@rocketmq01 /usr/local/src]# unzip rocketmq-all-4.7.1-source-release.zip
[root@rocketmq01 /usr/local/src]# cd rocketmq-all-4.7.1-source-release
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# ls
acl     BUILDING  common           dev           docs     filter   logappender  namesrv  openmessaging  README.md  srvutil  style  tools
broker  client    CONTRIBUTING.md  distribution  example  LICENSE  logging      NOTICE   pom.xml        remoting   store    test
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# 

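Layout of the RocketMQ source tree:

  • broker: core business logic, including sending and receiving messages, master-slave synchronization, and page cache handling
  • client: client-side interfaces, such as producers and consumers
  • example: sample code, such as producers and consumers
  • common: shared data structures and the like
  • distribution: packaging module that produces the build output
  • filter: broker-side filtering that avoids transmitting messages consumers are not interested in, reducing bandwidth pressure
  • logappender, logging: logging support
  • namesrv: the NameServer service, used for service coordination
  • openmessaging: implementation of the OpenMessaging API for external use
  • remoting: remote-call interfaces that wrap the underlying Netty communication
  • srvutil: common utility methods, such as command-line argument parsing
  • store: message storage
  • tools: management tools, such as the well-known mqadmin

Then compile the source with the following command: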

[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# mvn -Prelease-all -DskipTests clean install -U

When the build succeeds, the output ends as follows, with every module reporting SUCCESS:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache RocketMQ 4.7.1 4.7.1:
[INFO] 
[INFO] Apache RocketMQ 4.7.1 .............................. SUCCESS [04:15 min]
[INFO] rocketmq-logging 4.7.1 ............................. SUCCESS [ 24.843 s]
[INFO] rocketmq-remoting 4.7.1 ............................ SUCCESS [ 13.108 s]
[INFO] rocketmq-common 4.7.1 .............................. SUCCESS [ 26.304 s]
[INFO] rocketmq-client 4.7.1 .............................. SUCCESS [ 11.852 s]
[INFO] rocketmq-store 4.7.1 ............................... SUCCESS [  8.760 s]
[INFO] rocketmq-srvutil 4.7.1 ............................. SUCCESS [  0.855 s]
[INFO] rocketmq-filter 4.7.1 .............................. SUCCESS [  6.202 s]
[INFO] rocketmq-acl 4.7.1 ................................. SUCCESS [  7.349 s]
[INFO] rocketmq-broker 4.7.1 .............................. SUCCESS [  2.162 s]
[INFO] rocketmq-tools 4.7.1 ............................... SUCCESS [  1.289 s]
[INFO] rocketmq-namesrv 4.7.1 ............................. SUCCESS [  0.628 s]
[INFO] rocketmq-logappender 4.7.1 ......................... SUCCESS [  3.754 s]
[INFO] rocketmq-openmessaging 4.7.1 ....................... SUCCESS [ 26.613 s]
[INFO] rocketmq-example 4.7.1 ............................. SUCCESS [  0.729 s]
[INFO] rocketmq-test 4.7.1 ................................ SUCCESS [ 14.090 s]
[INFO] rocketmq-distribution 4.7.1 ........................ SUCCESS [01:20 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:08 min
[INFO] Finished at: 2020-11-30T10:40:29+08:00
[INFO] ------------------------------------------------------------------------

The packaged distribution is then available under distribution/target/:

[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# ls distribution/target/
archive-tmp  checkstyle-cachefile  checkstyle-checker.xml  checkstyle-result.xml  maven-shared-archive-resources  rocketmq-4.7.1  rocketmq-4.7.1.tar.gz  rocketmq-4.7.1.zip
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# 

Extract the distribution to a suitable directory and change into it:

[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# tar -zxvf distribution/target/rocketmq-4.7.1.tar.gz -C /usr/local
[root@rocketmq01 /usr/local/src/rocketmq-all-4.7.1-source-release]# cd /usr/local/rocketmq-4.7.1/
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# ls
benchmark  bin  conf  lib  LICENSE  NOTICE  README.md
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# 

Create the data storage directories:

[root@rocketmq01 /usr/local/rocketmq-4.7.1]# mkdir -p store/commitlog store/consumequeue store/index
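  • commitlog: where the messages that producers deliver to RocketMQ are stored
  • consumequeue: stores offset data that indexes into the commitlog
  • index: index files used for random reads

Create a log directory and point the logging configuration files at the installation directory instead of the user's home directory: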

[root@rocketmq01 /usr/local/rocketmq-4.7.1]# mkdir logs
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# sed -i 's#${user.home}#/usr/local/rocketmq-4.7.1#g' conf/*.xml
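
The substitution above works because `$`, `{` and `}` happen to be literal in that position of a basic regular expression. As a hedged alternative, the sketch below escapes the special characters explicitly and then checks that no placeholder is left. relocate_logs is a hypothetical helper name, and `sed -i` is used in its GNU form, as in the command above.

```shell
# relocate_logs is a hypothetical helper: it replaces every ${user.home}
# placeholder in the XML files under conf_dir with target_dir.
# target_dir must not contain '#' or '&', which are special to this sed call.
relocate_logs() {
    local conf_dir target_dir
    conf_dir="$1"
    target_dir="$2"
    # \$ and \. are escaped so that they match literally.
    sed -i 's#\${user\.home}#'"$target_dir"'#g' "$conf_dir"/*.xml
    # Succeed only if no placeholder remains in any file.
    ! grep -q 'user\.home' "$conf_dir"/*.xml
}
```

For this installation the call would be `relocate_logs conf /usr/local/rocketmq-4.7.1`, run from the installation directory.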

If the machine has little memory (less than 8G), adjust the JVM parameters in the startup scripts to fit, but do not set the heap below 1G, and keep the young generation (-Xmn) smaller than the total heap:

[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim bin/runbroker.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g -Xmn1g"
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim bin/runserver.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g -Xmn1g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=320m"
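
When the same change has to be repeated on several machines, editing with vim each time is error-prone. The following is a rough sketch of a non-interactive edit: set_heap is a hypothetical helper name, and it assumes the heap flags sit on a line containing `JAVA_OPT=` in the form `-Xms<N><unit>`, as in the lines shown above.

```shell
# set_heap is a hypothetical helper that rewrites the -Xms/-Xmx/-Xmn flags
# on the JAVA_OPT lines of a start script and leaves all other flags alone.
# Arguments: script path, total heap size (e.g. 2g), young generation size (e.g. 1g).
set_heap() {
    local script heap young
    script="$1"
    heap="$2"
    young="$3"
    sed -i -E \
        -e "/JAVA_OPT=/ s/-Xms[0-9]+[kmgKMG]/-Xms${heap}/" \
        -e "/JAVA_OPT=/ s/-Xmx[0-9]+[kmgKMG]/-Xmx${heap}/" \
        -e "/JAVA_OPT=/ s/-Xmn[0-9]+[kmgKMG]/-Xmn${young}/" \
        "$script"
}
```

For example, `set_heap bin/runbroker.sh 2g 1g` would set a 2G heap with a 1G young generation.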

Then distribute the RocketMQ installation directory to the other machines:

[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq02:/usr/local/rocketmq-4.7.1
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq03:/usr/local/rocketmq-4.7.1
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1 rocketmq04:/usr/local/rocketmq-4.7.1

Configure the environment variables; this is needed on every machine:

[root@rocketmq01 /usr/local/rocketmq-4.7.1]# vim /etc/profile
export ROCKETMQ_HOME=/usr/local/rocketmq-4.7.1
export PATH=$PATH:$ROCKETMQ_HOME/bin
[root@rocketmq01 /usr/local/rocketmq-4.7.1]# source /etc/profile

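Deploying the Dual-Master, Dual-Slave Cluster

Preparing the Configuration Files

Official documentation on the configuration options: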

  • https://github.com/apache/rocketmq/blob/master/docs/cn/best_practice.md

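With RocketMQ installed on all four machines, we can deploy the dual-master, dual-slave cluster. The deployment is straightforward: it comes down to preparing the right configuration files. Again taking rocketmq01 as the example, first empty the following files: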

[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a.properties
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b.properties

Then edit broker-a.properties so that it contains:

# Name of the cluster this broker belongs to
brokerClusterName=rocketmq-cluster
# Broker name; note that each Master's configuration file uses a different value
brokerName=broker-a
# 0 means Master, >0 means Slave
brokerId=0
# NameServer addresses, separated by semicolons
namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876
# Default number of queues when a topic missing on the server is created automatically on send
defaultTopicQueueNums=4
# Whether the Broker may create topics automatically; recommended on outside production, off in production
autoCreateTopicEnable=true
# Whether the Broker may create subscription groups automatically; recommended on outside production, off in production
autoCreateSubscriptionGroup=true
# Port on which the Broker serves clients
listenPort=10911
# Time of day at which expired files are deleted; 4 a.m. by default
deleteWhen=04
# File retention time in hours (set to 120 here)
fileReservedTime=120
# Size of each commitLog file; 1G by default
mapedFileSizeCommitLog=1073741824
# Each ConsumeQueue file holds 300,000 entries by default; adjust to your workload
mapedFileSizeConsumeQueue=300000
#destroyMapedFileIntervalForcibly=120000
#redeleteHangedFileInterval=120000
# Disk usage threshold (percent) checked for the physical files
diskMaxUsedSpaceRatio=88
# Root storage path
storePathRootDir=/usr/local/rocketmq-4.7.1/store
# commitLog storage path
storePathCommitLog=/usr/local/rocketmq-4.7.1/store/commitlog
# Consume queue storage path
storePathConsumeQueue=/usr/local/rocketmq-4.7.1/store/consumequeue
# Message index storage path
storePathIndex=/usr/local/rocketmq-4.7.1/store/index
# checkpoint file storage path
storeCheckpoint=/usr/local/rocketmq-4.7.1/store/checkpoint
# abort file storage path
abortFile=/usr/local/rocketmq-4.7.1/store/abort
# Maximum message size
maxMessageSize=65536
#flushCommitLogLeastPages=4
#flushConsumeQueueLeastPages=2
#flushCommitLogThoroughInterval=10000
#flushConsumeQueueThoroughInterval=60000
# Broker role
# - ASYNC_MASTER: Master with asynchronous replication
# - SYNC_MASTER: Master with synchronous double write
# - SLAVE
brokerRole=SYNC_MASTER
# Flush mode
# - ASYNC_FLUSH: asynchronous flush
# - SYNC_FLUSH: synchronous flush
flushDiskType=ASYNC_FLUSH
#checkTransactionMessageEnable=false
# Size of the send-message thread pool
#sendMessageThreadPoolNums=128
# Size of the pull-message thread pool
#pullMessageThreadPoolNums=128

broker-b.properties is essentially the same as broker-a.properties; only brokerName needs to change:

brokerName=broker-b
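
Since only one line differs, the second master file can be generated from the first instead of being edited by hand. A minimal sketch, in which derive_master is a hypothetical helper name:

```shell
# derive_master is a hypothetical helper: it copies a master config and
# changes only brokerName, the one setting that differs between the masters.
# Arguments: source file, destination file, new broker name.
derive_master() {
    local src dst name
    src="$1"
    dst="$2"
    name="$3"
    sed "s/^brokerName=.*/brokerName=${name}/" "$src" > "$dst"
}
```

Run from the conf/2m-2s-sync directory, `derive_master broker-a.properties broker-b.properties broker-b` would produce the second file.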

That completes the master configuration. Next comes the slave configuration; first empty the following files:

[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a-s.properties
[root@rocketmq01 ~]# echo "" > /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b-s.properties

Edit broker-a-s.properties so that it contains:

# Name of the cluster this broker belongs to
brokerClusterName=rocketmq-cluster
# Broker name; a Slave is paired with its Master through brokerName
brokerName=broker-a
# 0 means Master, >0 means Slave
brokerId=1
# NameServer addresses, separated by semicolons
namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876
# Default number of queues when a topic missing on the server is created automatically on send
defaultTopicQueueNums=4
# Whether the Broker may create topics automatically; recommended on outside production, off in production
autoCreateTopicEnable=true
# Whether the Broker may create subscription groups automatically; recommended on outside production, off in production
autoCreateSubscriptionGroup=true
# Port on which the Broker serves clients
listenPort=10911
# Time of day at which expired files are deleted; 4 a.m. by default
deleteWhen=04
# File retention time in hours (set to 120 here)
fileReservedTime=120
# Size of each commitLog file; 1G by default
mapedFileSizeCommitLog=1073741824
# Each ConsumeQueue file holds 300,000 entries by default; adjust to your workload
mapedFileSizeConsumeQueue=300000
#destroyMapedFileIntervalForcibly=120000
#redeleteHangedFileInterval=120000
# Disk usage threshold (percent) checked for the physical files
diskMaxUsedSpaceRatio=88
# Root storage path
storePathRootDir=/usr/local/rocketmq-4.7.1/store
# commitLog storage path
storePathCommitLog=/usr/local/rocketmq-4.7.1/store/commitlog
# Consume queue storage path
storePathConsumeQueue=/usr/local/rocketmq-4.7.1/store/consumequeue
# Message index storage path
storePathIndex=/usr/local/rocketmq-4.7.1/store/index
# checkpoint file storage path
storeCheckpoint=/usr/local/rocketmq-4.7.1/store/checkpoint
# abort file storage path
abortFile=/usr/local/rocketmq-4.7.1/store/abort
# Maximum message size
maxMessageSize=65536
#flushCommitLogLeastPages=4
#flushConsumeQueueLeastPages=2
#flushCommitLogThoroughInterval=10000
#flushConsumeQueueThoroughInterval=60000
# Broker role
# - ASYNC_MASTER: Master with asynchronous replication
# - SYNC_MASTER: Master with synchronous double write
# - SLAVE
brokerRole=SLAVE
# Flush mode
# - ASYNC_FLUSH: asynchronous flush
# - SYNC_FLUSH: synchronous flush
flushDiskType=ASYNC_FLUSH
#checkTransactionMessageEnable=false
# Size of the send-message thread pool
#sendMessageThreadPoolNums=128
# Size of the pull-message thread pool
#pullMessageThreadPoolNums=128

Likewise, broker-b-s.properties is essentially the same as broker-a-s.properties; only brokerName needs to change:

brokerName=broker-b
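
Comparing the master and slave listings, the effective settings differ only in brokerId and brokerRole, so a slave file can be derived from its master file. The sketch below assumes exactly that; derive_slave is a hypothetical helper name, and brokerId=1 is the value used for the slaves in this article.

```shell
# derive_slave is a hypothetical helper: it copies a master config and turns
# it into the matching slave config. brokerName is left untouched on purpose,
# because a slave is paired with its master through the same brokerName.
# Arguments: master config file, slave config file to write.
derive_slave() {
    local src dst
    src="$1"
    dst="$2"
    sed -e 's/^brokerId=.*/brokerId=1/' \
        -e 's/^brokerRole=.*/brokerRole=SLAVE/' \
        "$src" > "$dst"
}
```

For example, `derive_slave broker-b.properties broker-b-s.properties` would write the slave file for the second master.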

With the configuration files ready, distribute the directory that holds them to the other nodes:

[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq02:/usr/local/rocketmq-4.7.1/conf
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq03:/usr/local/rocketmq-4.7.1/conf
[root@rocketmq01 ~]# scp -r /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/ rocketmq04:/usr/local/rocketmq-4.7.1/conf

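Starting the Cluster

Once the configuration files have been distributed and verified, the cluster can be started. First, start the NameServer on all four machines with the following command: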

$ nohup sh mqnamesrv &

Start the Master Broker on rocketmq01:

[root@rocketmq01 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a.properties >/dev/null 2>&1 &

Start the Master Broker on rocketmq02:

[root@rocketmq02 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b.properties >/dev/null 2>&1 &

Start the Slave Broker on rocketmq03:

[root@rocketmq03 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-a-s.properties >/dev/null 2>&1 &

Start the Slave Broker on rocketmq04:

[root@rocketmq04 ~]# nohup sh mqbroker -c /usr/local/rocketmq-4.7.1/conf/2m-2s-sync/broker-b-s.properties >/dev/null 2>&1 &

Check that the NameServer and Broker processes are running and that their ports are listening:

[root@rocketmq01 ~]# jps
[root@rocketmq01 ~]# netstat -ntlp |grep java
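
Reading the netstat output by eye on four machines gets repetitive. As a rough sketch, the helper below checks the output for specific ports, here 9876 (the NameServer port used in namesrvAddr) and 10911 (the configured listenPort). ports_listening is a hypothetical name, and the matching assumes the usual `address:port` column layout of netstat.

```shell
# ports_listening is a hypothetical helper. It reads `netstat -ntlp` output
# on stdin and reports every port given as an argument that does not appear
# as a local listening address. Returns non-zero if any port is missing.
ports_listening() {
    local listing port status
    listing=$(cat)
    status=0
    for port in "$@"; do
        # The port must follow ':' (or '.') and be followed by whitespace,
        # so 10911 does not accidentally satisfy a check for 109.
        if ! printf '%s\n' "$listing" | grep -Eq "[:.]${port}[[:space:]]"; then
            echo "not listening: ${port}"
            status=1
        fi
    done
    return "$status"
}
```

A typical call would be `netstat -ntlp | grep java | ports_listening 9876 10911`.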

The NameServer and Broker logs can be viewed with the following commands:

[root@rocketmq01 ~]# tail -f -n 500 /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/broker.log
[root@rocketmq01 ~]# tail -f -n 500 /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/namesrv.log

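When the Broker can reach the NameServers, broker.log contains registration (heartbeat) entries like the following, which show that the nodes are communicating normally: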

2020-12-02 11:37:38 INFO brokerOutApi_thread_1 - register broker[0]to name server rocketmq-nameserver1:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_3 - register broker[0]to name server rocketmq-nameserver2:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_2 - register broker[0]to name server rocketmq-nameserver3:9876 OK
2020-12-02 11:37:38 INFO brokerOutApi_thread_4 - register broker[0]to name server rocketmq-nameserver4:9876 OK
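
To confirm that a broker has registered with all four NameServers without scrolling through the log, the entries can be reduced to a list of distinct NameServer addresses. This is a small sketch that relies on the log format shown above; registered_nameservers is a hypothetical helper name.

```shell
# registered_nameservers is a hypothetical helper. It reads broker.log on
# stdin and prints, once each, the NameServers for which a
# "register broker[N]to name server <addr> OK" entry was found.
registered_nameservers() {
    sed -n 's/.*register broker\[[0-9]*\]to name server \([^ ]*\) OK.*/\1/p' | sort -u
}
```

For example, `registered_nameservers < /usr/local/rocketmq-4.7.1/logs/rocketmqlogs/broker.log` should list four addresses on a healthy node.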

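Setting Up the RocketMQ Console

Next we set up a RocketMQ console on any one of the nodes. RocketMQ officially provides a visual console built on Spring Boot, which makes it easy to inspect how RocketMQ is running and improves operational efficiency. RocketMQ offers a number of extension components in the following repository, and the console we need is among them: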

  • https://github.com/apache/rocketmq-externals/tree/master/

Because the console is written with Spring Boot, we clone the source and adjust the relevant configuration before using it:

[root@rocketmq01 /usr/local/src]# git clone https://github.com/apache/rocketmq-externals.git

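Edit the application.properties configuration file in the rocketmq-console project. Here I mainly changed the listening port and the NameServer addresses; other options can be changed as needed by following their descriptions: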

[root@rocketmq01 /usr/local/src]# cd rocketmq-externals/rocketmq-console/
[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# vim src/main/resources/application.properties
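# Listening port of the console; the default is 8080
server.port=8999
# NameServer addresses; optional, since they can also be set after the console starts, under Ops > NameSvrAddrList in the navigation bar
rocketmq.config.namesrvAddr=rocketmq-nameserver1:9876;rocketmq-nameserver2:9876;rocketmq-nameserver3:9876;rocketmq-nameserver4:9876

Then build and package it with the following command: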

[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# mvn clean package -DskipTests
...

[INFO] Replacing main artifact with repackaged archive
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  04:52 min
[INFO] Finished at: 2020-11-30T14:26:20+08:00
[INFO] ------------------------------------------------------------------------

Once packaging finishes, the RocketMQ console can be started:

[root@rocketmq01 /usr/local/src/rocketmq-externals/rocketmq-console]# java -jar target/rocketmq-console-ng-2.0.0.jar

While running, the RocketMQ console may print the following error:

ERROR Exception caught: mqAdminExt get broker stats data TOPIC_PUT_NUMS failed

This error does not affect normal operation; for the cause, see the following article:

  • https://juejin.cn/post/6870762347166728200

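Open the console in a browser; when everything is working, you will see the following page:

The "Cluster" page shows the information for each node in the cluster, which confirms that the cluster has been built successfully: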


Stopping the Cluster

Stopping the cluster works the same way as stopping a single node. First, shut down all the Broker servers:

$ sh mqshutdown broker
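  • **Tips:** This has to be run on several nodes, which becomes tedious as the node count grows; you can try writing a script for one-key cluster start and stop.

Then shut down all the NameServers: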

$ sh mqshutdown namesrv
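
The one-key start/stop script mentioned in the tip could look roughly like the sketch below. Everything in it is an assumption rather than something RocketMQ ships: the names cluster_ctl and run_on are hypothetical, it presumes passwordless SSH from the control machine to all four nodes, and it sources /etc/profile on the remote side so that the environment variables configured earlier are available. Setting DRY_RUN=1 only prints the commands, which is a safe way to review them first.

```shell
# Hypothetical one-key start/stop sketch; not part of RocketMQ.
MQ_HOME=/usr/local/rocketmq-4.7.1
CONF_DIR=$MQ_HOME/conf/2m-2s-sync
# host:config pairs, matching the start commands used in this article
NODES="rocketmq01:broker-a rocketmq02:broker-b rocketmq03:broker-a-s rocketmq04:broker-b-s"

# Run a command on a host over SSH, or only print it when DRY_RUN=1.
run_on() {
    local host cmd
    host="$1"
    shift
    cmd=". /etc/profile; $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh $host \"$cmd\""
    else
        # -n keeps ssh from reading this script's standard input
        ssh -n "$host" "$cmd"
    fi
}

cluster_ctl() {
    local node
    case "$1" in
    start)
        # NameServers first, so the brokers have something to register with.
        for node in $NODES; do
            run_on "${node%%:*}" "nohup sh $MQ_HOME/bin/mqnamesrv >/dev/null 2>&1 </dev/null &"
        done
        for node in $NODES; do
            run_on "${node%%:*}" "nohup sh $MQ_HOME/bin/mqbroker -c $CONF_DIR/${node#*:}.properties >/dev/null 2>&1 </dev/null &"
        done
        ;;
    stop)
        # Same order as the manual steps: brokers first, then NameServers.
        for node in $NODES; do
            run_on "${node%%:*}" "sh $MQ_HOME/bin/mqshutdown broker"
        done
        for node in $NODES; do
            run_on "${node%%:*}" "sh $MQ_HOME/bin/mqshutdown namesrv"
        done
        ;;
    *)
        echo "usage: cluster_ctl start|stop" >&2
        return 1
        ;;
    esac
}
```

After sourcing the file, `DRY_RUN=1 cluster_ctl stop` prints the eight SSH commands it would run, and `cluster_ctl stop` executes them.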