In the previous Kafka consumer implementation, each record was processed as soon as it was consumed, one at a time. When processing means an HTTP call, or the work can naturally be done in bulk, handling records one by one is wasteful. This revision changes the blocking queue's element type to a List, so each handler thread can process a whole batch per poll.
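Before diving into the classes, the core idea can be sketched with plain JDK types: the blocking queue now carries whole batches (`List<String>`) instead of single records, so one poll on the handler side yields many records. A minimal sketch (the threshold and record values are illustrative, not taken from the real listener):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;

public class BatchQueueSketch {
    static final int UP_SIZE = 5; // flush threshold, mirrors upSize in the listener below

    // Packs incoming records into batches and offers each full batch to the queue.
    // Returns the number of batches enqueued.
    static int pack(ArrayBlockingQueue<List<String>> queue, List<String> records) {
        List<String> buffer = new ArrayList<>();
        int batches = 0;
        for (String record : records) {
            buffer.add(record);
            if (buffer.size() >= UP_SIZE) {           // batch is full: hand it off
                queue.offer(new ArrayList<>(buffer)); // copy so the buffer can be reused
                buffer.clear();
                batches++;
            }
        }
        if (!buffer.isEmpty()) {                      // flush the trailing partial batch
            queue.offer(new ArrayList<>(buffer));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(500);
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            records.add("offset_" + i);
        }
        int batches = pack(queue, records);
        System.out.println(batches);         // 12 records -> 3 batches (5 + 5 + 2)
        List<String> first = queue.poll();   // one poll now yields a whole batch
        System.out.println(first.size());    // 5
    }
}
```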
The consumer manager owns all consumer threads and data-handling threads, and provides start/stop for the consumer listeners:
public class OptimizeConsumerManager {
    // Blocking queue that stores batches of consumed records
    private ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(500);
    private Map<String, List<OptimizeMessageHandleThread>> messageHandleThreadListMap = new HashMap<>();
    private Map<String, List<MyOptimizeSpringKafkaContainer>> containerMap = new HashMap<>();
    private List<MyOptimizeMessageListener> optimizeMessageListeners = new ArrayList<>();
    // Lock object: handler threads call wait() on it when there is no data
    private Object waitObject = new Object();
    // Number of consumers
    private int consumerSize = 2;
    // Number of message-handling threads
    private int handleThreadSize = 2;
    private String topic;
    private String groupId;
    private String kafkaAddress;

    public OptimizeConsumerManager(int consumerSize, int handleThreadSize, String topic, String kafkaAddress, String groupId) {
        this.consumerSize = consumerSize > 0 ? consumerSize : 2;
        this.handleThreadSize = handleThreadSize > 0 ? handleThreadSize : 2;
        this.topic = topic;
        this.kafkaAddress = kafkaAddress;
        this.groupId = groupId;
    }

    /**
     * Start the configured number of consumers and handler threads for this group
     */
    public void startConsumeAndHandle() {
        List<MyOptimizeSpringKafkaContainer> containerList = new ArrayList<>();
        // Start the consumers
        for (int i = 0; i < consumerSize; i++) {
            // The listener is an AcknowledgingMessageListener (required when AckMode is MANUAL_IMMEDIATE or MANUAL)
            MyOptimizeMessageListener myMessageListener = new MyOptimizeMessageListener(groupId + "_" + i, waitObject, queue);
            optimizeMessageListeners.add(myMessageListener);
            MyOptimizeSpringKafkaContainer mySpringKafkaContainer = new MyOptimizeSpringKafkaContainer(myMessageListener);
            mySpringKafkaContainer.initContainer(kafkaAddress, topic, groupId);
            // Start consuming
            mySpringKafkaContainer.startKafkaListen();
            containerList.add(mySpringKafkaContainer);
        }
        // Start the data-handling threads
        List<OptimizeMessageHandleThread> list = new ArrayList<>();
        for (int i = 0; i < handleThreadSize; i++) {
            OptimizeMessageHandleThread messageHandleThread = new OptimizeMessageHandleThread(groupId + "_" + i, waitObject, queue, this);
            list.add(messageHandleThread);
            messageHandleThread.start();
        }
        messageHandleThreadListMap.put(groupId, list);
        containerMap.put(groupId, containerList);
    }

    public void stopConsumer(String groupId) {
        if (CollectionUtils.isNotEmpty(messageHandleThreadListMap.get(groupId))) {
            messageHandleThreadListMap.get(groupId).forEach(item -> {
                System.out.println("Stopping handler thread: " + item.getName());
                item.stopHandle();
                if (!item.isInterrupted()) {
                    item.interrupt();
                }
            });
            messageHandleThreadListMap.remove(groupId);
            containerMap.get(groupId).forEach(MyOptimizeSpringKafkaContainer::stopKafkaListen);
            containerMap.remove(groupId);
        }
        queue.clear();
        waitObject = null;
    }

    /**
     * Called by a handler thread whose first poll came back empty:
     * flushes every messageList that has not yet reached the batch threshold (upSize) into the queue.
     */
    public void addMessageToQueueImmediately() {
        optimizeMessageListeners.forEach(item -> item.addMessageToQueue(true));
    }
}
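`stopConsumer` relies on two mechanisms to end a handler thread: a running flag for the normal loop exit, and `interrupt()` to break out of a blocking poll or wait. A stripped-down sketch of that shutdown pattern (class and method names here are illustrative, not from the code above):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class StoppableWorker extends Thread {
    private volatile boolean running = true;  // volatile so the stop request is seen across threads
    private volatile int handled = 0;
    private final ArrayBlockingQueue<String> queue;

    public StoppableWorker(ArrayBlockingQueue<String> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        while (running && !isInterrupted()) {
            try {
                // Blocking poll; interrupt() breaks out of it immediately
                String item = queue.poll(1, TimeUnit.SECONDS);
                if (item != null) {
                    handled++;
                }
            } catch (InterruptedException e) {
                // Interrupted during poll: restore the flag so the loop condition exits
                Thread.currentThread().interrupt();
            }
        }
    }

    public int handledCount() {
        return handled;
    }

    public void shutdown() {
        running = false;  // normal loop exit
        interrupt();      // break out of a blocking poll right away
        try {
            join(5000);   // wait for the thread to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        StoppableWorker worker = new StoppableWorker(queue);
        worker.start();
        queue.offer("one");
        Thread.sleep(200);                        // give the worker time to drain the item
        worker.shutdown();
        System.out.println(worker.isAlive());     // false: thread has stopped
        System.out.println(worker.handledCount());
    }
}
```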
The container: initializes the consumer and its message listener, and starts/stops consumption.
public class MyOptimizeSpringKafkaContainer {
    /**
     * The Kafka listener container
     */
    private KafkaMessageListenerContainer<String, String> container;
    AcknowledgingMessageListener messageListener;

    public MyOptimizeSpringKafkaContainer(AcknowledgingMessageListener messageListener) {
        this.messageListener = messageListener;
    }

    /**
     * Initialize the consumer configuration
     */
    public void initContainer(String kafkaAddress, String topic, String groupId) {
        // Kafka consumer properties
        Map<String, Object> properties = new HashMap<>(10);
        // Kafka addresses: ip:9092,ip:9092
        properties.put("bootstrap.servers", kafkaAddress);
        // Consumer group id
        properties.put("group.id", groupId);
        // Disable auto-commit; offsets are committed manually
        properties.put("enable.auto.commit", "false");
        // Start from the latest data
        properties.put("auto.offset.reset", "latest");
        // 10 * 1024 * 1024: raise the per-partition fetch limit from the 1M default to 10M,
        // so an oversized single message can still be consumed
        properties.put("max.partition.fetch.bytes", "10485760");
        // Key deserializer
        properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Value deserializer
        properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put("max.poll.records", "10");
        properties.put("session.timeout.ms", "30000");
        properties.put("request.timeout.ms", "31000");
        properties.put("fetch.max.wait.ms", "1000");
        // Create the ContainerProperties
        ContainerProperties containerProperties = new ContainerProperties(topic);
        // Manual immediate commit; each AckMode requires a matching GenericMessageListener
        containerProperties.setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
        // Use an AcknowledgingMessageListener (required when AckMode is MANUAL_IMMEDIATE or MANUAL)
        containerProperties.setMessageListener(messageListener);
        // Create the ConsumerFactory
        ConsumerFactory<String, String> consumerFactory = new DefaultKafkaConsumerFactory<>(properties);
        // Create the container
        container = new KafkaMessageListenerContainer<>(consumerFactory, containerProperties);
        container.setAutoStartup(false);
    }

    /**
     * Start listening to Kafka
     */
    public void startKafkaListen() {
        if (container != null) {
            container.start();
        }
    }

    /**
     * Stop listening to Kafka
     */
    public void stopKafkaListen() {
        if (container != null) {
            container.stop();
        }
    }
}
The consumer listener: receives records and stores them into the queue.
/**
 * Optimization: the queue is an ArrayBlockingQueue&lt;List&lt;String&gt;&gt;, so a batch of records
 * is collected first and then processed together
 * @author zhongshucneng
 * @since 1.3.0
 */
public class MyOptimizeMessageListener implements AcknowledgingMessageListener<String, String> {
    private String listenerId;
    // Lock shared with the message-handling threads
    private Object waitObject;
    // Number of handler threads currently in wait() on waitObject
    private static Integer isHandleThreadWait = 0;
    private static Object object = new Object();
    // Blocking queue (thread-safe); all listeners of a group share the same queue
    private ArrayBlockingQueue<List<String>> queue;
    private List<String> messageList = new ArrayList<>();
    private Map<Integer, Acknowledgment> acknowledgmentMap = new HashMap<>();
    private int upSize = 5;
    // Marks whether the next consumed record should trigger an offset commit
    private boolean sendAcknowledgment = false;

    public MyOptimizeMessageListener(String listenerId, Object waitObject, ArrayBlockingQueue<List<String>> queue) {
        this.listenerId = listenerId;
        this.waitObject = waitObject;
        this.queue = queue;
    }

    @Override
    public void onMessage(ConsumerRecord<String, String> data, Acknowledgment acknowledgment) {
        addMessage(data, acknowledgment);
    }

    private void addMessage(ConsumerRecord data, Acknowledgment acknowledgment) {
        System.out.println(String.format("%s consumed record: topic: %s, partition: %s, offset: %s, key: %s, value: %s",
                listenerId, data.topic(), data.partition(), data.offset(), data.key(), data.value()));
        try {
            // If true, addMessageToQueue was called by another thread, so commit the offsets now
            if (sendAcknowledgment) {
                sendAcknowledgment();
                sendAcknowledgment = false;
            }
            synchronized (messageList) { // must lock here to stay synchronized with addMessageToQueue
                // Buffer the record first
                messageList.add(data.offset() + "_" + data.value());
                acknowledgmentMap.put(data.partition(), acknowledgment);
            }
            if (isHandleThreadWait > 0) { // a handler thread is waiting: enqueue the batch right away
                addMessageToQueue(false);
                sendAcknowledgment();
                synchronized (object) {
                    if (isHandleThreadWait > 0) { // double-check so concurrent callers don't all notify
                        System.out.println(listenerId + " waiting handler threads: " + isHandleThreadWait);
                        synchronized (waitObject) {
                            waitObject.notify();
                        }
                    }
                }
            } else {
                // No handler thread is waiting: keep buffering until the batch threshold
                // is exceeded, which suits high-throughput topics
                if (messageList.size() > upSize) {
                    addMessageToQueue(false);
                    sendAcknowledgment();
                }
            }
        } catch (Exception e) {
            System.out.println(String.format("%s failed to add record to queue: topic: %s, partition: %s, offset: %s, value: %s",
                    listenerId, data.topic(), data.partition(), data.offset(), data.value()));
        }
    }

    /**
     * Flushes messageList to the queue even before it reaches upSize
     * @param nextSendAcknowledgment whether to set sendAcknowledgment so the offsets
     *                               are committed after the next record is consumed
     */
    public void addMessageToQueue(boolean nextSendAcknowledgment) {
        synchronized (messageList) {
            if (messageList.size() > 0) {
                sendAcknowledgment = nextSendAcknowledgment;
                try {
                    boolean flag = queue.offer(new ArrayList<>(messageList), 5, TimeUnit.SECONDS);
                    // Check whether the batch was accepted
                    if (!flag) {
                        System.out.println("Failed to enqueue batch, queue size: " + queue.size());
                    } else {
                        messageList = new ArrayList<>();
                    }
                } catch (InterruptedException e) {
                    System.out.println(String.format("%s failed to add batch to queue, data: %s", listenerId, messageList));
                }
            }
        }
    }

    public static Integer waitThreadCountIncrease() {
        synchronized (object) {
            isHandleThreadWait++;
        }
        return isHandleThreadWait;
    }

    public static Integer waitThreadCountSubstract() {
        synchronized (object) {
            isHandleThreadWait--;
        }
        return isHandleThreadWait;
    }

    private void sendAcknowledgment() {
        // Offsets may only be committed here: addMessageToQueue can be called from other
        // threads, and those threads must not commit offsets (the KafkaConsumer is not thread-safe)
        acknowledgmentMap.entrySet().forEach(item -> {
            System.out.println(String.format("%s committing offset, partition: %s", listenerId, item.getKey()));
            item.getValue().acknowledge();
        });
        acknowledgmentMap.clear();
    }
}
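Because `acknowledgmentMap` is keyed by partition, each new record overwrites the previous `Acknowledgment` for that partition, so the commit step acknowledges only the latest buffered offset per partition. That bookkeeping can be sketched with plain offsets standing in for `Acknowledgment` objects (a simplified stand-in, not the spring-kafka API):

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionOffsetTracker {
    // partition -> highest offset buffered so far (stands in for acknowledgmentMap)
    private final Map<Integer, Long> latest = new HashMap<>();

    // Called once per consumed record: keeps only the newest offset per partition.
    public void record(int partition, long offset) {
        latest.put(partition, offset);
    }

    // "Commit": return a snapshot of the latest offset per partition, then clear,
    // mirroring how the listener acknowledges and then clears its map.
    public Map<Integer, Long> commit() {
        Map<Integer, Long> snapshot = new HashMap<>(latest);
        latest.clear();
        return snapshot;
    }

    public static void main(String[] args) {
        PartitionOffsetTracker tracker = new PartitionOffsetTracker();
        tracker.record(0, 100L);
        tracker.record(0, 101L);  // overwrites offset 100 for partition 0
        tracker.record(1, 7L);
        Map<Integer, Long> committed = tracker.commit();
        System.out.println(committed);         // {0=101, 1=7}
        System.out.println(tracker.commit());  // {} : already cleared
    }
}
```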
The data-handling thread: processes batches, sleeps when there is no data, and waits to be woken up.
public class OptimizeMessageHandleThread extends Thread {
    private ArrayBlockingQueue<List<String>> queue;
    private String name;
    // Wait object shared by all handler threads
    private Object waitObject;
    private OptimizeConsumerManager optimizeConsumerManager;
    private volatile boolean isRunning = false;

    public OptimizeMessageHandleThread(String threadName, Object waitObject,
            ArrayBlockingQueue<List<String>> queue, OptimizeConsumerManager optimizeConsumerManager) {
        super(threadName);
        this.name = threadName;
        this.waitObject = waitObject;
        this.queue = queue;
        this.optimizeConsumerManager = optimizeConsumerManager;
    }

    @Override
    public void run() {
        System.out.println(name + " message-handling thread started");
        handle();
    }

    private void handle() {
        isRunning = true;
        while (isRunning && !isInterrupted()) {
            List<String> result = takeMessage();
            if (CollectionUtils.isEmpty(result)) {
                try {
                    synchronized (waitObject) {
                        System.out.println(String.format("%s entering wait state, waiting threads: %s", name, MyOptimizeMessageListener.waitThreadCountIncrease()));
                        waitObject.wait(10 * 60 * 1000);
                        System.out.println(String.format("%s leaving wait state, waiting threads: %s", name, MyOptimizeMessageListener.waitThreadCountSubstract()));
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            } else {
                System.out.println(name + " -- handling batch: " + result);
            }
        }
        System.out.println(name + " data-handling thread stopped");
    }

    private List<String> takeMessage() {
        List<String> result = null;
        try {
            result = queue.poll(10, TimeUnit.SECONDS);
            if (CollectionUtils.isEmpty(result)) {
                // Push the listeners' partial batches (below upSize) into the queue
                optimizeConsumerManager.addMessageToQueueImmediately();
                result = queue.poll(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            System.out.println("Failed to take data from queue");
        }
        return result;
    }

    public void stopHandle() {
        isRunning = false;
    }
}
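`takeMessage` polls twice: if the first poll times out, it asks every listener to flush its partial `messageList`, then polls again. The pattern can be sketched with a flush callback standing in for `addMessageToQueueImmediately` (names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class DoublePollSketch {
    // Polls the queue; on a miss, runs the flush hook and polls once more.
    static List<String> takeBatch(ArrayBlockingQueue<List<String>> queue,
                                  Runnable flushPartialBatches) {
        try {
            List<String> batch = queue.poll(50, TimeUnit.MILLISECONDS);
            if (batch == null) {
                flushPartialBatches.run();  // push partial batches into the queue
                batch = queue.poll(50, TimeUnit.MILLISECONDS);
            }
            return batch;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(10);
        // A listener holding 2 records: fewer than upSize, so nothing is queued yet.
        List<String> pending = new ArrayList<>(List.of("offset_0_a", "offset_1_b"));
        Runnable flush = () -> {
            if (!pending.isEmpty()) {
                queue.offer(new ArrayList<>(pending));
                pending.clear();
            }
        };
        List<String> batch = takeBatch(queue, flush);
        System.out.println(batch);  // the partial batch arrives on the second poll
    }
}
```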
This revision enables the handler threads to process records in batches, and adds a path that, when no data is available, fetches the batches that have not yet reached the threshold.
Test code:
@Component
public class KafkaMessageHandleTest implements CommandLineRunner, DisposableBean {
    private Map<String, OptimizeConsumerManager> optimizeConsumerManagerMap = new HashMap<>();
    String topic = "KAFKA_CONSUME_TEST_MESSAGE_TOPIC";
    String groupId = "KAFKA_CONSUME_TEST_GROUP";
    String kafkaAddress = "ip:9092";

    @Override
    public void run(String... args) throws Exception {
        startConsumer();
    }

    public void startConsumer() {
        int size = 3;
        OptimizeConsumerManager consumerManager = new OptimizeConsumerManager(size, size, topic, kafkaAddress, groupId);
        consumerManager.startConsumeAndHandle();
        optimizeConsumerManagerMap.put(groupId, consumerManager);
    }

    public void stopConsumer() {
        optimizeConsumerManagerMap.get(groupId).stopConsumer(groupId);
        optimizeConsumerManagerMap.remove(groupId);
    }

    @Override
    public void destroy() throws Exception {
        stopConsumer();
    }
}