In the previous Kafka consumer implementation, records were processed one at a time as they were consumed. When the processing step is an HTTP call, or anything else that can be executed in batches, handling records one by one is wasteful. This article changes that: the blocking queue's element type becomes a List, so the handler threads can process records in batches.
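The core change, a queue whose elements are whole batches, can be sketched with plain JDK types. This is a minimal illustration (class and variable names here are made up for the sketch, not taken from the code below): a producer buffers records locally and flushes a copy of the buffer into the queue once a threshold is reached, plus a final flush for the incomplete tail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BatchQueueSketch {
    public static void main(String[] args) throws InterruptedException {
        // Each queue element is a whole batch, so a handler dequeues many records at once.
        ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(500);

        int upSize = 5;                         // flush threshold, mirrors the listener's upSize
        List<String> buffer = new ArrayList<>();
        for (int i = 0; i < 12; i++) {          // simulate 12 consumed records
            buffer.add("record-" + i);
            if (buffer.size() >= upSize) {      // a full batch: hand a copy to the queue
                queue.offer(new ArrayList<>(buffer), 5, TimeUnit.SECONDS);
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {                // flush the incomplete tail batch
            queue.offer(new ArrayList<>(buffer), 5, TimeUnit.SECONDS);
        }

        List<String> batch;
        while ((batch = queue.poll()) != null) {
            System.out.println("handled batch of " + batch.size());
        }
    }
}
```

Offering a copy (`new ArrayList<>(buffer)`) matters: the producer keeps reusing its local buffer, so the queue must not share it.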

The result is shown in the figure below:

[figure: console output of the batched consumer run; screenshot omitted]

The consumer manager owns all the consumer threads and message-handling threads, and exposes start/stop for the consume listeners:

public class OptimizeConsumerManager {

    //Blocking queue that stores the batches of consumed records
    private ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(500);
    private Map<String, List<OptimizeMessageHandleThread>> messageHandleThreadListMap = new HashMap<>();
    private Map<String, List<MyOptimizeSpringKafkaContainer>> containerMap = new HashMap<>();
    private List<MyOptimizeMessageListener> optimizeMessageListeners = new ArrayList<>();
    //Lock object; handler threads call wait() on it when there is no data
    private Object waitObject = new Object();
    //Number of consumers
    private int consumerSize = 2;
    //Number of message-handling threads
    private int handleThreadSize = 2;

    private String topic;
    private String groupId;
    private String kafkaAddress;

    public OptimizeConsumerManager(int consumerSize, int handleThreadSize, String topic, String kafkaAddress, String groupId) {
        this.consumerSize = consumerSize > 0 ? consumerSize : 2;
        this.handleThreadSize = handleThreadSize > 0 ? handleThreadSize : 2;
        this.topic = topic;
        this.kafkaAddress = kafkaAddress;
        this.groupId = groupId;
    }

    /**
     * Start the configured number of consumers and handler threads for this group
     */
    public void startConsumeAndHandle() {
        List<MyOptimizeSpringKafkaContainer> containerList = new ArrayList<>();
        //Start the configured number of consumer containers
        for (int i = 0; i < consumerSize; i++) {
            //The kafkaListener uses an AcknowledgingMessageListener (required when AckMode is MANUAL_IMMEDIATE or MANUAL)
            MyOptimizeMessageListener myMessageListener = new MyOptimizeMessageListener(groupId + "_" + i, waitObject, queue);
            optimizeMessageListeners.add(myMessageListener);

            MyOptimizeSpringKafkaContainer mySpringKafkaContainer = new MyOptimizeSpringKafkaContainer(myMessageListener);
            mySpringKafkaContainer.initContainer(kafkaAddress, topic, groupId);
            //Start consuming
            mySpringKafkaContainer.startKafkaListen();
            containerList.add(mySpringKafkaContainer);
        }
        //Start the message-handling threads
        List<OptimizeMessageHandleThread> list = new ArrayList<>();
        for (int i = 0; i < handleThreadSize; i++) {
            OptimizeMessageHandleThread messageHandleThread = new OptimizeMessageHandleThread(groupId + "_" + i, waitObject, queue, this);
            list.add(messageHandleThread);
            messageHandleThread.start();
        }
        messageHandleThreadListMap.put(groupId, list);
        containerMap.put(groupId, containerList);
    }

    public void stopConsumer(String groupId) {
        if (CollectionUtils.isNotEmpty(messageHandleThreadListMap.get(groupId))) {
            messageHandleThreadListMap.get(groupId).forEach(item -> {
                System.out.println("Stopping handler thread: " + item.getName());
                item.stopHandle();
                if(!item.isInterrupted()){
                    item.interrupt();
                }
            });
            messageHandleThreadListMap.remove(groupId);
            containerMap.get(groupId).forEach(MyOptimizeSpringKafkaContainer::stopKafkaListen);
            containerMap.remove(groupId);
        }
        queue.clear();
        waitObject = null;
    }

    /**
     * Called by a handler thread when its first poll returns no data:
     * flushes every messageList that has not yet reached the batch threshold (upSize) into the queue.
     */
    public void addMessageToQueueImmediately(){
        optimizeMessageListeners.forEach(item ->item.addMessageToQueue(true));
    }
}
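stopConsumer above relies on a cooperative shutdown pattern: flip a running flag, then interrupt the thread so it breaks out of a blocking poll or wait. A minimal JDK-only sketch of that pattern, with illustrative names (StoppableWorker is not a class from this article):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class StoppableWorker extends Thread {
    private final BlockingQueue<String> queue;
    private volatile boolean running = true;   // volatile so the stop request is visible across threads

    public StoppableWorker(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        while (running && !isInterrupted()) {
            try {
                String item = queue.poll(1, TimeUnit.SECONDS); // blocking call
                if (item != null) {
                    System.out.println("handled " + item);
                }
            } catch (InterruptedException e) {
                interrupt(); // restore the interrupt flag so the loop condition sees it
            }
        }
        System.out.println("worker stopped");
    }

    public void stopWorker() {
        running = false;
        interrupt(); // wake the thread if it is blocked in poll()
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        StoppableWorker worker = new StoppableWorker(queue);
        worker.start();
        queue.put("a");
        Thread.sleep(200);   // give the worker time to drain the queue
        worker.stopWorker();
        worker.join();
    }
}
```

The interrupt alone is not enough (a freshly woken thread re-enters the loop) and the flag alone is not enough (the thread may be parked in a blocking call for a long time); the combination gives a prompt, clean exit.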

The container: initializes the consumer and the message listener, and starts/stops consumption.

public class MyOptimizeSpringKafkaContainer {
    /**
     * The Kafka message listener container
     */
    private KafkaMessageListenerContainer<Integer, String> container;

    AcknowledgingMessageListener messageListener;

    public MyOptimizeSpringKafkaContainer(AcknowledgingMessageListener messageListener) {
        this.messageListener = messageListener;
    }

    /**
     * Parameter initialization
     */
    public void initContainer(String kafkaAddress, String topic, String groupId) {
        //Set the Kafka consumer properties
        Map<String, Object> properties = new HashMap<>(10);
        //Kafka address, e.g. ip:9092,ip:9092
        properties.put("bootstrap.servers", kafkaAddress);
        //Consumer group id this consumer belongs to
        properties.put("group.id", groupId);
        //Commit offsets manually
        properties.put("enable.auto.commit", "false");
        //Consume only the latest data
        properties.put("auto.offset.reset", "latest");
        //10 * 1024 * 1024: raise the default 1M per-partition fetch limit to 10M so oversized messages can still be consumed
        //(note: "fetch.message.max.bytes" belongs to the legacy consumer; the new consumer client uses "max.partition.fetch.bytes")
        properties.put("max.partition.fetch.bytes", "10485760");
        //Key deserializer
        properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        //Value deserializer
        properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put("max.poll.records", "10");
        properties.put("session.timeout.ms", "30000");
        properties.put("request.timeout.ms", "31000");
        properties.put("fetch.max.wait.ms", "1000");

        //Create the ContainerProperties
        ContainerProperties containerProperties = new ContainerProperties(topic);
        //Manual immediate acknowledgment; a different AckMode requires a different GenericMessageListener
        containerProperties.setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
        //The kafkaListener uses an AcknowledgingMessageListener (required when AckMode is MANUAL_IMMEDIATE or MANUAL)
        containerProperties.setMessageListener(messageListener);

        //Create the ConsumerFactory
        ConsumerFactory<Integer, String> consumerFactory = new DefaultKafkaConsumerFactory<>(properties);
        //Create the container
        container = new KafkaMessageListenerContainer<>(consumerFactory, containerProperties);
        container.setAutoStartup(false);
    }

    /**
     * Start listening to Kafka
     */
    public void startKafkaListen() {
        if (container != null) {
            container.start();
        }
    }

    /**
     * Stop listening to Kafka
     */
    public void stopKafkaListen() {
        if (container != null) {
            container.stop();
        }
    }
}
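As a side note, the string keys in initContainer can also be written with the `ConsumerConfig` constants from kafka-clients, which catches typos at compile time. A sketch of the equivalent configuration map (a config fragment, not runnable on its own: `kafkaAddress` and `groupId` are the method parameters from the container above, and the kafka-clients dependency is assumed):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

Map<String, Object> properties = new HashMap<>(10);
properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaAddress);
properties.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
// 10 MB per-partition fetch limit (new-consumer equivalent of the legacy fetch.message.max.bytes)
properties.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, "10485760");
properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "10");
properties.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
properties.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, "31000");
properties.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "1000");
```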

The consume listener: receives records and puts them into the queue:

/**
 * Optimization: uses an ArrayBlockingQueue&lt;List&lt;String&gt;&gt; so a batch of records is consumed before being processed
 * @author zhongshucneng
 * @since 1.3.0
 */
public class MyOptimizeMessageListener implements AcknowledgingMessageListener<String, String> {

    private String listenerId;
    //Monitor the handler threads wait on
    private Object waitObject;

    //How many handler threads are currently in wait() on waitObject
    //(volatile so the unsynchronized read in addMessage sees the latest value)
    private static volatile int isHandleThreadWait = 0;
    private static Object object = new Object();
    //Blocking queue (thread safe); all listeners in the same group share the same queue
    private ArrayBlockingQueue<List<String>> queue;

    private List<String> messageList = new ArrayList<>();

    private Map<Integer, Acknowledgment> acknowledgmentMap = new HashMap<>();

    private int upSize = 5;
    //Whether the next consumed record should trigger an offset commit
    //(volatile: written by handler threads via addMessageToQueue, read by the consumer thread)
    private volatile boolean sendAcknowledgmen = false;

    public MyOptimizeMessageListener(String listenerId, Object waitObject, ArrayBlockingQueue<List<String>> queue){
        this.listenerId = listenerId;
        this.waitObject = waitObject;
        this.queue = queue;
    }

    @Override
    public void onMessage(ConsumerRecord<String, String> data, Acknowledgment acknowledgment) {
        addMessage(data,acknowledgment);
    }

    private void addMessage(ConsumerRecord data, Acknowledgment acknowledgment){
        System.out.println(String.format("%s consumed record: topic: %s, partition: %s, offset: %s, key: %s, value: %s",
                listenerId, data.topic(), data.partition(), data.offset(), data.key(), data.value()));
        try {
            //If true, addMessageToQueue was invoked by another thread, so commit the pending offsets first
            if(sendAcknowledgmen){
                sendAcknowledgmen();
                sendAcknowledgmen = false;
            }

            synchronized (messageList) { //must lock here to stay synchronized with addMessageToQueue
                //先放到list中
                messageList.add(data.offset() + "_" + data.value());
                acknowledgmentMap.put(data.partition(), acknowledgment);
            }
            if (isHandleThreadWait > 0) {  //if any handler thread is waiting, push the batch to the queue right away
                addMessageToQueue(false);
                sendAcknowledgmen();
                synchronized (object) {
                    if (isHandleThreadWait > 0) { //double-check so a thread that entered first has not already notified
                        System.out.println(listenerId + " waiting handler threads: " + isHandleThreadWait);
                        synchronized (waitObject) {
                            waitObject.notify();
                        }
                    }
                }
            } else {
                //No handler is waiting: wait until the batch threshold is reached, which suits high-volume topics
                if (messageList.size() > upSize) {
                    addMessageToQueue(false);
                    sendAcknowledgmen();
                }
            }

        } catch (Exception e) {
            System.out.println(String.format("%s failed to enqueue consumed record: topic: %s, partition: %s, offset: %s, value: %s",
                    listenerId, data.topic(), data.partition(), data.offset(), data.value()));
        }
    }

    /**
     * Flushes messageList to the queue immediately, even when it has not reached upSize.
     * @param nextSendAcknowledgment whether to set sendAcknowledgmen, so that the pending offsets
     *                               are committed on the consumer thread after its next record arrives
     */
    public void addMessageToQueue(boolean nextSendAcknowledgment){
        synchronized (messageList) {
            if (messageList.size() > 0) {
                sendAcknowledgmen = nextSendAcknowledgment;
                try {
                    boolean flag = queue.offer(new ArrayList<>(messageList), 5, TimeUnit.SECONDS);
                    //Check whether the batch made it into the queue
                    if (!flag) {
                        System.out.println("Failed to enqueue batch, queue size: " + queue.size());
                    } else {
                        messageList = new ArrayList<>();
                    }
                } catch (InterruptedException e) {
                    System.out.println(String.format("%s failed to enqueue batch, data: %s", listenerId, messageList));
                }
            }
        }
    }

    public static Integer waitThreadCountIncrease(){
        synchronized (object) {
            isHandleThreadWait++;
            //return inside the lock, so the caller sees the value its own increment produced
            return isHandleThreadWait;
        }
    }

    public static Integer waitThreadCountSubstract(){
        synchronized (object) {
            isHandleThreadWait--;
            return isHandleThreadWait;
        }
    }

    private void sendAcknowledgmen(){
        //Offsets can only be committed here: addMessageToQueue may run on a handler thread, which must not commit offsets
        acknowledgmentMap.entrySet().forEach(item -> {
            System.out.println(String.format("%s committing offset, partition: %s", listenerId, item.getKey()));
            item.getValue().acknowledge();
        });
        acknowledgmentMap.clear();
    }
}
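The notify/wait handshake between listener and handler can be reduced to this JDK-only sketch (the class name and timings are illustrative): the handler parks on a shared monitor when the queue yields nothing, and the producer notifies after enqueueing a batch. Note the producer enqueues before notifying, so even a missed notification only costs a bounded wait, not lost data.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class WaitNotifySketch {
    public static void main(String[] args) throws InterruptedException {
        Object waitObject = new Object();
        ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(10);

        Thread handler = new Thread(() -> {
            try {
                List<String> batch = queue.poll(100, TimeUnit.MILLISECONDS);
                if (batch == null) {                 // nothing yet: park on the monitor
                    synchronized (waitObject) {
                        System.out.println("handler waiting");
                        waitObject.wait(5000);       // bounded wait, like the 10-minute cap in the article
                    }
                    batch = queue.poll(1, TimeUnit.SECONDS);
                }
                System.out.println("handler got " + batch.size() + " records");
            } catch (InterruptedException ignored) {
            }
        });
        handler.start();

        Thread.sleep(300);                           // let the handler reach wait()
        queue.offer(List.of("a", "b", "c"));         // enqueue a batch first ...
        synchronized (waitObject) {
            waitObject.notify();                     // ... then wake the handler
        }
        handler.join();
    }
}
```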

The handler thread: processes batches of data; when there is no data it sleeps, waiting to be woken:

public class OptimizeMessageHandleThread extends Thread {
    private ArrayBlockingQueue<List<String>> queue;

    private String name;

    //Monitor all handler threads wait on
    private Object waitObject;

    private OptimizeConsumerManager optimizeConsumerManager;

    //volatile so a stopHandle() call from another thread is seen by the run loop
    private volatile boolean isRunning = false;

    public OptimizeMessageHandleThread(String threadName, Object waitObject,
                                       ArrayBlockingQueue<List<String>> queue, OptimizeConsumerManager optimizeConsumerManager){
        super(threadName);
        this.name = threadName;
        this.waitObject = waitObject;
        this.queue = queue;
        this.optimizeConsumerManager = optimizeConsumerManager;
    }

    @Override
    public void run() {
        System.out.println(name + " message handler thread started");
        handle();
    }

    private void handle(){
        isRunning = true;
        while(isRunning && !isInterrupted()) {
            List<String> result = takeMessage();
            if (CollectionUtils.isEmpty(result)) {
                try {
                    synchronized (waitObject) {
                        System.out.println(String.format("%s entering wait, waiting threads now: %s", name, MyOptimizeMessageListener.waitThreadCountIncrease()));
                        waitObject.wait(10 * 60 * 1000);
                        System.out.println(String.format("%s leaving wait, waiting threads now: %s", name, MyOptimizeMessageListener.waitThreadCountSubstract()));
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            } else {
                System.out.println(name + " -- processing batch: " + result);
            }
        }
        System.out.println(name + " message handler thread stopped");
    }

    private List<String> takeMessage(){
        List<String> result = null;
        try {
            result = queue.poll(10, TimeUnit.SECONDS);
            if (CollectionUtils.isEmpty(result)) {
                //Flush the messageLists that never reached the threshold (upSize) into the queue
                optimizeConsumerManager.addMessageToQueueImmediately();
                result = queue.poll(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            System.out.println("Failed to take data from queue");
        }
        return result;
    }

    public void stopHandle(){
        isRunning = false;
    }
}
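The idle-flush flow inside takeMessage (poll; if nothing arrives, ask the listeners to flush their partial buffers; poll once more) can be condensed into a JDK-only sketch, with the listener side simulated by a local list (names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class IdleFlushSketch {
    public static void main(String[] args) throws InterruptedException {
        ArrayBlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(10);
        // A partial buffer that never reached the upSize threshold on the "listener" side.
        List<String> partialBuffer = new ArrayList<>(List.of("x", "y"));

        // First poll: nothing queued yet, so it times out and returns null.
        List<String> batch = queue.poll(100, TimeUnit.MILLISECONDS);
        if (batch == null) {
            // Ask the "listener" to flush its partial buffer immediately
            // (the role addMessageToQueueImmediately plays in the article).
            queue.offer(new ArrayList<>(partialBuffer));
            partialBuffer.clear();
            // Second poll now sees the flushed partial batch.
            batch = queue.poll(100, TimeUnit.MILLISECONDS);
        }
        System.out.println("processed " + batch.size() + " records");
    }
}
```

This keeps latency bounded for low-volume topics: records stop waiting for the batch to fill once a handler sits idle.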

This round of optimization gives the handler threads true batch processing, and adds an idle-time flush flow that pulls in the batches which never reached the size threshold.

The test class:

@Component
public class KafkaMessageHandleTest implements CommandLineRunner, DisposableBean {

    private Map<String, OptimizeConsumerManager> optimizeConsumerManagerMap = new HashMap<>();
    String topic = "KAFKA_CONSUME_TEST_MESSAGE_TOPIC";
    String groupId = "KAFKA_CONSUME_TEST_GROUP";
    String kafkaAddress = "ip:9092";

    @Override
    public void run(String... args) throws Exception {
        startConsumer();
    }

    public void startConsumer(){
        int size = 3;

        OptimizeConsumerManager consumerManager = new OptimizeConsumerManager(size, size, topic, kafkaAddress, groupId);
        consumerManager.startConsumeAndHandle();
        optimizeConsumerManagerMap.put(groupId, consumerManager);
    }

    public void stopConsumer(){
        optimizeConsumerManagerMap.get(groupId).stopConsumer(groupId);
        optimizeConsumerManagerMap.remove(groupId);
    }

    @Override
    public void destroy() throws Exception {
        stopConsumer();
    }
}