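- enable.auto.commit: defaults to true, meaning the consumer periodically commits its consumed offsets automatically.
- auto.commit.interval.ms: the interval between automatic commits when enable.auto.commit is true; defaults to 5000 ms.
- max.poll.records: the maximum number of records a consumer fetches in a single poll; defaults to 500.
- max.poll.interval.ms: defaults to 5 minutes. If the consumer has not finished processing the records from the previous poll within 5 minutes, the consumer proactively sends a request to leave the group.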
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "30000");
props.put("max.poll.records", 20);
@KafkaListener(topics = "test1", groupId = "group1")
public void fromKafka(ConsumerRecord record) throws InterruptedException {
    // Print each record, then sleep 1 s to simulate processing time
    System.out.println(new Date().toString()+"group111 "+record.toString());
    Thread.sleep(1000);
}
@Scheduled(fixedRate = 5000)
public void schedule() throws TimeoutException {
    // Every 5 s, print the committed offset of each partition for group1
    Map<TopicPartition, OffsetAndMetadata> offset1 = lagOf("group1","localhost:9092");
    for (Map.Entry<TopicPartition, OffsetAndMetadata> entry : offset1.entrySet()) {
        System.out.println(new Date().toString() +"consumer group1:topic-"+entry.getKey().topic()+"partition-"+entry.getKey().partition()+" offset"+entry.getValue().offset());
    }
}
public static Map<TopicPartition, OffsetAndMetadata> lagOf(String groupID, String bootstrapServers) throws TimeoutException {
    // Despite its name, this returns the group's committed offsets, not the lag
    Properties props = new Properties();
    props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    try (AdminClient client = AdminClient.create(props)) {
        ListConsumerGroupOffsetsResult result = client.listConsumerGroupOffsets(groupID);
        try {
            // Wait up to 10 s for the broker to return the committed offsets
            Map<TopicPartition, OffsetAndMetadata> consumedOffsets = result.partitionsToOffsetAndMetadata().get(10, TimeUnit.SECONDS);
            return consumedOffsets;
        } catch (Exception e) {
            return Collections.emptyMap();
        }
    }
}
Consumer group group1 is configured on topic test1 with 20 records fetched per poll, and the listener spends 1 s on each record. The log output is as follows:
Mon Feb 10 11:54:39 CST 2020consumer group1:topic-test2partition-0 offset120
Mon Feb 10 11:54:39 CST 2020group111 ConsumerRecord(topic = test1, partition = 0, offset = 125
Mon Feb 10 11:55:19 CST 2020consumer group1:topic-test2partition-0 offset160
Mon Feb 10 11:55:19 CST 2020group111 ConsumerRecord(topic = test1, partition = 0, offset = 165 ...
Mon Feb 10 11:55:59 CST 2020consumer group1:topic-test2partition-0 offset200
Mon Feb 10 11:55:59 CST 2020group111 ConsumerRecord(topic = test1, partition = 0, offset = 205
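In the log above, the committed offset moves forward by 40 every 40 seconds (120, 160, 200), while the listener is always a few records ahead of it. A simplified model of why this might happen is sketched below. It assumes that the Java client performs the automatic commit only inside poll(), once auto.commit.interval.ms has elapsed, and that the listener container calls poll() again only after the whole batch has been processed. The class and method names are made up for illustration and are not part of Kafka.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoCommitTimeline {

    /**
     * Simulates a consumer that commits its position inside poll() whenever the
     * auto-commit interval has elapsed. Returns {timeMs, committedOffset} pairs.
     */
    static List<long[]> simulate(int maxPollRecords, long perRecordMs,
                                 long autoCommitIntervalMs, int polls) {
        List<long[]> commits = new ArrayList<>();
        long now = 0;                                   // simulated clock in ms
        long position = 0;                              // next offset to fetch
        long nextCommitDeadline = autoCommitIntervalMs; // first commit is due after one interval
        for (int i = 0; i < polls; i++) {
            // poll(): commit the current position only if the interval has elapsed
            if (now >= nextCommitDeadline) {
                commits.add(new long[] {now, position});
                nextCommitDeadline = now + autoCommitIntervalMs;
            }
            // the batch is handed to the listener, which processes it record by record
            position += maxPollRecords;
            now += maxPollRecords * perRecordMs;
        }
        return commits;
    }

    public static void main(String[] args) {
        // Values from the test above: 20 records per poll, 1 s per record, 30 s commit interval
        for (long[] c : simulate(20, 1000L, 30000L, 7)) {
            System.out.println("t=" + c[0] / 1000 + "s committed offset " + c[1]);
        }
    }
}
```

Under these assumptions a poll happens every 20 s, but only every second poll finds the 30 s interval elapsed, so the model commits at t=40s, t=80s and t=120s, each time 40 offsets further. Any record processed after the most recent commit is not yet covered by a committed offset.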
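With enable.auto.commit left at its default of true, duplicate consumption can occur in the following scenarios:
3.1 While the consumer is processing records, the application process is forcibly killed or exits abnormally. Records processed since the last automatic commit have no committed offset, so they are delivered again after a restart.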
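As a rough illustration of scenario 3.1, take the values from the earlier log: the committed offset is 120 while the listener is already printing record 125. The sketch below counts how many records would be delivered a second time if the process died at that moment. It relies on the fact that a committed offset is the next offset the group reads after a restart; the class and method names are hypothetical.

```java
final class CrashReplay {

    /**
     * Number of records delivered again after a restart: every offset from
     * committedOffset up to and including lastProcessedOffset.
     */
    static long replayedCount(long committedOffset, long lastProcessedOffset) {
        return Math.max(0, lastProcessedOffset - committedOffset + 1);
    }

    public static void main(String[] args) {
        long committed = 120;     // committed offset printed by the scheduled task
        long lastProcessed = 125; // last record printed by the listener
        System.out.println(replayedCount(committed, lastProcessed)
                + " records replayed, offsets " + committed + ".." + lastProcessed);
    }
}
```

With these numbers the consumer would resume at offset 120 and see offsets 120 through 125 again, six records in total.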
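3.2 The consumer takes too long to process records
The max.poll.interval.ms parameter defines the maximum interval between two polls. Its default is 5 minutes: if your consumer cannot finish processing the records returned by poll within 5 minutes, the consumer proactively sends a "leave group" request, and the Coordinator starts a new round of rebalance.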
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "5000");
// Fetch 11 records per poll
props.put("max.poll.records", 11);
@KafkaListener(topics = "test2", groupId = "group22")
public void fromKafka1(ConsumerRecord record) {
    System.out.println(new Date().toString() +": group222 "+record.toString());
    try {
        // Each record takes 30 s to process, as described in the test setup
        Thread.sleep(30000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
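Each poll fetches 11 records and each record takes 30 s, so the 11 records take 5 minutes 30 seconds. Because max.poll.interval.ms defaults to 5 minutes, the consumer cannot finish within the limit; it leaves the group, which triggers a rebalance. After finishing the 11 records, the consumer reconnects to the broker and the group rebalances again. Because the offsets of the previous batch were never committed, the next poll returns records that have already been consumed, which causes duplicate consumption.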
Tue Feb 11 17:29:33 CST 2020: group222 ConsumerRecord(topic = test2, partition = 0, offset = 100, CreateTime = 1581306569687, serialized key size = 3, serialized value size = 4, headers = RecordHeaders(headers = [], isReadOnly = false), key = 100, value = abcd)
Tue Feb 11 17:30:03 CST 2020: group222 ConsumerRecord(topic = test2, partition = 0, offset = 101, CreateTime = 1581306569687, serialized key size = 3, serialized value size = 4, headers = RecordHeaders(headers = [], isReadOnly = false), key = 101, value = abcd)
Tue Feb 11 17:34:33 CST 2020: group222 ConsumerRecord(topic = test2, partition = 0, offset = 110, CreateTime = 1581306569688, serialized key size = 3, serialized value size = 4, headers = RecordHeaders(headers = [], isReadOnly = false), key = 110, value = abcd)
2020-02-11 17:35:03.513 WARN 53544 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-2, groupId=group22] Synchronous auto-commit of offsets {test2-0=OffsetAndMetadata{offset=111, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
2020-02-11 17:35:03.513 INFO 53544 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-2, groupId=group22] Revoking previously assigned partitions [test2-0]
2020-02-11 17:35:03.513 INFO 53544 --- [ntainer#0-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions revoked: [test2-0]
2020-02-11 17:35:03.513 INFO 53544 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-2, groupId=group22] (Re-)joining group
2020-02-11 17:35:03.521 INFO 53544 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-2, groupId=group22] Successfully joined group with generation 57
2020-02-11 17:35:03.522 INFO 53544 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-2, groupId=group22] Setting newly assigned partitions [test2-0]
2020-02-11 17:35:03.627 INFO 53544 --- [ntainer#0-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions assigned: [test2-0]
Tue Feb 11 17:35:03 CST 2020: group222 ConsumerRecord(topic = test2, partition = 0, offset = 100, CreateTime = 1581306569687, serialized key size = 3, serialized value size = 4, headers = RecordHeaders(headers = [], isReadOnly = false), key = 100, value = abcd)
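The arithmetic behind this test (11 records × 30 s = 330 s, against a 300 s limit) can be checked before deployment. The following is a minimal sketch, assuming processing time per record is roughly constant and ignoring fetch and commit overhead; PollBudget and its methods are hypothetical helpers, not part of the Kafka client.

```java
final class PollBudget {

    /** Time needed to process one full batch, in milliseconds. */
    static long batchMillis(int maxPollRecords, long perRecordMs) {
        return (long) maxPollRecords * perRecordMs;
    }

    /** True if processing a full batch takes longer than max.poll.interval.ms. */
    static boolean exceedsPollInterval(int maxPollRecords, long perRecordMs, long maxPollIntervalMs) {
        return batchMillis(maxPollRecords, perRecordMs) > maxPollIntervalMs;
    }

    /** Largest max.poll.records whose batch time stays strictly below the interval. */
    static int largestSafeBatch(long perRecordMs, long maxPollIntervalMs) {
        return (int) ((maxPollIntervalMs - 1) / perRecordMs);
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 5 * 60 * 1000L; // default max.poll.interval.ms
        long perRecordMs = 30_000L;              // 30 s per record, as in the test

        System.out.println("batch of 11 takes " + batchMillis(11, perRecordMs) + " ms");
        System.out.println("exceeds interval: " + exceedsPollInterval(11, perRecordMs, maxPollIntervalMs));
        System.out.println("largest safe max.poll.records: " + largestSafeBatch(perRecordMs, maxPollIntervalMs));
    }
}
```

For the values in this test the batch takes 330000 ms, which exceeds the 300000 ms limit, and the largest batch that stays below the limit is 9 records. This matches the remedy named in the warning above: reduce max.poll.records, or alternatively raise max.poll.interval.ms.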
3. "Kafka中位移提交那些事儿" (on offset commits in Kafka)