Spring Cloud Stream + Kafka Message-Driven Applications (Custom Channels for Multiple Topics)

There are many message brokers: RabbitMQ, RocketMQ, ActiveMQ, Kafka, and so on, and each differs in its implementation details. Is there a technology that lets us stop worrying about broker-specific details, so that through a single adapter-and-binding abstraction we can switch between brokers automatically?

Spring Cloud Stream was born to do exactly that: it hides the differences between underlying message brokers, lowers the cost of switching between them, and unifies the messaging programming model. Applications interact with the Spring Cloud Stream binder through inputs and outputs.

Spring Cloud Stream provides opinionated, vendor-specific auto-configuration for message brokers and introduces three core concepts: publish-subscribe, consumer groups, and partitions. At the time of writing, only RabbitMQ and Kafka are supported.


The standard Stream workflow:

[Figure: the standard Stream workflow]

Binder: connects to the message broker conveniently while shielding the application from broker differences.

Channel: an abstraction over a queue. In a messaging system, a channel is the medium that stores and forwards messages; the queue is configured through the channel.

Source and Sink: simply put, the message producer and the message consumer.

Programming model

Spring Cloud Stream provides a set of predefined annotations for binding input and output channels and for listening on them.

@EnableBinding triggers the binding

The @EnableBinding annotation turns a class into a Spring Cloud Stream application. @EnableBinding is itself meta-annotated with @Configuration, so it triggers Spring Cloud Stream's base configuration.

@Target({ ElementType.TYPE, ElementType.ANNOTATION_TYPE })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@Configuration
@Import({ BindingBeansRegistrar.class, BinderFactoryAutoConfiguration.class })
@EnableIntegration
public @interface EnableBinding {

   /**
    * A list of interfaces having methods annotated with {@link Input} and/or
    * {@link Output} to indicate binding targets.
    * @return list of interfaces
    */
   Class<?>[] value() default {};

}
@Input and @Output

A Spring Cloud Stream application can declare any number of input and output channels, defined on an interface with the @Input and @Output annotations.

@StreamListener

Registers the annotated method as an event listener for the data stream coming from the message broker; the annotation's attribute value names the message channel to listen on.
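A minimal sketch of a listener, using the predefined Sink interface described in the next subsection:

import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;

@EnableBinding(Sink.class)
public class LogListener {

    // The annotation's value ("input") is the channel name to listen on
    @StreamListener(Sink.INPUT)
    public void handle(String payload) {
        System.out.println("received: " + payload);
    }
}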

Source, Sink, and Processor

Spring Cloud Stream ships three predefined, out-of-the-box interfaces: Source for applications with a single outbound channel, Sink for applications with a single inbound channel, and Processor for applications that have both an input and an output channel.
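Their definitions (in org.springframework.cloud.stream.messaging) are essentially:

public interface Source {
    String OUTPUT = "output";

    @Output(Source.OUTPUT)
    MessageChannel output();
}

public interface Sink {
    String INPUT = "input";

    @Input(Sink.INPUT)
    SubscribableChannel input();
}

public interface Processor extends Source, Sink {
}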

Testing (multiple topics)

  1. Add the Maven dependencies

My Spring Cloud release train is Hoxton, so the matching Stream version here is 2.2.x. spring-kafka is included so that additional Kafka settings can be configured.

<!-- fastjson, used to serialize message payloads to JSON -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.61</version>
</dependency>
<!-- Spring Cloud Stream Kafka binder (Kafka client 2.3) -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-stream-binder-kafka</artifactId>
    <version>2.2.0.RELEASE</version>
</dependency>
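The spring-kafka dependency mentioned above is missing from this snippet; it would sit alongside the binder, roughly like this (version omitted on the assumption that the Spring Boot BOM manages it):

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <!-- version managed by the Spring Boot dependency BOM -->
</dependency>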
  2. Configuration file

application.yml:

spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: local.kafka.com:9092
          auto-create-topics: true
          auto-add-partitions: true
          min-partition-count: 1
          producer-properties:
            acks: -1
            retries: 1
            batch.size: 16384 # bytes (16 KB)
            linger.ms: 10 # 10 ms linger delay
            buffer.memory: 33554432
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
            value.serializer: org.apache.kafka.common.serialization.ByteArraySerializer
          consumer-properties:
            allow.auto.create.topics: true
            auto.commit.interval.ms: 1000 # ms
            key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
            value.deserializer: org.apache.kafka.common.serialization.ByteArrayDeserializer

      # Bind the channels. The binding names must match the channel names
      # declared in the Channel interfaces below.
      bindings:
        myOutput: # MyChannel's output channel name
          destination: mytest    # topic the messages are sent to
          content-type: application/json    # producer-side content type; the consumer does not need to set it
        myInput: # MyChannel's input channel name
          destination: mytest    # topic the messages are received from
          group: logChannelGroup
        yourOutput: # YourChannel's output channel name
          destination: yourtest  # second topic (name assumed for this example)
          content-type: application/json
        yourInput: # YourChannel's input channel name
          destination: yourtest  # name assumed for this example
          group: logChannelGroup
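The binder settings above (auto-create-topics, auto-add-partitions, min-partition-count) control topic provisioning globally. The partition count can also be requested per output binding. A minimal sketch, using the standard producer partition-count property:

spring:
  cloud:
    stream:
      bindings:
        myOutput:
          destination: mytest
          producer:
            partition-count: 3 # ask the binder to provision the topic with 3 partitions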
  3. The Message object
package com.cloud.test.entity;

import lombok.Data;

import java.io.Serializable;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
@Data
public class Message implements Serializable {
    private static final long serialVersionUID = -2690543989731802965L;
    private String message;
}
  4. Define two custom channels, one per topic

MyChannel

package com.cloud.test.stream;

import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
public interface MyChannel {
    String INPUT = "myInput";
    String OUTPUT = "myOutput";

    @Input(INPUT)
    SubscribableChannel input();

    @Output(OUTPUT)
    MessageChannel output();
}

YourChannel

package com.cloud.test.stream;

import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
public interface YourChannel {
    String INPUT = "yourInput";
    String OUTPUT = "yourOutput";

    @Input(INPUT)
    SubscribableChannel input();

    @Output(OUTPUT)
    MessageChannel output();
}
  5. Write the message producers

MyChannelSend

package com.cloud.test.stream;

import com.alibaba.fastjson.JSON;
import com.cloud.test.entity.Message;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.messaging.support.MessageBuilder;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
@EnableBinding(MyChannel.class)
public class MyChannelSend {
    @Autowired
    private MyChannel myChannel;

    public void sendMsg(String msg){
        Message message = new Message();
        message.setMessage(msg);

        myChannel.output().send(MessageBuilder.withPayload(JSON.toJSONString(message)).build());
    }
}

YourChannelSend

package com.cloud.test.stream;

import com.alibaba.fastjson.JSON;
import com.cloud.test.entity.Message;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.messaging.support.MessageBuilder;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
@EnableBinding(YourChannel.class)
public class YourChannelSend {
    @Autowired
    private YourChannel yourChannel;

    public void sendMsg(String msg){
        Message message = new Message();
        message.setMessage(msg);

        yourChannel.output().send(MessageBuilder.withPayload(JSON.toJSONString(message)).build());
    }
}
  6. Write the message consumers

MyChannelReceive

package com.cloud.test.stream;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.PartitionInfo;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.handler.annotation.Header;

import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
@EnableBinding(MyChannel.class)
public class MyChannelReceive {

    @StreamListener(MyChannel.INPUT)
    public void in(String in, @Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer) {
        System.out.println("====================MyChannelReceive start====================");
        System.out.println(in);
        System.out.println(consumer.toString());

        Map<String, List<PartitionInfo>> topics = consumer.listTopics();
        Set<String> keySet = topics.keySet();
        for (String s : keySet) {
            System.out.println("topic = " + s);
        }
        System.out.println("====================MyChannelReceive end====================");
    }
}

YourChannelReceive

package com.cloud.test.stream;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.PartitionInfo;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.handler.annotation.Header;

import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * @author zhe.xiao
 * @date 2020-11-30
 * @description
 */
@EnableBinding(YourChannel.class)
public class YourChannelReceive {

    @StreamListener(YourChannel.INPUT)
    public void in(String in, @Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer) {
        System.out.println("====================YourChannelReceive start====================");
        System.out.println(in);
        System.out.println(consumer.toString());

        Map<String, List<PartitionInfo>> topics = consumer.listTopics();
        Set<String> keySet = topics.keySet();
        for (String s : keySet) {
            System.out.println("topic = " + s);
        }
        System.out.println("====================YourChannelReceive end====================");
    }
}
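Besides injecting the Consumer handle, spring-kafka's header constants can expose each record's origin directly. A sketch of an alternative handler (not part of the original code), using KafkaHeaders.RECEIVED_TOPIC, KafkaHeaders.RECEIVED_PARTITION_ID, and KafkaHeaders.OFFSET:

@StreamListener(MyChannel.INPUT)
public void inWithMetadata(String in,
        @Header(KafkaHeaders.RECEIVED_TOPIC) String topic,
        @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int partition,
        @Header(KafkaHeaders.OFFSET) long offset) {
    // Print the record's origin alongside its payload
    System.out.printf("topic=%s partition=%d offset=%d payload=%s%n", topic, partition, offset, in);
}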
  7. Run the test
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.time.LocalDateTime;

@SpringBootTest
@RunWith(SpringRunner.class)
public class StreamTest {
    @Autowired
    MyChannelSend myChannelSend;

    @Autowired
    YourChannelSend yourChannelSend;

    @Test
    public void t1b() {
        myChannelSend.sendMsg("myChannelSend xiao :" + LocalDateTime.now());
    }

    @Test
    public void t1c() {
        yourChannelSend.sendMsg("yourChannelSend zhe :" + LocalDateTime.now());
    }
}

Run t1b and t1c in turn; the consumers print the received data:

[Screenshot: console output from both consumers]

The duplicate-consumption problem

How duplicate consumption arises

Simulate two service instances with two configuration files, identical except for the port:

server:
  port: 9900 # the second instance uses 9901

spring:
  kafka:
    bootstrap-servers: 192.168.33.10:9092
    producer:
      acks: -1
      retries: 1
      batch-size: 16384
      buffer-memory: 33554432
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.ByteArraySerializer
    consumer:
      enable-auto-commit: true
      auto-commit-interval: 1000

  cloud:
    stream:
      binders:
        myKafka: # custom binder name
          type: kafka # binder type

      # Bind the channels
      bindings:
        myOutput: # MyChannel's output channel name
          destination: mytest    # topic the messages are sent to
          content-type: application/json    # producer-side content type; the consumer does not need to set it
          binder: myKafka # refers to the binder defined under binders
        myInput: # MyChannel's input channel name
          destination: mytest    # topic the messages are received from

Start both instances:

[Screenshot: the two instances running on ports 9900 and 9901]

  1. Send a test message: myChannelSend.sendMsg("myChannelSend xiao :" + LocalDateTime.now());
  2. The printed output shows the same message consumed twice.

Console output:

Instance 9901 consumed the message:

[Screenshot: instance 9901's console output]

Instance 9900 consumed it as well:

[Screenshot: instance 9900's console output]

The problem: a single message was sent to the topic, yet every consumer instance consumed it. This is the expected publish-subscribe behavior: when no group is configured, Spring Cloud Stream assigns each binding its own anonymous consumer group, so every instance receives every message.

How to fix duplicate consumption

Configure a group on both the 9900 and 9901 instances:

Setting spring.cloud.stream.bindings.myInput.group: group-a puts all consumers into the same group. By Kafka's definition, a message is delivered to only one consumer within a group (Kafka's internal mechanics are not covered here; several mechanisms decide which group member consumes each partition).

spring:
  ....
  cloud:
    stream:
      binders:
        myKafka: # custom binder name
          type: kafka # binder type

      # Bind the channels
      bindings:
        myOutput: # MyChannel's output channel name
          destination: mytest    # topic the messages are sent to
          content-type: application/json    # producer-side content type; the consumer does not need to set it
          binder: myKafka # refers to the binder defined under binders
        myInput: # MyChannel's input channel name
          destination: mytest    # topic the messages are received from
          group: group-a # the key line: define the consumer group
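With a group configured, parallelism inside the group is capped by the topic's partition count: group members beyond the number of partitions sit idle. To actually process in parallel, raise the partition count (min-partition-count / partition-count above) and, if needed, the per-instance consumer concurrency. A sketch, using the standard consumer concurrency property:

spring:
  cloud:
    stream:
      bindings:
        myInput:
          destination: mytest
          group: group-a
          consumer:
            concurrency: 2 # two consumer threads in this instance; effective only with >= 2 partitions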

Now send another message: myChannelSend.sendMsg("myChannelSend xiao :" + LocalDateTime.now()); The output:

Instance 9900 consumed it:

[Screenshot: instance 9900's console output]

Instance 9901 did not:

[Screenshot: instance 9901's console, showing no new output]