1. Kafka
Kafka was originally developed at LinkedIn. It is a distributed, partitioned, replicated message system that relies on ZooKeeper for coordination.
- Written in Java and Scala
- Consumption model: pull
- High throughput, low latency: Kafka can process hundreds of thousands of messages per second, with latency as low as a few milliseconds
- Scalability: a Kafka cluster can be expanded while running (hot expansion)
- Durability and reliability: messages are persisted to local disk (and served to consumers via the zero-copy mechanism), and replication protects against data loss
- High concurrency: thousands of clients can read and write at the same time
- Ordering guarantee: processing order often matters; Kafka guarantees message order within each partition, though not across partitions (see the keyed-producer sketch after this list)
- With ZooKeeper handling coordination, brokers, producers, and consumers are all natively distributed, and load balancing happens automatically
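Because ordering is scoped to a partition, producers that need related messages consumed in order should route them to the same partition, typically by giving them the same key. A minimal sketch using the Confluent.Kafka client introduced later in this article (the topic name "orders" and the broker address are assumptions):

using System;
using Confluent.Kafka;

class KeyedProducerSketch
{
    static void Main()
    {
        var conf = new ProducerConfig { BootstrapServers = "192.168.1.105:9092" };
        using (var p = new ProducerBuilder<string, string>(conf).Build())
        {
            // Messages with the same key hash to the same partition,
            // so the steps of one order are consumed in the order produced.
            for (int i = 0; i < 3; i++)
            {
                p.Produce("orders", new Message<string, string>
                {
                    Key = "order-1001", // same key => same partition
                    Value = $"step {i} of order-1001"
                });
            }
            // Block until queued messages are delivered (or 10 seconds pass).
            p.Flush(TimeSpan.FromSeconds(10));
        }
    }
}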
2. Use Cases
- Asynchronous processing (store the message now, process it later)
- System decoupling (reduce coupling between systems)
- Peak shaving (flash sales and other traffic spikes)
- Log collection (aggregating large volumes of logs)
- …
3. Installing Kafka
Install with Docker, using three images:
- zookeeper: wurstmeister/zookeeper (coordinates the Kafka cluster)
- kafka: wurstmeister/kafka (broker server node)
- kafka-manager: sheepkiller/kafka-manager (web UI for Kafka)
version: '3'
services:
  zookeeper1:
    image: wurstmeister/zookeeper
    container_name: kafka-zookeeper1
    hostname: zookeeper1
    ports:
      - "2181:2181"
    networks:
      - kafka_net
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888
  kafka1:
    image: wurstmeister/kafka
    container_name: kafka-kafka1
    hostname: kafka1
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper1
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_HOST_NAME: 192.168.1.105 # host machine IP
      KAFKA_MESSAGE_MAX_BYTES: 2000000
      # Format: name:partitions:replicas[:cleanup.policy]; note that a
      # replication factor of 3 requires at least 3 brokers to succeed.
      KAFKA_CREATE_TOPICS: "Topic1:1:3,Topic2:1:1:compact"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper1:2181
      JVM_XMS: "256M"
      JVM_XMX: "512M"
    networks:
      - kafka_net
  kafka-manager:
    container_name: kafka-manager
    image: sheepkiller/kafka-manager
    hostname: kafka-manager
    ports:
      - "9000:9000"
    networks:
      - kafka_net
    depends_on:
      - zookeeper1
    environment:
      ZK_HOSTS: zookeeper1:2181
      APPLICATION_SECRET: letmein
      KAFKA_MANAGER_AUTH_ENABLED: "true"
      KAFKA_MANAGER_USERNAME: admin
      KAFKA_MANAGER_PASSWORD: 123456
    restart: always
networks:
  kafka_net:
    # Creates a bridge network for inter-container communication. Note that the
    # actual network name is prefixed with the name of the folder containing
    # docker-compose.yml; with the .yml file in a folder named hbl, the
    # generated network is hbl_kafka_net.
    driver: bridge
    # external: true  # use this instead if an existing external network should be reused
After starting the stack (run docker-compose up -d in the folder containing the file), browse to http://localhost:9000/ to reach the Kafka Manager dashboard shown below.
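Before wiring up producers and consumers, it can help to confirm from code that the broker is reachable. A minimal sketch using the AdminClient from the Confluent.Kafka package introduced in the next section (the address mirrors KAFKA_ADVERTISED_HOST_NAME above; adjust it to your host IP):

using System;
using Confluent.Kafka;

class BrokerCheck
{
    static void Main()
    {
        var config = new AdminClientConfig { BootstrapServers = "192.168.1.105:9092" };
        using (var admin = new AdminClientBuilder(config).Build())
        {
            // Request cluster metadata; throws a KafkaException if no broker responds.
            var meta = admin.GetMetadata(TimeSpan.FromSeconds(10));
            foreach (var broker in meta.Brokers)
                Console.WriteLine($"Broker {broker.BrokerId}: {broker.Host}:{broker.Port}");
            foreach (var topic in meta.Topics)
                Console.WriteLine($"Topic {topic.Topic} ({topic.Partitions.Count} partitions)");
        }
    }
}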
4. .NET Core Code Examples
Add the Confluent.Kafka package from NuGet:
Install-Package Confluent.Kafka -Version 1.3.0
- Kafka.Consume (consumer side)
using System;
using System.Threading;
using Confluent.Kafka;

namespace Kafka.Consume
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Random instance name so multiple consumer processes can be told apart.
                var instance = $"{new Random().Next(10)}";
                Console.WriteLine($"Kafka consumer starting... instance name: {instance}");
                Consume(instance);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Kafka consumer exception: " + ex);
            }
        }

        static void Consume(string instance)
        {
            var groupId = "consumer-group-test";
            var conf = new ConsumerConfig
            {
                GroupId = groupId,
                BootstrapServers = "192.168.1.105:9092",
                // Start from the earliest offset when the group has no committed offset yet.
                AutoOffsetReset = AutoOffsetReset.Earliest
            };
            var topic = "topic-test";
            using (var c = new ConsumerBuilder<Ignore, string>(conf).Build())
            {
                c.Subscribe(topic);
                Console.WriteLine($"Consumer: {instance}, GroupID: {groupId}, subscribed to topic [{topic}]!");
                CancellationTokenSource cts = new CancellationTokenSource();
                Console.CancelKeyPress += (_, e) =>
                {
                    e.Cancel = true; // prevent the process from terminating.
                    cts.Cancel();
                };
                try
                {
                    while (true)
                    {
                        try
                        {
                            var cr = c.Consume(cts.Token);
                            Console.WriteLine($"Consumer: {instance}, consumed message: '{cr.Value}' at: '{cr.TopicPartitionOffset}'.");
                        }
                        catch (ConsumeException e)
                        {
                            Console.WriteLine($"Error occurred: {e.Error.Reason}");
                        }
                    }
                }
                catch (OperationCanceledException)
                {
                    c.Close();
                }
            }
        }
    }
}
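Note that c.Close() (reached via the OperationCanceledException handler on Ctrl+C) commits final offsets when auto-commit is enabled (the default) and leaves the group cleanly, so the remaining members of consumer-group-test are rebalanced onto the freed partitions.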
- Kafka.Produce (producer side)
using System;
using System.Threading;
using Confluent.Kafka;

namespace Kafka.Produce
{
    class Program
    {
        public static void Main(string[] args)
        {
            var topic = "topic-test";
            var conf = new ProducerConfig { BootstrapServers = "192.168.1.105:9092" };

            // Delivery handler: invoked once per message with the broker's ack or an error.
            Action<DeliveryReport<Null, string>> handler = r =>
                Console.WriteLine(!r.Error.IsError
                    ? $"Delivered message to {r.TopicPartitionOffset}"
                    : $"Delivery Error: {r.Error.Reason}");

            using (var p = new ProducerBuilder<Null, string>(conf).Build())
            {
                for (int i = 0; i < 100000; ++i)
                {
                    var message = "Message: " + i;
                    p.Produce(topic, new Message<Null, string> { Value = message }, handler);
                    Console.WriteLine($"Producer, topic: {topic}, sent message: {message}");
                    Thread.Sleep(1000); // throttle to one message per second for the demo
                }
                // wait for up to 10 seconds for any inflight messages to be delivered.
                p.Flush(TimeSpan.FromSeconds(10));
            }
        }
    }
}
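Produce here is asynchronous: it enqueues the message locally and returns at once, and the delivery handler fires later on a background thread once the broker acknowledges. The closing Flush call blocks until all queued messages are delivered or 10 seconds elapse, so nothing is silently dropped on exit; ProduceAsync is the awaitable alternative when the delivery result is needed inline.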
- Start two consumers and one producer in separate terminals; the consumer/producer/broker relationship is shown in the diagram below:
- After the programs start:
5. Summary
- Keep the number of consumers <= the number of partitions; otherwise some consumers can never be assigned a partition and their server resources are wasted
- If a consumer sets no GroupId, a default one is assigned internally (note that the Confluent.Kafka client requires a GroupId before Subscribe will succeed)
a. When two consumers with different GroupIds subscribe to the same topic, each receives every message. The same message can therefore drive different work, e.g. Consumer1 runs statistics while Consumer2 processes logs asynchronously; this is one of Kafka's stronger points (see the sketch below)
b. When two consumers share a GroupId and subscribe to the same topic, keep the consumer count <= the partition count, or only one of them will receive messages; if you then stop Consumer1, Consumer2 takes over its partitions and keeps consuming
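To get either behavior, only the GroupId needs to change between consumer processes; everything else matches the consumer code above. A minimal sketch (the group names "order-workers" and "order-audit" are illustrative):

using Confluent.Kafka;

class GroupSemantics
{
    // Same GroupId across processes: partitions are divided among them,
    // so each message is handled by exactly one member (work-queue style).
    static readonly ConsumerConfig WorkerConfig = new ConsumerConfig
    {
        GroupId = "order-workers",
        BootstrapServers = "192.168.1.105:9092",
        AutoOffsetReset = AutoOffsetReset.Earliest
    };

    // Different GroupId: this process gets its own full copy of the
    // stream, independently of the workers (broadcast style).
    static readonly ConsumerConfig AuditConfig = new ConsumerConfig
    {
        GroupId = "order-audit",
        BootstrapServers = "192.168.1.105:9092",
        AutoOffsetReset = AutoOffsetReset.Earliest
    };
}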