5.2 Kafka Streams API: Word Count
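© Copyright belongs to the author: this is an original work by 51CTO blog author wx63560c7d74933. Contact the author for permission to reprint; unauthorized reproduction will be subject to legal action.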
1. Code example
(1). Add the dependency
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>2.3.0</version>
</dependency>
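(2). Code
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class StreamSample {
    private static final String TOPIC_INPUT = "steven-stream-in";
    private static final String TOPIC_OUT = "steven-stream-out";

    public static void main(String[] args) {
        // Streams configuration
        Properties props = new Properties();
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordCount");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Build the stream topology
        final StreamsBuilder builder = new StreamsBuilder();
        // Define the stream processing steps
        wordCountStream(builder);
        final KafkaStreams streams = new KafkaStreams(builder.build(), props);
        // Close the Streams client cleanly when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }

    /**
     * Defines the stream processing steps.
     *
     * @param builder the builder to which the word-count topology is added
     */
    static void wordCountStream(final StreamsBuilder builder) {
        // A KStream is an abstraction of a record stream: it continuously reads
        // new records from TOPIC_INPUT and appends them to the stream.
        KStream<String, String> source = builder.stream(TOPIC_INPUT);
        /*
         * A KTable is an abstraction of a data set that holds the latest value per key.
         * flatMapValues: splits one record into several. For example, the record
         *     (key 1, value "Hello World") becomes, after lowercasing,
         *     (key 1, value "hello") and (key 1, value "world").
         * groupBy: re-keys each record by its value (the word) and groups by that key.
         * count: counts the records seen so far for each key.
         */
        final KTable<String, Long> count = source
                .flatMapValues(value -> Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" ")))
                .groupBy((key, value) -> value)
                .count();
        // Write the results to TOPIC_OUT
        count.toStream().to(TOPIC_OUT, Produced.with(Serdes.String(), Serdes.Long()));
    }
}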
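To make the three steps described in the comment concrete, here is a minimal sketch in plain Java with no Kafka dependency. It only imitates what flatMapValues, groupBy and count do to a single record; the class WordCountSketch and its two helper methods are hypothetical names introduced for illustration and are not part of the Kafka Streams API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class: imitates the topology's steps on in-memory data.
class WordCountSketch {

    // Like flatMapValues: one (key, line) record becomes one (key, word) record per word.
    static List<Map.Entry<String, String>> flatMapValues(String key, String line) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String word : line.toLowerCase(Locale.ROOT).split(" ")) {
            out.add(new SimpleEntry<>(key, word));
        }
        return out;
    }

    // Like groupBy((key, value) -> value).count(): the word becomes the key,
    // and the result holds the number of records seen per word.
    static Map<String, Long> groupByValueAndCount(List<Map.Entry<String, String>> records) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : records) {
            counts.merge(record.getValue(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> records = flatMapValues("1", "Hello World");
        System.out.println(records);                       // [1=hello, 1=world]
        System.out.println(groupByValueAndCount(records)); // {hello=1, world=1}
    }
}
```

The original key ("1") survives flatMapValues unchanged and is only discarded at the groupBy step, which is why Kafka Streams has to repartition the data by word before counting.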
2. Running the code
(1). Start a producer
Open a cmd terminal and, in the E:\Kafka\kafka_2.12-1.1.0\bin\windows directory, run the following command to start a console producer on the input topic.
kafka-console-producer.bat --broker-list localhost:9092 --topic steven-stream-in

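(2). Start a consumer
Open a second cmd terminal and, in the E:\Kafka\kafka_2.12-1.1.0\bin\windows directory, run the following command to start a console consumer on the output topic. The extra properties print both key and value and deserialize the value as a Long, because the counts are written with Serdes.Long(). The --zookeeper option works with the Kafka 1.1.0 tools used here; it was removed in Kafka 2.0, where --bootstrap-server localhost:9092 is used instead.
kafka-console-consumer.bat --zookeeper localhost:2181 --topic steven-stream-out ^
--property print.key=true --property print.value=true ^
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer ^
--property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer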
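The value deserializer matters because the counts are not written as text. The following plain-Java sketch shows the idea under the assumption that a long is stored as 8 big-endian bytes, which is how Kafka's long serializer encodes values; the class LongValueSketch and its methods are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: shows why a count must be read back as a long.
class LongValueSketch {

    // Encodes a count as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encode(long count) {
        return ByteBuffer.allocate(Long.BYTES).putLong(count).array();
    }

    // Decodes 8 big-endian bytes back into a long.
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        byte[] raw = encode(3L);
        System.out.println(raw.length);   // 8
        System.out.println(decode(raw));  // 3
        // Reading the same bytes as text gives control characters, not "3".
        String asText = new String(raw, StandardCharsets.UTF_8);
        System.out.println(asText.equals("3")); // false
    }
}
```

If the consumer were started without the LongDeserializer property, the values would be printed as unreadable characters for the same reason.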

(3). Enter data in the producer terminal
hello world sherry
hello world steven
hello sherry steven

(4). View the results in the consumer terminal
hello 1
world 1
sherry 1
world 2
hello 3
sherry 2
steven 2
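The output above skips some intermediate counts: there is no "hello 2" and no "steven 1". A likely explanation is the Kafka Streams record cache, which can collapse several updates to the same key before they are forwarded downstream. The sketch below is plain Java with no Kafka dependency and is only a simplified model under that assumption. The class UpdateStreamSketch, the method emit, and the batch boundaries are all hypothetical; they were chosen to show how buffering the second and third input lines together would produce the sequence seen here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a count table whose updates are buffered per batch.
class UpdateStreamSketch {

    // Processes batches of lines. After each batch, emits the latest count of
    // every word touched in that batch, ordered by most recent update.
    static List<String> emit(List<List<String>> batches) {
        Map<String, Long> counts = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (List<String> batch : batches) {
            Map<String, Long> dirty = new LinkedHashMap<>();
            for (String line : batch) {
                for (String word : line.split(" ")) {
                    long updated = counts.merge(word, 1L, Long::sum);
                    dirty.remove(word);       // re-inserting moves the word to the end
                    dirty.put(word, updated);
                }
            }
            dirty.forEach((word, count) -> out.add(word + " " + count));
        }
        return out;
    }

    public static void main(String[] args) {
        String l1 = "hello world sherry";
        String l2 = "hello world steven";
        String l3 = "hello sherry steven";

        // Flushing after every line: all nine updates are emitted.
        System.out.println(emit(Arrays.asList(
                Arrays.asList(l1), Arrays.asList(l2), Arrays.asList(l3))));
        // [hello 1, world 1, sherry 1, hello 2, world 2, steven 1, hello 3, sherry 2, steven 2]

        // Lines 2 and 3 buffered together: intermediate updates are collapsed.
        System.out.println(emit(Arrays.asList(
                Arrays.asList(l1), Arrays.asList(l2, l3))));
        // [hello 1, world 1, sherry 1, world 2, hello 3, sherry 2, steven 2]
    }
}
```

Either way the final count per word is the same (hello 3, world 2, sherry 2, steven 2); only the number of intermediate updates differs, so the exact sequence can vary from run to run.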
3. Troubleshooting
(1). Exception
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/kafka/common/requests/IsolationLevel
at org.apache.kafka.streams.StreamsConfig.<clinit>(StreamsConfig.java:730)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:544)
at com.example.study.kafka.StreamSample.main(StreamSample.java:38)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.requests.IsolationLevel
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 3
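(2). Solution
The error means that the kafka-clients jar on the classpath does not contain the IsolationLevel class that kafka-streams 2.3.0 expects, which indicates that the two artifacts have different versions. Set kafka, kafka-clients and kafka-streams to the same version.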
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.12</artifactId>
    <version>2.3.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>2.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>2.3.0</version>
</dependency>
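To confirm which jars actually end up on the classpath, a small check like the one below can be run inside the same project. It is a sketch that uses only standard reflection; the class ClasspathCheck and its methods are hypothetical names introduced here. Running mvn dependency:tree gives the same information from the build side.

```java
import java.security.CodeSource;

// Hypothetical diagnostic class: checks whether a class is loadable and from where.
class ClasspathCheck {

    // Returns true if the named class can be found, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the jar or directory a class was loaded from, or "unknown"
    // when the JVM does not report a location (for example for JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "unknown" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        String expected = "org.apache.kafka.common.requests.IsolationLevel";
        System.out.println(expected + " present: " + isPresent(expected));
        System.out.println("This class was loaded from: " + locationOf(ClasspathCheck.class));
    }
}
```

If the check reports that the class is absent, passing a class that is known to come from kafka-clients to locationOf shows which jar version was picked up, and that version can then be aligned in the pom.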