4-1 - Course Outline

 

Distributed message queue: Kafka

This chapter covers:

  • Kafka overview
  • Kafka architecture and core concepts
  • Kafka deployment and usage
  • Kafka fault-tolerance testing
  • Kafka API programming
  • Kafka in practice

4-2 - Kafka Overview

Official site: http://kafka.apache.org/

Kafka is similar to a messaging system: message middleware sits between producers and consumers. By analogy:

Steamed-bun shop: the producer

You: the consumer

Steamed buns: the data stream

In the normal case, one bun is produced and one bun is consumed.

Other cases:

1. Production keeps going, but while eating some bun you get stuck (a machine failure); from then on, buns are lost.

2. Production keeps going and buns are made faster than you can eat them; again, buns are lost.

Solution:

Get a bowl/basket: freshly made buns go into the basket first, and you take them out of the basket when you want to eat.

The basket/bowl is Kafka.

What if the basket is full and no more buns fit?

Prepare a few more baskets: that is Kafka scaling out.
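The analogy maps directly onto a bounded buffer. Below is a toy Java sketch (an illustration only, not Kafka code; all names are made up): the blocking queue plays the basket's role, absorbing the speed difference between producer and consumer so that no bun is lost.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BunShop {
    public static void main(String[] args) {
        // The basket: holds at most 10 buns; a full basket makes the producer wait.
        final BlockingQueue<String> basket = new ArrayBlockingQueue<String>(10);

        // The bun shop (producer): keeps making buns.
        new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; ; i++) {
                        basket.put("bun_" + i); // blocks while the basket is full
                    }
                } catch (InterruptedException ignored) { }
            }
        }).start();

        // You (consumer): eat at your own pace; nothing is lost while you are slow.
        new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        System.out.println("ate " + basket.take()); // blocks while empty
                        Thread.sleep(1000);
                    }
                } catch (InterruptedException ignored) { }
            }
        }).start();
    }
}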

4-3 - Kafka Architecture and Core Concepts

 

[Figure: Kafka architecture diagram]

Kafka architecture:

producer: the producer, i.e., the bun shop making the buns

consumer: the consumer, i.e., the one eating the buns

broker: the basket

topic: a label attached to the buns; buns in topic A are for you, and buns in topic B are for your younger brother

First a few concepts:

  • Kafka is run as a cluster on one or more servers that can span multiple datacenters.
  • The Kafka cluster stores streams of records in categories called topics.
  • Each record consists of a key, a value, and a timestamp.
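That record structure is visible directly in the Java client. A minimal sketch (assuming the newer org.apache.kafka.clients library, version 0.10+, rather than the legacy API used later in this course; the class name and values are placeholders):

import org.apache.kafka.clients.producer.ProducerRecord;

public class RecordAnatomy {
    public static void main(String[] args) {
        // A record is (key, value, timestamp), addressed to a topic.
        // The partition is left null so Kafka chooses one.
        ProducerRecord<String, String> record = new ProducerRecord<String, String>(
                "hello_topic", null, System.currentTimeMillis(), "key1", "value1");
        System.out.println(record);
    }
}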

4-4 - Kafka Single-Node Single-Broker Deployment: Installing ZooKeeper

 

Kafka deployment and usage:

Single-node single-broker deployment and usage

Single-node multi-broker deployment and usage

Multi-node multi-broker deployment and usage

Reference: http://kafka.apache.org/quickstart

Step 1: Download the code

Download the 1.1.0 release and un-tar it.

tar -xzf kafka_2.11-1.1.0.tgz

cd kafka_2.11-1.1.0

Step 2: Start the server

Kafka uses ZooKeeper so you need to first start a ZooKeeper server if you don't already have one.

Install ZooKeeper:

tar -zxvf zookeeper-3.4.5-cdh5.7.0.tar.gz -C ~/app

Set the ZooKeeper environment variables:

vi ~/.bash_profile

export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0

export PATH=$ZK_HOME/bin:$PATH

source ~/.bash_profile

Copy conf/zoo_sample.cfg to conf/zoo.cfg and point the ZooKeeper data directory at a real path:

dataDir=/home/hadoop/app/tmp/zk

Start ZooKeeper:

cd $ZK_HOME/bin

./zkServer.sh start

 

4-5 - Kafka Single-Node Single-Broker Deployment and Usage

 

See the official quickstart:

http://kafka.apache.org/quickstart

Step 1: Download Kafka

wget https://archive.apache.org/dist/kafka/0.9.0.0/kafka_2.11-0.9.0.0.tgz

Step 2: Un-tar Kafka

tar -zxvf kafka_2.11-0.9.0.0.tgz -C ~/app

Step 3: Set the environment variables in vi ~/.bash_profile

export KAFKA_HOME=/home/hadoop/app/kafka_2.11-0.9.0.0

export PATH=$KAFKA_HOME/bin:$PATH

source ~/.bash_profile

Step 4: Edit the Kafka configuration in $KAFKA_HOME/config/server.properties. The key settings (the values shown are examples for a single local broker; adjust host names and paths to your environment):

broker.id=0

listeners=PLAINTEXT://:9092

host.name=localhost

log.dirs=/home/hadoop/app/tmp/kafka-logs

zookeeper.connect=localhost:2181

Start Kafka:

bin/kafka-server-start.sh config/server.properties

 


Equivalently, since $KAFKA_HOME/bin is on the PATH:

kafka-server-start.sh $KAFKA_HOME/config/server.properties

Step 5: Verify that the broker started

jps --> a Kafka process should appear

Step 6: Create a topic (the topic is registered in ZooKeeper)

bin/kafka-topics.sh --create --zookeeper hadoop:2181 --replication-factor 1 --partitions 1 --topic hello_topic

Step 7: List all topics (queried from ZooKeeper)

bin/kafka-topics.sh --list --zookeeper localhost:2181

Step 8: Send messages (to the broker)

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic hello_topic

Step 9: Consume messages (offsets tracked in ZooKeeper)

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic hello_topic --from-beginning

(On newer Kafka releases the console consumer takes --bootstrap-server localhost:9092 instead of --zookeeper; the --zookeeper form matches the 0.8/0.9 releases used here.)

 

4-6 - Kafka Single-Node Multi-Broker Deployment and Usage

Reference: http://kafka.apache.org/quickstart

Step 6: Setting up a multi-broker cluster

First we make a config file for each of the brokers (on Windows use the copy command instead):

cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties

Now edit these new files and set the following properties:

config/server-1.properties:
broker.id=1
listeners=PLAINTEXT://:9093
log.dir=/tmp/kafka-logs-1
 
config/server-2.properties:
broker.id=2
listeners=PLAINTEXT://:9094
log.dir=/tmp/kafka-logs-2

The broker.id property is the unique and permanent name of each node in the cluster.

bin/kafka-server-start.sh config/server-1.properties &
bin/kafka-server-start.sh config/server-2.properties &

Now create a new topic with a replication factor of three:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

Okay but now that we have a cluster how can we know which broker is doing what? To see that run the "describe topics" command:

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
Topic: my-replicated-topic  Partition: 0    Leader: 1   Replicas: 1,2,0 Isr: 1,2,0

Here "leader" is the node responsible for all reads and writes for the partition, "replicas" is the list of nodes that replicate this partition's log, and "isr" is the set of replicas currently in sync with the leader.

We can run the same command on the original topic we created to see where it is:

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Topic:test  PartitionCount:1    ReplicationFactor:1 Configs:
Topic: test Partition: 0    Leader: 0   Replicas: 0 Isr: 0
Let's publish a few messages to our new topic:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
...
my test message 1
my test message 2
^C

Now let's consume these messages:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic
...
my test message 1
my test message 2
^C

4-7 - Kafka Fault-Tolerance Testing

 

With a topic replicated across three brokers, you can kill two of them and the surviving replica is elected leader, so the topic stays readable and writable; rerunning the "describe topics" command after a kill shows the new leader. Fault tolerance is therefore good.

 

4-8 - Building the Development Environment with IDEA + Maven

Kafka API programming:

Building the development environment with IDEA + Maven

Using the Producer API

Using the Consumer API

Source code:

https://gitee.com/sag888/big_data/tree/master/Spark%20Streaming%E5%AE%9E%E6%97%B6%E6%B5%81%E5%A4%84%E7%90%86%E9%A1%B9%E7%9B%AE%E5%AE%9E%E6%88%98/project/l2118i/sparktrain

 

1. Add the Kafka dependency (in the linked project's POM it is wrapped in <!-- --> comment markers, which must be removed for it to take effect; ${kafka.version} must be defined under <properties>, e.g. 0.9.0.0 to match the broker installed earlier):

<!-- Kafka dependency -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
<version>${kafka.version}</version>
</dependency>


2. Create the project directory structure (see the linked project for reference).

 

4-9 - Kafka Producer Java API Programming

 

Code:

https://gitee.com/sag888/big_data/tree/master/Spark%20Streaming%E5%AE%9E%E6%97%B6%E6%B5%81%E5%A4%84%E7%90%86%E9%A1%B9%E7%9B%AE%E5%AE%9E%E6%88%98/project/l2118i/sparktrain/src/main/java/com/imooc/spark/kafka

 

源码:

Step 1: Common configuration

package com.imooc.spark.kafka;

/**
 * Common Kafka configuration values used by the producer and consumer below.
 */
public class KafkaProperties {

    public static final String ZK = "192.168.199.111:2181";
    public static final String TOPIC = "hello_topic";
    public static final String BROKER_LIST = "192.168.199.111:9092";
    public static final String GROUP_ID = "test_group1";
}
 
Step 2: The Kafka producer

package com.imooc.spark.kafka;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

import java.util.Properties;

/**
 * Kafka producer (legacy 0.8/0.9 producer API).
 */
public class KafkaProducer extends Thread {

    private String topic;

    private Producer<Integer, String> producer;

    public KafkaProducer(String topic) {
        this.topic = topic;

        Properties properties = new Properties();
        properties.put("metadata.broker.list", KafkaProperties.BROKER_LIST);
        properties.put("serializer.class", "kafka.serializer.StringEncoder");
        // acks=1: the leader must acknowledge each message.
        properties.put("request.required.acks", "1");

        producer = new Producer<Integer, String>(new ProducerConfig(properties));
    }

    @Override
    public void run() {
        int messageNo = 1;

        while (true) {
            String message = "message_" + messageNo;
            producer.send(new KeyedMessage<Integer, String>(topic, message));
            System.out.println("Sent: " + message);

            messageNo++;

            try {
                Thread.sleep(2000);  // send one message every two seconds
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
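As an aside, the same loop written against the newer org.apache.kafka.clients producer looks like the sketch below. This is a sketch under the assumption that a recent kafka-clients dependency is on the classpath; the class name ModernProducerSketch is made up, and it should live outside com.imooc.spark.kafka so the import does not clash with the course's own KafkaProducer class.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class ModernProducerSketch {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.199.111:9092");
        props.put("acks", "1"); // same guarantee as request.required.acks=1 above
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        for (int i = 1; i <= 3; i++) {
            producer.send(new ProducerRecord<String, String>("hello_topic", "message_" + i));
            System.out.println("Sent: message_" + i);
            Thread.sleep(2000);
        }
        producer.close(); // flushes buffered messages and releases connections
    }
}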
Step 3: Test class

package com.imooc.spark.kafka;

/**
 * Kafka Java API test.
 */
public class KafkaClientApp {

    public static void main(String[] args) {
        new KafkaProducer(KafkaProperties.TOPIC).start();
    }
}


Step 4: Test from the client

[Figure: console output printing one "Sent: message_N" line every two seconds]

 

 

 

4-10 - Kafka Consumer Java API Programming

 

Code:

https://gitee.com/sag888/big_data/tree/master/Spark%20Streaming%E5%AE%9E%E6%97%B6%E6%B5%81%E5%A4%84%E7%90%86%E9%A1%B9%E7%9B%AE%E5%AE%9E%E6%88%98/project/l2118i/sparktrain/src/main/java/com/imooc/spark/kafka

 

 

Step 1: The Kafka consumer

package com.imooc.spark.kafka;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/**
 * Kafka consumer (legacy high-level consumer API).
 */
public class KafkaConsumer extends Thread {

    private String topic;

    public KafkaConsumer(String topic) {
        this.topic = topic;
    }

    private ConsumerConnector createConnector() {
        Properties properties = new Properties();
        properties.put("zookeeper.connect", KafkaProperties.ZK);
        properties.put("group.id", KafkaProperties.GROUP_ID);
        return Consumer.createJavaConsumerConnector(new ConsumerConfig(properties));
    }

    @Override
    public void run() {
        ConsumerConnector consumer = createConnector();

        // How many streams (threads) to open per topic; more topics could be registered here too.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, 1);

        // Key: topic name; value: the list of message streams for that topic.
        Map<String, List<KafkaStream<byte[], byte[]>>> messageStream = consumer.createMessageStreams(topicCountMap);

        // Take the single stream we asked for on our topic.
        KafkaStream<byte[], byte[]> stream = messageStream.get(topic).get(0);

        ConsumerIterator<byte[], byte[]> iterator = stream.iterator();
        while (iterator.hasNext()) {
            String message = new String(iterator.next().message());
            System.out.println("rec: " + message);
        }
    }
}
Step 2: Test, starting both the producer and the consumer

package com.imooc.spark.kafka;

/**
 * Kafka Java API test.
 */
public class KafkaClientApp {

    public static void main(String[] args) {
        new KafkaProducer(KafkaProperties.TOPIC).start();
        new KafkaConsumer(KafkaProperties.TOPIC).start();
    }
}
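For reference, the equivalent consumer on the newer org.apache.kafka.clients API is sketched below (again assuming a recent kafka-clients dependency, 2.0+ for the Duration-based poll; the class name is made up, and it should live outside com.imooc.spark.kafka to avoid clashing with the KafkaConsumer class above). Note it needs no ZooKeeper address: it talks to the broker directly.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ModernConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.199.111:9092");
        props.put("group.id", "test_group1");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        consumer.subscribe(Collections.singletonList("hello_topic"));

        // Poll in a loop; each poll returns the records that arrived since the last one.
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("rec: " + record.value());
            }
        }
    }
}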

 

 

[Figure: console output with interleaved "Sent: message_N" and "rec: message_N" lines]

 

4-11 - Kafka in Practice: Integrating Flume and Kafka for Real-Time Data Collection

[Figures: architecture of the Flume-to-Kafka real-time data collection pipeline]