1. First, a picture of the end result

(Figure: Flume two-agent cascade collection, HDFS output)

(Figure: Flume two-agent cascade collection, big-data pipeline)

  • The first agent collects data from a file and sends it over the network to the second agent; the second agent receives the data sent by the first agent and saves it to HDFS.

2. Hands-on: first install Flume on both nodes

Flume getting-started installation tutorial

At this point Flume is installed on both nodes:


  • Master node: node09
  • Slave node: node10

Step 1: Create the Flume configuration file on node10

1. Go into the conf directory under the Flume installation directory

2. Edit the tail-avro-avro-logger.conf file

vim tail-avro-avro-logger.conf

3. Replace the original contents with the following

##################
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /export/taillogs/access_log

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

## The avro sink is a data sender
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 192.168.52.120
a1.sinks.k1.port = 4141

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Step 2: Add a script that appends to the log file

mkdir -p /export/shells/
cd /export/shells/
vim tail-file.sh

Add the following content

#!/bin/bash
while true
do
date >> /export/taillogs/access_log;
sleep 0.5;
done
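The loop above appends one timestamp line every half second, forever. A bounded variant that writes to a temporary file is a quick way to sanity-check the logic before pointing it at the real log (the temp file and iteration count here are illustrative, not part of the original setup):

```shell
#!/bin/bash
# Bounded sanity check for the append loop in tail-file.sh:
# append three timestamp lines to a temporary file, then count them.
LOGFILE=$(mktemp)
for i in 1 2 3
do
  date >> "$LOGFILE"
  sleep 0.1
done
wc -l < "$LOGFILE"
rm -f "$LOGFILE"
```

It should report 3 lines, one per `date` call.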

Step 3: Configure Flume on node09

  • In the conf directory under the Flume installation directory, edit avro-hdfs.conf
vim avro-hdfs.conf
  • Add the following content
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

## The avro source is a receiver service
a1.sources.r1.type = avro
a1.sources.r1.bind = 192.168.52.120
a1.sources.r1.port = 4141

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://node01:8020/avro

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
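With only hdfs.path set, the sink uses its defaults (SequenceFile output and small roll thresholds), which tends to produce many small files on HDFS. A few commonly tuned options, shown with illustrative values (the property names are standard Flume HDFS sink options; the values are examples, not from the original setup):

```
# Optional roll/format settings for the HDFS sink (illustrative values)
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0
```

DataStream writes plain text instead of SequenceFiles, and rollCount = 0 disables event-count-based rolling.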

3. Start the agents in order

  • Start node09 first (the avro receiver)
cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent -c conf -f conf/avro-hdfs.conf -n a1 -Dflume.root.logger=INFO,console
  • Then start node10 (the sender)
cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/
bin/flume-ng agent -c conf -f conf/tail-avro-avro-logger.conf -n a1 -Dflume.root.logger=INFO,console

4. Test

  • On node10, run the shell script to generate the log data
mkdir -p /export/taillogs/
cd /export/shells
sh tail-file.sh

5. The result:

(Figure: Flume two-agent cascade collection, data landed in HDFS)