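FileChannel is one of the most important channels in Flume. It protects data integrity and consistency with a logging mechanism similar to the MySQL binlog, so events are not lost when the machine goes down or the JVM exits abnormally. When collecting large volumes of data, put the FileChannel directories on a different disk from the one that holds the application's log files to improve throughput.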

A simplified class structure of FileChannel:

[Figure: FileChannel class diagram]

FileChannel's inner transaction class, FileBackedTransaction:

[Figure: FileBackedTransaction class diagram]

The file-handling class, LogFile (LogFileV2 has been deprecated as of 1.7):

[Figure: LogFile class diagram]

A few other classes are also important:

FlumeEventQueue, LogFile, Log, LogUtils.

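I. Initialization: public void configure(Context context)

1. useDualCheckpoints (whether to keep a backup of the checkpoint)

2. compressBackupCheckpoint (whether to compress the backup checkpoint)

3. checkpointDir (checkpoint directory; the default is under ${user.home})

4. dataDirs (data directories)

5. capacity (the configured channel capacity)

6. keepAlive (timeout in seconds: the longest a put waits for a free slot when the channel is full)

7. transactionCapacity (maximum number of events per transaction)

Note: capacity must be greater than or equal to transactionCapacity, otherwise configuration fails. From the source: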

Preconditions.checkState(transactionCapacity <= capacity,
      "File Channel transaction capacity cannot be greater than the " +
        "capacity of the channel.");

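8. checkpointInterval (interval between checkpoints, in milliseconds)

9. maxFileSize (maximum size of a single data file; the default is about 1.5 GB)

10. minimumRequiredSpace (minimum free disk space required; the default is 500 MB)

11. useLogReplayV1 (use the old replay logic)

12. useFastReplay (replay without using the queue)

13. keyProvider (key provider type; supported: JCEKSFILE)

14. activeKey (name of the key used to encrypt new data)

15. cipherProvider (cipher provider type; supported: AESCTRNOPADDING)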
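To make the capacity rule concrete, here is a minimal sketch in plain Java. It is not Flume code: ChannelSettings and from() are hypothetical names, the map stands in for Flume's Context, and the two default values are assumptions chosen for illustration. It only mirrors the idea of the checkState call quoted above.

```java
import java.util.Map;

// Hypothetical holder (not a Flume class) that mirrors the capacity check in configure().
final class ChannelSettings {
    final int capacity;
    final int transactionCapacity;

    private ChannelSettings(int capacity, int transactionCapacity) {
        this.capacity = capacity;
        this.transactionCapacity = transactionCapacity;
    }

    // Reads two keys from a plain map, falling back to assumed illustrative defaults.
    static ChannelSettings from(Map<String, String> props) {
        int capacity = Integer.parseInt(props.getOrDefault("capacity", "1000000"));
        int txCapacity = Integer.parseInt(props.getOrDefault("transactionCapacity", "10000"));
        // Equal values are allowed; only a transaction larger than the channel is rejected.
        if (txCapacity > capacity) {
            throw new IllegalStateException(
                "transaction capacity cannot be greater than the capacity of the channel");
        }
        return new ChannelSettings(capacity, txCapacity);
    }
}
```

With this sketch, capacity=100 and transactionCapacity=100 is accepted, while capacity=100 and transactionCapacity=101 throws.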

 

II. The start() method:

@Override
  public synchronized void start() {
    LOG.info("Starting {}...", this);
    try {
      Builder builder = new Log.Builder();
      builder.setCheckpointInterval(checkpointInterval);
      builder.setMaxFileSize(maxFileSize);
      builder.setMinimumRequiredSpace(minimumRequiredSpace);
      builder.setQueueSize(capacity);
      builder.setCheckpointDir(checkpointDir);
      builder.setLogDirs(dataDirs);
      builder.setChannelName(getName());
      builder.setUseLogReplayV1(useLogReplayV1);
      builder.setUseFastReplay(useFastReplay);
      builder.setEncryptionKeyProvider(encryptionKeyProvider);
      builder.setEncryptionKeyAlias(encryptionActiveKey);
      builder.setEncryptionCipherProvider(encryptionCipherProvider);
      builder.setUseDualCheckpoints(useDualCheckpoints);
      builder.setCompressBackupCheckpoint(compressBackupCheckpoint);
      builder.setBackupCheckpointDir(backupCheckpointDir);
      builder.setFsyncPerTransaction(fsyncPerTransaction);
      builder.setFsyncInterval(fsyncInterval);
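      builder.setCheckpointOnClose(checkpointOnClose); // everything above copies the parameters read in configure() into the Builder
      log = builder.build();
      // builder.build() creates the Log object and tries to acquire file locks on checkpointDir
      // and the data directories; Log's private void lock(File dir) throws IOException is the method that tries to take the lock
      log.replay();
      // 1. acquire the checkpoint write lock
      // 2. find the largest fileID
      // 3. walk all data directories, read the log files, and replay each record according to its type
      // 4. flush the queue to its backing files
      open = true; // the channel is now open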
      int depth = getDepth();
      
      Preconditions.checkState(queueRemaining.tryAcquire(depth),
          "Unable to acquire " + depth + " permits " + channelNameDescriptor);
      LOG.info("Queue Size after replay: " + depth + " "
           + channelNameDescriptor);
    } catch (Throwable t) {
      open = false;
      startupError = t;
      LOG.error("Failed to start the file channel " + channelNameDescriptor, t);
      if (t instanceof Error) {
        throw (Error) t;
      }
    }
    if (open) {
      // start the channel counters
      channelCounter.start();
      channelCounter.setChannelSize(getDepth());
      channelCounter.setChannelCapacity(capacity);
    }
    super.start();
  }

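The org.apache.flume.channel.file.Log class writes events to disk and stores pointers to those events in an in-memory queue, FlumeEventQueue. It also starts a background thread that writes a checkpoint via log.writeCheckpoint() every checkpointInterval milliseconds.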

workerExecutor.scheduleWithFixedDelay(new BackgroundWorker(this),
        this.checkpointInterval, this.checkpointInterval,
        TimeUnit.MILLISECONDS);

 

static class BackgroundWorker implements Runnable {
    private static final Logger LOG = LoggerFactory
        .getLogger(BackgroundWorker.class);
    private final Log log;

    public BackgroundWorker(Log log) {
      this.log = log;
    }

    @Override
    public void run() {
      try {
        if (log.open) {
          log.writeCheckpoint();
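          // Flushes checkpoint, inflightTakes and inflightPuts to disk: rebuilds inflightPuts, inflightTakes and checkpoint.meta in that order,
          // then updates the checkpoint file and syncs it. All of these files live under checkpointDir.
          // It also updates the log-ID.meta files and is responsible for deleting log files and their meta files.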
        }
      } catch (IOException e) {
        LOG.error("Error doing checkpoint", e);
      } catch (Throwable e) {
        LOG.error("General error in checkpoint worker", e);
      }
    }
  }
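The scheduling pattern above can be tried in isolation. The sketch below is a stand-alone illustration, not Flume code: PeriodicCheckpointDemo and runCheckpoints are hypothetical names and the task body is a stand-in for log.writeCheckpoint(). It also shows why BackgroundWorker catches Throwable: with scheduleWithFixedDelay, an exception that escapes the task suppresses every later run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the task body stands in for log.writeCheckpoint().
final class PeriodicCheckpointDemo {
    // Schedules a fake checkpoint with a fixed delay and waits until it has run
    // `runs` times. Returns true if all runs happened within five seconds.
    static boolean runCheckpoints(int runs, long intervalMillis) {
        ScheduledExecutorService worker = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(runs);
        AtomicInteger attempts = new AtomicInteger();
        try {
            worker.scheduleWithFixedDelay(() -> {
                try {
                    if (attempts.incrementAndGet() == 1) {
                        // Simulate a failing checkpoint on the first run.
                        throw new IllegalStateException("simulated checkpoint failure");
                    }
                } catch (Throwable t) {
                    // Swallow the error so the schedule keeps going; an exception
                    // escaping the task would cancel all subsequent executions.
                } finally {
                    done.countDown();
                }
            }, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            worker.shutdownNow();
        }
    }
}
```

Calling PeriodicCheckpointDemo.runCheckpoints(3, 10) completes all three runs even though the first one fails internally.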

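III. Transactions

Many of the methods mirror those of MemoryChannel's transaction class: doTake(), doCommit(), doRollback(), doPut().

Each of them is covered in detail below.

1. doPut(): called when a source puts an event into the channel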

@Override
    protected void doPut(Event event) throws InterruptedException {
      channelCounter.incrementEventPutAttemptCount();
      if(putList.remainingCapacity() == 0) { // no room left in this transaction's put list
        throw new ChannelException("Put queue for FileBackedTransaction " +
            "of capacity " + putList.size() + " full, consider " +
            "committing more frequently, increasing capacity or " +
            "increasing thread count. " + channelNameDescriptor);
      }
      // this does not need to be in the critical section as it does not
      // modify the structure of the log or queue.
      if(!queueRemaining.tryAcquire(keepAlive, TimeUnit.SECONDS)) { // wait up to keepAlive seconds for a free slot
        throw new ChannelFullException("The channel has reached it's capacity. "
            + "This might be the result of a sink on the channel having too "
            + "low of batch size, a downstream system running slower than "
            + "normal, or that the channel capacity is just too low. "
            + channelNameDescriptor);
      }
      boolean success = false;
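      // Take the checkpoint read lock. doTake() takes the same lock, but it is a shared lock:
      // puts and takes can run concurrently and only wait while a checkpoint holds the write lock.
      log.lockShared();
      try {
        // transactionID comes from the incrementing counter in TransactionIDOracle.
        // Append the event to a data file (written through a RandomAccessFile) and get its pointer back.
        FlumeEventPointer ptr = log.put(transactionID, event);
        Preconditions.checkState(putList.offer(ptr), "putList offer failed "
          + channelNameDescriptor);
        // Register the pointer and transaction ID with the queue as an in-flight put;
        // in-flight puts are what ends up in the inflightputs file.
        queue.addWithoutCommit(ptr, transactionID);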
        success = true;
      } catch (IOException e) {
        throw new ChannelException("Put failed due to IO error "
                + channelNameDescriptor, e);
      } finally {
        log.unlockShared(); // release the read lock
        if(!success) {
          // release slot obtained in the case
          // the put fails for any reason
          queueRemaining.release(); // give the semaphore permit back
        }
      }
    }
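The role of queueRemaining in doPut() is easier to see in a stripped-down form. The following is a minimal sketch, not Flume's implementation: CapacityGate and its methods are hypothetical names, and the timeout is in milliseconds here (the source above uses seconds) so the example runs quickly. A put reserves one permit and fails if none frees up in time; committing takes returns permits.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical capacity gate illustrating the semaphore pattern used in doPut().
final class CapacityGate {
    private final Semaphore remaining;

    CapacityGate(int capacity) {
        this.remaining = new Semaphore(capacity);
    }

    // Reserves one slot, waiting at most keepAliveMillis. Returns false when the
    // channel stays full for the whole wait (doPut throws ChannelFullException here).
    boolean reserve(long keepAliveMillis) {
        try {
            return remaining.tryAcquire(keepAliveMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns slots, as a committed take or a rolled-back put would.
    void release(int slots) {
        remaining.release(slots);
    }

    int free() {
        return remaining.availablePermits();
    }
}
```

The design choice worth noting is that the permit is reserved before the event is written and handed back in the finally block if the write fails, so a failed put never shrinks the channel's usable capacity.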

2. doTake(): called when a sink takes an event from the channel

@Override
    protected Event doTake() throws InterruptedException {
      channelCounter.incrementEventTakeAttemptCount();
      if(takeList.remainingCapacity() == 0) {
        throw new ChannelException("Take list for FileBackedTransaction, capacity " +
            takeList.size() + " full, consider committing more frequently, " +
            "increasing capacity, or increasing thread count. "
               + channelNameDescriptor);
      }
      log.lockShared(); // take the shared (read) lock
      /*
       * 1. Take an event which is in the queue.
       * 2. If getting that event does not throw NoopRecordException,
       *    then return it.
       * 3. Else try to retrieve the next event from the queue
       * 4. Repeat 2 and 3 until queue is empty or an event is returned.
       */

      try {
        while (true) {
          // Fetch the event pointer (a fileID plus an offset); the queue also tracks it as an in-flight take for this transaction.
          FlumeEventPointer ptr = queue.removeHead(transactionID);
          if (ptr == null) {
            return null;
          } else {
            try {
              // first add to takeList so that if write to disk
              // fails rollback actually does it's work
              Preconditions.checkState(takeList.offer(ptr),
                "takeList offer failed "
                  + channelNameDescriptor);
              log.take(transactionID, ptr); // write take to disk
              Event event = log.get(ptr); // read the event back from disk through the log object, using the pointer
              return event;
            } catch (IOException e) {
              throw new ChannelException("Take failed due to IO error "
                + channelNameDescriptor, e);
            } catch (NoopRecordException e) {
              LOG.warn("Corrupt record replaced by File Channel Integrity " +
                "tool found. Will retrieve next event", e);
              takeList.remove(ptr);
            } catch (CorruptEventException ex) {
              if (fsyncPerTransaction) {
                throw new ChannelException(ex);
              }
              LOG.warn("Corrupt record found. Event will be " +
                "skipped, and next event will be read.", ex);
              takeList.remove(ptr);
            }
          }
        }
      } finally {
        log.unlockShared(); // release the lock
      }
    }
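Both doPut() and doTake() call log.lockShared(). Assuming this maps to the read side of a read-write lock, which the name and the checkpoint "write lock" mentioned earlier suggest, readers do not exclude each other; only the exclusive holder does. The sketch below demonstrates that behavior with the JDK's ReentrantReadWriteLock. SharedLockDemo and its methods are hypothetical names used for illustration only.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo of shared versus exclusive locking with the JDK's read-write lock.
final class SharedLockDemo {
    // Tries the given lock once from a fresh thread and reports whether it was granted.
    private static boolean tryFromOtherThread(Lock lock) {
        boolean[] granted = new boolean[1];
        Thread t = new Thread(() -> {
            granted[0] = lock.tryLock();
            if (granted[0]) {
                lock.unlock();
            }
        });
        t.start();
        try {
            t.join(); // join makes the write to granted[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return granted[0];
    }

    // While this thread holds the read lock, can another thread read as well?
    static boolean secondReaderAdmitted() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            return tryFromOtherThread(rw.readLock());
        } finally {
            rw.readLock().unlock();
        }
    }

    // While this thread holds the read lock, can another thread take the write lock?
    static boolean writerAdmittedWhileReading() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            return tryFromOtherThread(rw.writeLock());
        } finally {
            rw.readLock().unlock();
        }
    }
}
```

In channel terms, a put and a take can hold the shared lock at the same time, while a checkpoint that needs the exclusive lock has to wait for both to finish.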

3. doCommit(): called by both sources and sinks to commit a transaction

@Override
    protected void doCommit() throws InterruptedException {
      int puts = putList.size();
      int takes = takeList.size();
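      if(puts > 0) { // a transaction holds either puts or takes, never both, so one of the two counts must be zero
        Preconditions.checkState(takes == 0, "nonzero puts and takes "
                + channelNameDescriptor);
        log.lockShared(); // take the shared lock
        try {
          log.commitPut(transactionID); // the commit record is serialized into a ByteBuffer and written to the data file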
          channelCounter.addToEventPutSuccessCount(puts);
          synchronized (queue) {
            while(!putList.isEmpty()) {
              if(!queue.addTail(putList.removeFirst())) {
                StringBuilder msg = new StringBuilder();
                msg.append("Queue add failed, this shouldn't be able to ");
                msg.append("happen. A portion of the transaction has been ");
                msg.append("added to the queue but the remaining portion ");
                msg.append("cannot be added. Those messages will be consumed ");
                msg.append("despite this transaction failing. Please report.");
                msg.append(channelNameDescriptor);
                LOG.error(msg.toString());
                Preconditions.checkState(false, msg.toString());
              }
            }
            queue.completeTransaction(transactionID); // drop this transaction's entries from the in-flight puts and takes (the inflightputs and inflighttakes files in the checkpoint directory)
          }
        } catch (IOException e) {
          throw new ChannelException("Commit failed due to IO error "
                  + channelNameDescriptor, e);
        } finally {
          log.unlockShared(); // release the lock
        }

      } else if (takes > 0) {
        log.lockShared(); // take the shared lock
        try {
          log.commitTake(transactionID); // write the commit record to the data file
          queue.completeTransaction(transactionID); // same as in the put branch above
          channelCounter.addToEventTakeSuccessCount(takes);
        } catch (IOException e) {
          throw new ChannelException("Commit failed due to IO error "
              + channelNameDescriptor, e);
        } finally {
          log.unlockShared();
        }
        queueRemaining.release(takes);
      }
      putList.clear();
      takeList.clear(); // clear both lists
      channelCounter.setChannelSize(queue.getSize());
    }

4. doRollback(): called by both sources and sinks to roll back a transaction

@Override
    protected void doRollback() throws InterruptedException {
      int puts = putList.size();
      int takes = takeList.size();
      log.lockShared();
      try {
        if(takes > 0) {
          Preconditions.checkState(puts == 0, "nonzero puts and takes "
              + channelNameDescriptor);
          synchronized (queue) {
            while (!takeList.isEmpty()) {
              Preconditions.checkState(queue.addHead(takeList.removeLast()),
                  "Queue add failed, this shouldn't be able to happen "
                      + channelNameDescriptor);
            }
          }
        }
        putList.clear();
        takeList.clear();
        queue.completeTransaction(transactionID);
        channelCounter.setChannelSize(queue.getSize());
        log.rollback(transactionID); // the rollback record is likewise serialized into a ByteBuffer and written to the log file
      } catch (IOException e) {
        throw new ChannelException("Commit failed due to IO error "
            + channelNameDescriptor, e);
      } finally {
        log.unlockShared();
        // since rollback is being called, puts will never make it on
        // to the queue and we need to be sure to release the resources
        queueRemaining.release(puts);
      }
    }
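The take branch of doRollback() pushes pointers back with removeLast() followed by addHead(). A minimal sketch with plain deques shows why that pairing restores the original order. TakeRollbackDemo is a hypothetical class, Long values stand in for FlumeEventPointer, and none of the disk logging is modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory model of the queue/takeList interplay; no disk I/O.
final class TakeRollbackDemo {
    final Deque<Long> queue = new ArrayDeque<>();    // committed event pointers
    final Deque<Long> takeList = new ArrayDeque<>(); // taken but not yet committed

    // Removes the head of the queue and remembers it in case of rollback.
    Long take() {
        Long ptr = queue.pollFirst();
        if (ptr != null) {
            takeList.addLast(ptr);
        }
        return ptr;
    }

    // Puts taken pointers back at the head, newest first, so the oldest ends up in front.
    void rollback() {
        while (!takeList.isEmpty()) {
            queue.addFirst(takeList.removeLast());
        }
    }

    // A commit simply forgets the taken pointers.
    void commit() {
        takeList.clear();
    }
}
```

Taking two events from a queue of 1, 2, 3 and rolling back leaves the queue as 1, 2, 3 again; draining takeList from the front instead would have produced 2, 1, 3.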

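Flume's FileChannel keeps data complete and consistent across a system crash by relying on the JDK's byte channel, java.nio.channels.FileChannel. To make sure nothing is lost after a crash, the byte channel can force file modifications out to the underlying storage device.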
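As an illustration of forcing writes to storage, here is a minimal sketch using java.nio.channels.FileChannel.force(). It is not taken from Flume: ForceDemo and writeDurably are hypothetical names, and the temporary file exists only for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: write bytes, force them to the storage device, read them back.
final class ForceDemo {
    static String writeDurably(String text) {
        try {
            Path file = Files.createTempFile("force-demo", ".log");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                // true also forces metadata updates, not just the file content.
                ch.force(true);
            }
            String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the force() call the write may still sit in operating system buffers when the method returns, which is exactly the window a crash can exploit.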

 

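Finally, the on-disk layout of a Flume FileChannel:

The checkpoint directory:

[Figure: listing of the checkpoint directory]

checkpoint: records where each event lives, that is, which data file (logFileID) and at what offset.

inflighttakes: holds the pending data of take transactions; the file is rebuilt periodically.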

Contents:

1. The first 16 bytes are a checksum;

2. transactionID1+eventsCount1+eventPointer11+eventPointer12+...;

3. transactionID2+eventsCount2+eventPointer21+eventPointer22+...

inflightputs: holds the pending data of put transactions; the file is rebuilt periodically.

Contents:

1. The first 16 bytes are a checksum;

2. transactionID1+eventsCount1+eventPointer11+eventPointer12+...;

3. transactionID2+eventsCount2+eventPointer21+eventPointer22+...
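The in-flight file layout listed above can be sketched as a byte buffer. Everything specific in this example is an assumption made for illustration: MD5 as the 16-byte checksum, 8-byte fields for the transaction ID, the event count and each pointer, and a pointer packed as fileID in the high 32 bits and offset in the low 32 bits. InflightLayoutSketch is a hypothetical class, not Flume's serializer.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical encoder for one transaction entry: [16-byte checksum][txId][count][ptr...].
final class InflightLayoutSketch {
    // Packs fileID and offset into one long (assumed encoding for this sketch).
    static long pointer(int fileId, int offset) {
        return ((long) fileId << 32) | (offset & 0xFFFFFFFFL);
    }

    static int fileId(long ptr) {
        return (int) (ptr >>> 32);
    }

    static int offset(long ptr) {
        return (int) ptr;
    }

    // Serializes the body, then prefixes it with the MD5 digest of that body.
    static byte[] encode(long txId, long[] pointers) {
        ByteBuffer body = ByteBuffer.allocate(8 + 8 + 8 * pointers.length);
        body.putLong(txId).putLong(pointers.length);
        for (long p : pointers) {
            body.putLong(p);
        }
        byte[] bodyBytes = body.array();
        return ByteBuffer.allocate(16 + bodyBytes.length)
                .put(md5(bodyBytes))
                .put(bodyBytes)
                .array();
    }

    // Recomputes the digest over the body and compares it with the stored prefix.
    static boolean verify(byte[] file) {
        byte[] stored = Arrays.copyOfRange(file, 0, 16);
        byte[] body = Arrays.copyOfRange(file, 16, file.length);
        return Arrays.equals(stored, md5(body));
    }

    private static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A reader of such a file would verify the checksum first and discard the file on a mismatch, since a half-written in-flight file is worse than none.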

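checkpoint.meta: mainly stores the logFileIDs and the number of events in each, among other information.

The data directory:

[Figure: listing of the data directory]

log-ID.meta: mainly records the next write position in log-ID, the logWriteOrderID, and similar information.

log-ID: a data file. Data files the queue no longer references are deleted at checkpoint time, so a directory whose sinks keep up usually holds only about two of them.

The FileChannel implementation is fairly complex, so I'll stop here for now and dig deeper when the need arises.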