Cache (BlockCache)

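To improve the read performance of an HBase cluster, HBase caches HFile blocks in the BlockCache. Two implementations are provided: the on-heap LruBlockCache, and BucketCache, which usually uses off-heap memory. LruBlockCache is commonly called the L1 cache; it is enabled by default and should not be disabled. BucketCache is commonly called the L2 cache; enabling it requires setting parameters such as hbase.bucketcache.combinedcache.enabled, hbase.bucketcache.ioengine, and hbase.bucketcache.size.
LruBlockCache is the default cache and lives in the Java heap. BucketCache usually uses off-heap memory, but it can also be file-backed or use heap memory. Compared with LruBlockCache, BucketCache has somewhat higher read latency, but its latency is more stable because it triggers less GC activity: BucketCache manages its cached memory itself instead of leaving it to the garbage collector.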

Configuring LruBlockCache

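LruBlockCache is enabled by default and is sized by hfile.block.cache.size, a decimal between 0 and 1.0 that gives the fraction of the heap (heap-size) the cache may use. The MemStore and LruBlockCache together must not exceed 0.8, that is, hbase.regionserver.global.memstore.size + hfile.block.cache.size <= 0.8. At most 80% of the heap goes to these two, and the remaining 20% is left for other uses.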

<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value>
</property>
<!-- L1 read cache -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>
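The heap budget rule above comes down to simple arithmetic. The following is a minimal sketch of such a check, not HBase's own validation code: the class name HeapBudgetCheck and its method are hypothetical, and treating a sum of exactly 0.8 as acceptable (as in the configuration above) is an assumption.

```java
// Hypothetical helper; illustrates the heap budget rule, not HBase's actual code.
final class HeapBudgetCheck {
    // Upper bound for MemStore + BlockCache as a fraction of the heap (from the rule above).
    static final double MAX_COMBINED_FRACTION = 0.8;

    /** Returns true when the two fractions together stay within the 80% budget. */
    static boolean fitsInHeap(double memstoreFraction, double blockCacheFraction) {
        // Compare in whole percent to avoid floating-point rounding surprises.
        long combinedPercent = Math.round((memstoreFraction + blockCacheFraction) * 100);
        return combinedPercent <= Math.round(MAX_COMBINED_FRACTION * 100);
    }

    public static void main(String[] args) {
        System.out.println(fitsInHeap(0.4, 0.4)); // the configuration shown above
        System.out.println(fitsInHeap(0.5, 0.4)); // 90% of the heap, over budget
    }
}
```

Running a check like this before changing either parameter makes it clear that raising one of the two values means lowering the other.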

Configuring BucketCache

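Once BucketCache is configured, it works together with LruBlockCache by default. With both caches active, HBase runs in the combined L1+L2 mode and the cache as a whole is managed by CombinedBlockCache. Data blocks (the actual data) are stored in L2, while meta blocks, index blocks, and Bloom blocks are stored in L1.
To keep a table's column family out of the L2 cache, set CACHE_DATA_IN_L1 to true with a shell command, or call HColumnDescriptor.setCacheDataInL1(true) in code.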

create 't', {NAME => 't', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}
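The placement rules described above can be modeled in a few lines. This is only an illustrative sketch: CombinedCacheRouting, BlockKind, and Tier are hypothetical names and not HBase classes, and the real CombinedBlockCache does considerably more.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the L1/L2 split; names are illustrative, not HBase classes.
final class CombinedCacheRouting {
    enum BlockKind { DATA, META, INDEX, BLOOM }
    enum Tier { L1_LRU_ON_HEAP, L2_BUCKET_CACHE }

    // Block kinds that stay in the on-heap L1 cache in combined mode.
    private static final Set<BlockKind> L1_KINDS =
            EnumSet.of(BlockKind.META, BlockKind.INDEX, BlockKind.BLOOM);

    /** Picks the tier for a block; cacheDataInL1 mirrors the CACHE_DATA_IN_L1 setting. */
    static Tier tierFor(BlockKind kind, boolean cacheDataInL1) {
        if (L1_KINDS.contains(kind) || cacheDataInL1) {
            return Tier.L1_LRU_ON_HEAP;
        }
        return Tier.L2_BUCKET_CACHE;
    }

    public static void main(String[] args) {
        // Print where each block kind lands with the default setting.
        for (BlockKind kind : BlockKind.values()) {
            System.out.println(kind + " -> " + tierFor(kind, false));
        }
    }
}
```

With cacheDataInL1 set to false only DATA blocks go to L2; setting it to true sends every block kind to L1.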

BucketCache supports three storage modes: heap memory (on-heap), off-heap memory (off-heap), and file (file). The following shows the off-heap configuration:

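1. Edit hbase-env.sh and set the off-heap memory size. It must be larger than the hbase.bucketcache.size configured in step 2, so size the two values together for your hardware.
HBASE_OFFHEAPSIZE=16G
2. Edit hbase-site.xml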

<!-- L2 read cache -->
<!-- Enables the L2 cache; this parameter is no longer used as of 2.x -->
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>true</value>
  <description>
    Whether or not the bucketcache is used in league with the LRU on-heap block cache. In this mode, indices and blooms are kept in the LRU blockcache and the data blocks are kept in the bucketcache
  </description>
</property>
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>34816</value>
  <description>
    A float that EITHER represents a percentage of total heap memory size to give to the cache (if less than 1.0) OR, it is the total capacity in megabytes of BucketCache. Default: 0.0
  </description>
</property>
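The description of hbase.bucketcache.size gives the value two meanings, which is easy to misread. The sketch below works through both cases; BucketCacheSize and toBytes are hypothetical names, the 32 GB heap is an assumed figure for illustration, and the logic follows the description text above rather than HBase's source code.

```java
// Hypothetical helper; mirrors the description of hbase.bucketcache.size, not HBase's own code.
final class BucketCacheSize {
    static final long MB = 1024L * 1024L;

    /** Resolves the configured value to bytes: below 1.0 it is a heap fraction, otherwise megabytes. */
    static long toBytes(double configured, long maxHeapBytes) {
        if (configured < 1.0) {
            return (long) (configured * maxHeapBytes);
        }
        return (long) (configured * MB);
    }

    public static void main(String[] args) {
        long heap = 32L * 1024 * MB; // assumed 32 GB heap, for illustration only
        System.out.println(toBytes(34816, heap) / MB + " MB"); // megabyte form, as configured above
        System.out.println(toBytes(0.25, heap) / MB + " MB");  // fraction form: a quarter of the heap
    }
}
```

The value 34816 used above is therefore read as megabytes, i.e. 34 GB of BucketCache.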

Read/write performance tuning

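Depending on the cluster environment and the actual workload, some parameters usually need adjusting before the cluster performs at its best.
Read-heavy workloads: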

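Decrease hbase.regionserver.global.memstore.size moderately so that the MemStore gets less memory
Increase hfile.block.cache.size moderately
Adjust hbase.hregion.memstore.flush.size as appropriate
Tune the compaction-related parameters as well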

Write-heavy workloads:

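Increase hbase.regionserver.global.memstore.size moderately so that the MemStore gets more memory
Decrease hfile.block.cache.size moderately
Increase the client-side write buffer size
Disable the WAL if durability is not a concern (writes not yet flushed to disk are lost if a RegionServer crashes)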

GC tuning

HBASE_MASTER_JAVA_OPTS="-XX:MaxPermSize=256m -XX:SurvivorRatio=2 -XX:+UseParNewGC -XX:ParallelGCThreads=12 -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=16 -XX:+CMSParallelRemarkEnabled -XX:MaxTenuringThreshold=15 -XX:+UseCMSCompactAtFullCollection -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:-DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"
HBASE_REGIONSERVER_JAVA_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+UnlockExperimentalVMOptions  -XX:G1NewSizePercent=8 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ParallelRefProcEnabled -XX:-ResizePLAB -XX:ConcGCThreads=4 -XX:ParallelGCThreads=16 -XX:MaxTenuringThreshold=1 -XX:G1HeapRegionSize=32m -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy"

Batch writes in Java

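import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.util.Bytes;

// connection (Connection), tablename (String) and puts (List<Put>) are defined by the surrounding code.
// The listener is called for puts that still fail after the client has exhausted its retries.
final BufferedMutator.ExceptionListener listener = (e, mutator) -> {
    for (int i = 0; i < e.getNumExceptions(); i++) {
        System.out.println("Failed to send put <<" + Bytes.toStringBinary(e.getRow(i).getRow()) + ">> to HBase...");
    }
};
BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf(tablename))
        .listener(listener)
        .writeBufferSize(10 * 1024 * 1024); // 10 MB client-side write buffer

// try-with-resources closes the mutator, so it is released even if mutate() or flush() throws.
try (BufferedMutator mutator = connection.getBufferedMutator(params)) {
    mutator.mutate(puts);
    mutator.flush();
} catch (IOException e) {
    e.printStackTrace();
}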