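I. Reference link for Hive-HBase integration:
Link:
II. Configuring the Hive Metastore
2. Configure the Metastore
Hadoop configuration
Change to the Hadoop configuration directory
# cd $HADOOP_HOME/etc/hadoop
# vim core-site.xml
# Add the following configuration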
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Note: if you run as the hadoop user instead of root, name the properties hadoop.proxyuser.hadoop.hosts and hadoop.proxyuser.hadoop.groups.
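The two property names differ only in the user segment, so they can be generated for whichever account runs HiveServer2. A minimal sketch, assuming a hypothetical helper named proxyuser_props (it is not part of Hadoop) whose output is pasted inside the <configuration> element of core-site.xml:

```shell
# Hypothetical helper: print the two proxyuser properties for a given user.
# The wildcard values mirror the configuration shown above.
proxyuser_props() {
  user="$1"
  for key in hosts groups; do
    printf '<property>\n<name>hadoop.proxyuser.%s.%s</name>\n<value>*</value>\n</property>\n' "$user" "$key"
  done
}

# Example: properties for a cluster operated by the "hadoop" user
proxyuser_props hadoop
```

The wildcard * allows impersonation from any host and for any group, which is convenient on a test cluster; a stricter value is probably preferable elsewhere.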
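Hive configuration
Change to the Hive configuration directory
Option without a username and password (use this one when connecting to HBase)
# cd $HIVE_HOME/conf
# vim hive-site.xml
# Add the following content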
<property>
<name>hive.metastore.uris</name>
<value>thrift://h1:9083</value>
</property>
<!--
To connect to HBase, the username and password that beeline uses for hiveserver2 must be disabled (a temporary setting for now)
-->
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>
If true: HiveServer2 executes statements as the user who submitted them
If false: statements are executed as the admin user that runs the HiveServer2 daemon
</description>
</property>
Option with a username and password (not for connecting to HBase)
# cd $HIVE_HOME/conf
# vim hive-site.xml
# Add the following content
<property>
<name>hive.metastore.uris</name>
<value>thrift://h1:9083</value>
</property>
<!-- Set a username and password for hiveserver2, so that the real MySQL username and password are not exposed externally -->
<property>
<name>hive.server2.thrift.client.user</name>
<value>root</value>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>root</value>
</property>
Configuration is now complete.
III. Startup
1. Start the Metastore
# Start in the foreground
# hive --service metastore
2019-03-28 20:40:45: Starting Hive Metastore Server
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/software/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/software/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
# Or start in the background
# hive --service metastore &
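A process started with a bare & still writes to the console and may be stopped when the terminal closes. One common pattern, sketched here with a hypothetical helper start_bg and an assumed log path, is to use nohup and redirect the output to a file:

```shell
# Hypothetical helper: run a command detached from the terminal,
# send stdout and stderr to a log file, and print the background PID.
start_bg() {
  log="$1"
  shift
  nohup "$@" > "$log" 2>&1 &
  echo $!
}

# Usage (uncomment where hive is on the PATH; the log path is an assumption):
# start_bg /tmp/metastore.log hive --service metastore
```

The printed PID can be kept to stop the service later with kill.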
2. Start hiveserver2
# hive --service hiveserver2
# Or (the environment variables are already configured, and Hive's bin directory contains a hiveserver2 script)
# hiveserver2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/software/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/software/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 83b12586-6bbe-4222-aa75-0a5df1da9df3
Hive Session ID = 04106346-c9b8-47ed-9be4-3c4c7dcc8804
Hive Session ID = 2f8adbe5-9f55-4a5d-84f5-cfd1d8865260
Hive Session ID = 111e0ec4-5137-4397-9f5b-cca045226581
OK
Note: hiveserver2 is slow to start. Use the following command to check whether it is listening on port 10000:
# netstat -anp|grep 10000
tcp 0 0 192.168.247.11:39554 192.168.247.11:10000 ESTABLISHED 31370/java
tcp6 0 0 :::10000 :::* LISTEN 31087/java
tcp6 0 0 192.168.247.11:10000 192.168.247.11:39554 ESTABLISHED 31087/java
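Rather than rerunning netstat by hand, the check can be polled until the LISTEN line appears. A minimal sketch, assuming a hypothetical helper wait_for_listen and that netstat is installed; the timeout value is an arbitrary choice:

```shell
# Hypothetical helper: poll netstat until something listens on the port,
# or give up after the given number of seconds.
wait_for_listen() {
  port="$1"
  timeout="$2"
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if netstat -an 2>/dev/null | grep -Eq "[:.]${port}[[:space:]].*LISTEN"; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Usage (10000 is the port shown above; the 120-second limit is an assumption):
# wait_for_listen 10000 120 && echo "hiveserver2 is listening"
```

The function returns 0 as soon as the port is in the LISTEN state and 1 if the timeout expires first, so it can gate the beeline step in a script.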
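3. Connect with beeline
# The Hive environment variables are already configured, so the beeline command can be used directly
# Alternatively, change to Hive's bin directory and run ./beeline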
# beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/software/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/software/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.1 by Apache Hive
beeline>
# Connect over JDBC. You are prompted for a username and password; since the configuration above requires neither, just press Enter at each prompt
beeline> !connect jdbc:hive2://h1:10000
Connecting to jdbc:hive2://h1:10000
Enter username for jdbc:hive2://h1:10000:
Enter password for jdbc:hive2://h1:10000:
Connected to: Apache Hive (version 3.1.1)
Driver: Hive JDBC (version 3.1.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://h1:10000>
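For scripts, the same connection can be made without the interactive prompts. A sketch using beeline's -u (JDBC URL), -n (username) and -e (query) options, with the host h1 and port 10000 from above; the user name root is an assumption based on the account used in this walkthrough:

```shell
# JDBC URL of hiveserver2, taken from the !connect command above
HS2_URL="jdbc:hive2://h1:10000"

if command -v beeline >/dev/null 2>&1; then
  # Connect, run one statement, and exit
  beeline -u "$HS2_URL" -n root -e 'show tables;'
else
  echo "beeline not found on PATH; skipping"
fi
```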
4. Test
Enter the SQL statement: show tables;
The result is as follows:
0: jdbc:hive2://h1:10000> show tables;
INFO : Compiling command(queryId=root_20190328204908_23db4e7d-bd3d-4023-b403-90d42b79b8db): show tables
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=root_20190328204908_23db4e7d-bd3d-4023-b403-90d42b79b8db); Time taken: 2.659 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20190328204908_23db4e7d-bd3d-4023-b403-90d42b79b8db): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=root_20190328204908_23db4e7d-bd3d-4023-b403-90d42b79b8db); Time taken: 0.119 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+----------------+
| tab_name |
+----------------+
| hbase_table_1 |
| spark_test |
| test |
+----------------+
3 rows selected (3.812 seconds)
Here, test is an HBase external table. For how to create an HBase external table, see the reference link:
Test whether Hive can connect to HBase
0: jdbc:hive2://h1:10000> select * from test;
INFO : Compiling command(queryId=root_20190328204920_ae673cf6-4b71-4976-a98d-6dffcf56b486): select * from test
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:test.key, type:string, comment:null), FieldSchema(name:test.id, type:int, comment:null), FieldSchema(name:test.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=root_20190328204920_ae673cf6-4b71-4976-a98d-6dffcf56b486); Time taken: 8.094 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=root_20190328204920_ae673cf6-4b71-4976-a98d-6dffcf56b486): select * from test
INFO : Completed executing command(queryId=root_20190328204920_ae673cf6-4b71-4976-a98d-6dffcf56b486); Time taken: 0.001 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+-----------+----------+------------+
| test.key | test.id | test.name |
+-----------+----------+------------+
| 1 | 1 | wx |
| 2 | 2 | user |
+-----------+----------+------------+
2 rows selected (43.231 seconds)
0: jdbc:hive2://h1:10000>
The setup is now complete.
IV. Summary
1. Exception 1
Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Sat Jun 17 16:09:48 CST 2017, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68116: row 'hhh_tj_atmosphere_history,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=slave3,16020,1497683020356, seqNum=0
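This exception means that the connection to HBase failed.
Solution:
# When beeline queries HBase, set this to false
# By default (true), HiveServer2 executes queries as the user who submitted them; if hive.server2.enable.doAs is set to false, queries run as the user that runs the hiveserver2 process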
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
2. Exception 2
User: root is not allowed to impersonate anonymous (state=08S01,code=0)
This exception is reported when hiveserver2 starts.
Solution:
# Add the following to core-site.xml in the Hadoop configuration directory
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
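# Note: if the message names a different user (User: XXX ...), replace root in the property names with that user. For example, for hadoop:
# Add the following to core-site.xml in the Hadoop configuration directory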
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
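The proxyuser settings in core-site.xml only take effect once the NameNode and ResourceManager have reloaded them. Restarting Hadoop is one way; as a sketch, assuming the cluster is running and the hdfs and yarn commands are on the PATH, the settings can also be reloaded without a restart (refresh_proxyusers is a hypothetical wrapper name):

```shell
# Hypothetical wrapper: reload hadoop.proxyuser.* settings on the
# NameNode and the ResourceManager without restarting them.
refresh_proxyusers() {
  if command -v hdfs >/dev/null 2>&1 && command -v yarn >/dev/null 2>&1; then
    hdfs dfsadmin -refreshSuperUserGroupsConfiguration &&
      yarn rmadmin -refreshSuperUserGroupsConfiguration
  else
    echo "hdfs/yarn not found on PATH; restart Hadoop after editing core-site.xml"
    return 1
  fi
}

# Usage (uncomment on a cluster node after saving core-site.xml):
# refresh_proxyusers
```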
Remark: if you run into other problems, check the log output carefully.