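This post records how to install Hive and how to use its three common interaction modes, drawing on reference blog posts and on Lao Wang. MySQL must already be installed and the Hadoop cluster configured.

Versions used:

(1) MySQL: 5.7.28

(2) Hadoop: 2.6.0-cdh5.14.2

(3) Hive: 1.1.0-cdh5.14.2

Hive installation

Installing Hive involves four things: installing MySQL, configuring Hadoop, configuring the files in Hive's conf directory, and configuring the Hive log path.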

Installing MySQL

Installing Hadoop

Installing Hive

(1) Download the Hive package from http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz

# Download with wget into the current directory, /kkb/install
[hadoop@node01 /kkb/install]$ wget http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz

(2) Extract the package to the target directory

# Extract into the installation directory
tar -zxvf hive-1.1.0-cdh5.14.2.tar.gz -C /kkb/install/
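
Before extracting, it can help to confirm what the archive will unpack to. The sketch below is not part of the original steps: `top_level_dir` is a hypothetical helper name, and it assumes the archive stores its entries as `name/...` rather than `./name/...`.

```shell
# Hypothetical helper: print the distinct top-level entries of a .tar.gz archive.
# Assumes entries are stored as "name/..." and not "./name/...".
top_level_dir() {
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Example (run in /kkb/install after the download):
# top_level_dir hive-1.1.0-cdh5.14.2.tar.gz
```

If this prints a single name, extracting with `-C /kkb/install/` creates exactly that one directory under /kkb/install/.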

(3) Edit hive/conf/hive-env.sh

Go to the conf directory under the Hive installation directory and edit hive-env.sh to set the HADOOP_HOME and HIVE_CONF_DIR paths.

# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/kkb/install/hadoop-2.6.0-cdh5.14.2/

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/kkb/install/hive-1.1.0-cdh5.14.2/conf
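
A typo in either path usually only surfaces later as a confusing startup error, so a quick existence check can save time. This is a minimal sketch, not something Hive provides: `check_dirs` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if every directory passed as an argument exists.
# Missing directories are reported on stderr.
check_dirs() {
    rc=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example, using the two paths configured in hive-env.sh:
# check_dirs /kkb/install/hadoop-2.6.0-cdh5.14.2/ /kkb/install/hive-1.1.0-cdh5.14.2/conf
```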

(4) Create hive/conf/hive-site.xml with vim

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://node03:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=latin1&amp;useSSL=false</value>
        </property>

        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>123456</value>
        </property>
        <property>
                <name>hive.cli.print.current.db</name>
                <value>true</value>
        </property>
        <property>
                <name>hive.cli.print.header</name>
                <value>true</value>
        </property>
        <property>
                <name>hive.server2.thrift.bind.host</name>
                <value>node01.kaikeba.com</value>
        </property>
</configuration>
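
One detail worth checking in this file: inside XML, a literal `&` in a value such as the JDBC URL has to be written as `&amp;`, otherwise the file is not well-formed. The following is a minimal sketch for producing the escaped form in the shell. `xml_escape_amp` is a hypothetical helper name, and it only handles `&`, not `<` or `>`.

```shell
# Hypothetical helper: escape every "&" in a string as "&amp;" for use inside an XML value.
# In the sed replacement, "\&" is a literal ampersand.
xml_escape_amp() {
    printf '%s\n' "$1" | sed 's/&/\&amp;/g'
}

# Example with the query-string part of the metastore URL:
xml_escape_amp 'createDatabaseIfNotExist=true&characterEncoding=latin1&useSSL=false'
```

The printed string is what belongs between `<value>` and `</value>`. The JDBC driver still receives plain `&` separators once the XML is parsed.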

(5) Edit hive/conf/hive-log4j.properties

Copy hive/conf/hive-log4j.properties.template to hive-log4j.properties and set the directory where Hive stores its log files.

# Directory where the log files are stored
hive.log.dir=/kkb/install/hive-1.1.0-cdh5.14.2/logs/
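
The same change can be scripted instead of made by hand. This is a sketch under stated assumptions: `set_property` is a hypothetical helper, the key is used as a regular expression (so `.` matches any character), and the value must not contain `|` or `&`, which is true for an ordinary path.

```shell
# Hypothetical helper: set key=value in a Java-style properties file.
# Replaces an existing "key=" line, or appends the line if the key is absent.
set_property() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example, run from the Hive conf directory:
# cp hive-log4j.properties.template hive-log4j.properties
# set_property hive-log4j.properties hive.log.dir /kkb/install/hive-1.1.0-cdh5.14.2/logs/
```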

(6) Copy the MySQL JDBC driver jar into Hive's lib directory.

[hadoop@node01 /kkb/soft]$ cp mysql-connector-java-5.1.38.jar /kkb/install/hive-1.1.0-cdh5.14.2/lib/

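That completes the Hive installation. Next, test it with each of the three interaction modes.

Hive interaction modes

Hive has three interaction modes. The Hadoop cluster must be started before using any of them.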
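
Whether the cluster is up can be checked from `jps` output before launching Hive. The sketch below is an illustration only: `missing_daemons` is a hypothetical helper, and the expected daemon list assumes a node that runs NameNode, DataNode, ResourceManager and NodeManager together. On a multi-node cluster these are usually spread across hosts, so the list needs adjusting per node.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# Hadoop daemon that is NOT listed. Prints nothing when all are present.
missing_daemons() {
    jps_output=$(cat)
    for name in NameNode DataNode ResourceManager NodeManager; do
        # Compare against the second column only, as a whole-line match,
        # so SecondaryNameNode does not count as NameNode.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$name"; then
            echo "$name"
        fi
    done
}

# Example:
# jps | missing_daemons
```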

Hive CLI

Run the hive/bin/hive script directly. As the WARNING in the output shows, this mode is deprecated and generally not recommended.

[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 16:55:28,709 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 16:55:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
# The Hive CLI is not recommended
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
# List the databases
hive> show databases;
OK
default
Time taken: 8.535 seconds, Fetched: 1 row(s)
hive>

In addition, both the local file system and HDFS can be inspected directly from the Hive CLI prompt.

[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:24:31,509 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:24:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
# List the local file system
hive> !ls /kkb/install;
hadoop-2.6.0-cdh5.14.2
hbase-1.2.0-cdh5.14.2
hive-1.1.0-cdh5.14.2
hive-1.1.0-cdh5.14.2.tar.gz
hive.sql
jdk1.8.0_181
zookeeper-3.4.5-cdh5.14.2
# List the HDFS file system
hive> dfs -ls /;
Found 8 items
-rw-r--r--   3 hadoop supergroup      13612 2020-03-09 10:42 /dataskew.txt
drwxr-xr-x   - root   supergroup          0 2020-03-09 10:48 /dataskewOutput
drwxr-xr-x   - hadoop supergroup          0 2020-03-09 16:13 /ncdcDataWithTotalOrder
drwxr-xr-x   - root   supergroup          0 2020-03-09 15:35 /ncdcDataclean
drwxr-xr-x   - hadoop supergroup          0 2020-03-09 15:32 /ncdcdata
-rw-r--r--   3 root   supergroup     212005 2020-03-09 14:21 /sequencefile
drwx------   - hadoop supergroup          0 2020-09-30 16:52 /tmp
drwx------   - hadoop supergroup          0 2020-09-30 17:19 /user
hive>

beeline

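To connect with beeline, HiveServer2 must be running first. Once started, it shows up in jps as RunJar. This mode is the more common one and is used heavily during development.

# Started in the background here; to run it in the foreground, use ./hive --service hiveserver2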
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup ./hive --service hiveserver2 &
[1] 10103
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup: ignoring input and appending output to ‘nohup.out’
# A RunJar entry means HiveServer2 is running; its PID is 10103
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ jps
8705 ResourceManager
8228 NameNode
10103 RunJar
8476 SecondaryNameNode
8812 NodeManager
8334 DataNode
10271 Jps

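Connect to HiveServer2 with beeline. If HiveServer2 was started in the background as above, connect from the same terminal. If it was started in the foreground, open a second terminal to connect.

# Start beeline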
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./beeline 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:07:31,569 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Beeline version 1.1.0-cdh5.14.2 by Apache Hive
# Connect over JDBC
beeline> !connect jdbc:hive2://node01:10000
scan complete in 2ms
Connecting to jdbc:hive2://node01:10000
# Press Enter; no username is needed
Enter username for jdbc:hive2://node01:10000: 
# Press Enter; no password is needed
Enter password for jdbc:hive2://node01:10000: 
Connected to: Apache Hive (version 1.1.0-cdh5.14.2)
Driver: Hive JDBC (version 1.1.0-cdh5.14.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
# List the databases
0: jdbc:hive2://node01:10000> show databases;
INFO  : Compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 1.72 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 0.157 seconds
INFO  : OK
+----------------+--+
| database_name  |
+----------------+--+
| default        |
+----------------+--+
1 row selected (8.47 seconds)
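
The interactive `!connect` step can also be folded into a single command with beeline's `-u` (JDBC URL), `-n` (username) and `-e` (statement) options. The wrapper below is a sketch: `run_hql` and the variables `BEELINE`, `HS2_HOST`, `HS2_PORT` and `HS2_USER` are names invented for this example, and the default user `hadoop` is an assumption based on the OS user shown in the prompts above.

```shell
# Hypothetical wrapper: run one HQL statement through beeline non-interactively.
# BEELINE, HS2_HOST, HS2_PORT and HS2_USER are illustrative names; defaults
# follow the host and port used in this post, and the user is an assumption.
run_hql() {
    "${BEELINE:-beeline}" \
        -u "jdbc:hive2://${HS2_HOST:-node01}:${HS2_PORT:-10000}" \
        -n "${HS2_USER:-hadoop}" \
        -e "$1"
}

# Example (requires a running HiveServer2 and beeline on the PATH):
# run_hql "show databases;"
```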

The hive command

(1) hive -e "hql"

This form executes an HQL statement directly.

# Query the databases directly
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:15:00,248 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:15:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
default
Time taken: 7.221 seconds, Fetched: 1 row(s)

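(2) hive -f <SQL script file>

Once development is finished, the HQL is typically written to a script file and executed this way.

Create a script under /kkb/install with vim (here /kkb/install/hive.sql) with the following content.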

create database if not exists youngchaolin_hive;
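
As an alternative to editing with vim, the script can be written from the shell with a here-document. A minimal sketch; `write_hql_script` is a hypothetical helper name, and the quoted `'EOF'` delimiter keeps the shell from expanding anything inside the HQL.

```shell
# Hypothetical helper: write the one-line HQL script to the given path.
# The quoted 'EOF' disables shell expansion inside the here-document.
write_hql_script() {
    cat > "$1" <<'EOF'
create database if not exists youngchaolin_hive;
EOF
}

# Example:
# write_hql_script /kkb/install/hive.sql
```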

Then run the script with hive -f.

# Run the script; the file does not need to be executable
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -f /kkb/install/hive.sql 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:19:48,072 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:19:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
Time taken: 5.016 seconds
# Check the result: the database was created successfully
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:20:12,228 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:20:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
default
youngchaolin_hive
Time taken: 4.062 seconds, Fetched: 2 row(s)
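
The same check can be scripted rather than read off the screen. This sketch assumes the database names arrive one per line on stdin. `db_exists` is a hypothetical helper, and `-S` is the Hive CLI's silent mode, which suppresses the `OK` and `Time taken` lines.

```shell
# Hypothetical helper: succeed if the given database name appears as a
# whole line on stdin (fixed-string match, no regular expressions).
db_exists() {
    grep -qxF -- "$1"
}

# Example (stderr is discarded to drop the SLF4J and NativeCodeLoader warnings):
# ./hive -S -e "show databases;" 2>/dev/null | db_exists youngchaolin_hive && echo "found"
```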

By default, the new database is stored in HDFS at /user/hive/warehouse/youngchaolin_hive.db.


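The above reflects my own understanding and may not be entirely correct. Learning is a process of continually discovering and correcting mistakes, so corrections are welcome.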