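Environment preparation
1. HDFS is running
2. YARN is running
3. MySQL is running (it stores the Hive metadata)
To allow remote connections, settings like the following can be used (MySQL 5.7 syntax; relaxing the password policy is only suitable for a test environment):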
mysql> set global validate_password_policy=0;
mysql> set global validate_password_length=1;
mysql> grant all privileges on *.* to 'root'@'%' identified by 'your_password' with grant option;
mysql> flush privileges;
MySQL can also be set to start at boot: chkconfig mysqld on (on systemd-based systems: systemctl enable mysqld)
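The three prerequisites above can be checked before going further. The sketch below is a hypothetical helper, not part of Hive or Hadoop: it reads `jps` output and reports which Hadoop daemons are absent. It assumes a single-node setup where NameNode, DataNode, ResourceManager and NodeManager all run on the same machine; on a real cluster the list would differ per node.

```shell
# Print the required Hadoop daemons that are absent from `jps` output on stdin.
# The function body runs in a subshell, so its variables stay local.
missing_daemons() (
  required="NameNode DataNode ResourceManager NodeManager"
  seen="$(awk '{print $2}')"   # second column of jps output is the process name
  missing=""
  for d in $required; do
    # -x matches whole lines, so SecondaryNameNode does not count as NameNode
    if ! printf '%s\n' "$seen" | grep -qx "$d"; then
      missing="$missing $d"
    fi
  done
  printf '%s\n' "${missing# }"
)

# Usage (not run here):
# jps | missing_daemons
```

An empty line of output means all four daemons were found.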
Installing Hive
1. Upload and extract the installation package
tar -zxf <archive> -C <target directory>
2. Edit the configuration files
hive-env.sh:
cp hive-env.sh.template hive-env.sh
vi hive-env.sh
export HADOOP_HOME=/opt/app/hadoop-2.8.5/
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/app/hive-2.3.1/conf/
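The hive-env.sh edit can also be scripted instead of done in vi. This is a minimal sketch; `write_hive_env` is a hypothetical helper name, and the paths in the usage line are the ones used in these notes.

```shell
# Create hive-env.sh from the template and append the two exports.
# $1 = Hive conf directory, $2 = HADOOP_HOME, $3 = HIVE_CONF_DIR
write_hive_env() {
  cp "$1/hive-env.sh.template" "$1/hive-env.sh"
  {
    echo "export HADOOP_HOME=$2"
    echo "export HIVE_CONF_DIR=$3"
  } >> "$1/hive-env.sh"
}

# Usage (not run here):
# write_hive_env /opt/app/hive-2.3.1/conf /opt/app/hadoop-2.8.5/ /opt/app/hive-2.3.1/conf/
```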
hive-site.xml (create this file in the conf directory)
<configuration>
<!-- Hive metadata is stored in MySQL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://linux01:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- MySQL user name and password -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/tmp</value>
</property>
<property>
<name>hive.querylog.location</name>
<value>/user/hive/log</value>
</property>
<!-- Port for remote client connections -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>hive.server2.webui.host</name>
<value>0.0.0.0</value>
</property>
<!-- Port of the HiveServer2 web UI -->
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>
<property>
<name>hive.server2.long.polling.timeout</name>
<value>5000</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>mr</value>
</property>
</configuration>
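Because hive-site.xml is XML, a literal & inside a value (for example between parameters of the JDBC URL) must be written as &amp;amp;, otherwise the file is not well-formed and cannot be parsed. The sketch below is a hypothetical check, assuming only sed and grep are available: it prints every line that still contains a raw ampersand.

```shell
# Print the lines of an XML file that contain a raw '&' not starting an entity.
# $1 = path to the XML file. Exit status is non-zero when nothing is found.
find_raw_ampersands() {
  # Remove well-formed entity references first, then report any '&' left over.
  sed -E 's/&(amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);//g' "$1" | grep -n '&'
}

# Usage (not run here):
# find_raw_ampersands /opt/app/hive-2.3.1/conf/hive-site.xml
```

No output means no unescaped ampersand was found; the printed lines have their valid entities stripped, so they are for locating the problem, not for copying back.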
/opt/app/hadoop-2.8.5/etc/hadoop/core-site.xml
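Edit the Hadoop configuration so that Hive is allowed to operate on HDFS files (dfs.permissions.enabled is an HDFS setting that is conventionally placed in hdfs-site.xml; disabling permission checks is only suitable for a test cluster):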
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
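To confirm that Hadoop actually picks up the proxy-user settings, the values can be read back with `hdfs getconf -confKey`. The wrapper below is a hypothetical convenience function; it assumes the `hdfs` command is on the PATH and that the proxy user is root, as in these notes.

```shell
# Print the proxy-user settings Hadoop resolves for a user (default: root).
show_proxyuser() {
  user=${1:-root}
  for suffix in hosts groups; do
    key="hadoop.proxyuser.$user.$suffix"
    # hdfs getconf -confKey prints the configured value of one key
    printf '%s=%s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}

# Usage (not run here):
# show_proxyuser root
```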
3. Copy the MySQL JDBC driver JAR into Hive's lib directory
4. Restart Hadoop (HDFS and YARN) so that the new settings take effect
stop-all.sh
start-all.sh
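5. Initialize the Hive metastore database (run from Hive's bin directory)
./schematool -initSchema -dbType mysql
After it succeeds, MySQL contains a new database (named hive here, as given in the ConnectionURL).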
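Running -initSchema a second time against an already initialized database fails, so a guard can be useful in scripts. The sketch below is one possible approach, not the official procedure: it assumes `schematool` is on the PATH and that `schematool -dbType mysql -info` exits with a non-zero status when no schema exists yet.

```shell
# Initialize the metastore schema only if schematool cannot find one.
init_metastore_schema() {
  if schematool -dbType mysql -info > /dev/null 2>&1; then
    echo "schema already present"
  else
    schematool -dbType mysql -initSchema
  fi
}

# Usage (not run here):
# init_metastore_schema
```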
6. Configure the Hive environment variables
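The notes do not show the variables themselves. A common choice is to set HIVE_HOME and add its bin directory to PATH; the sketch below does that idempotently. The helper name and the use of /etc/profile are assumptions; the install path is the one used for HIVE_CONF_DIR earlier.

```shell
# Append HIVE_HOME and PATH exports to a profile file, unless already present.
# $1 = profile file, $2 = Hive install directory
add_hive_to_profile() {
  grep -q 'HIVE_HOME=' "$1" 2>/dev/null && return 0   # already configured
  {
    echo "export HIVE_HOME=$2"
    # single quotes keep $PATH and $HIVE_HOME unexpanded in the profile
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$1"
}

# Usage (not run here):
# add_hive_to_profile /etc/profile /opt/app/hive-2.3.1
# source /etc/profile
```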
7. Start Hive
hive
Connecting
1. Connect directly with the local client
hive
2. Connect remotely through HiveServer2
- Start hiveserver2
Foreground: hiveserver2
Background: hiveserver2 &
- Connect with beeline
[root@linux01 ~]# beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apps/hive-2.3.1/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/apps/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.1 by Apache Hive
beeline> !connect jdbc:hive2://linux01:10000
Connecting to jdbc:hive2://linux01:10000
Enter username for jdbc:hive2://linux01:10000: root
Enter password for jdbc:hive2://linux01:10000: (press Enter)
Connected to: Apache Hive (version 2.3.1)
Driver: Hive JDBC (version 2.3.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://linux01:10000>
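The interactive session above can also be run non-interactively, which is handy for a quick connectivity test. This sketch uses beeline's -u (JDBC URL), -n (user name) and -e (statement) options; the helper names `hs2_url` and `run_hql` are hypothetical, and the host, port and user are the ones from the session above.

```shell
# Build the HiveServer2 JDBC URL; the port defaults to 10000.
hs2_url() { printf 'jdbc:hive2://%s:%s\n' "$1" "${2:-10000}"; }

# Run one statement through beeline without the interactive prompt.
# $1 = host, $2 = port, $3 = user name, $4 = HiveQL statement
run_hql() { beeline -u "$(hs2_url "$1" "$2")" -n "$3" -e "$4"; }

# Usage (not run here):
# run_hql linux01 10000 root 'show databases;'
```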
Command to start the metastore service: hive --service metastore
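A service started with a plain & may stop when the terminal session closes. One common workaround is nohup with output redirected to a log file. The sketch below is a hypothetical helper under that assumption; the log directory in the usage lines is arbitrary.

```shell
# Start a long-running command in the background, detached from the terminal.
# $1 = directory for log and pid files, $2 = service name, rest = the command
start_bg() {
  run_dir=$1
  name=$2
  shift 2
  nohup "$@" > "$run_dir/$name.log" 2>&1 < /dev/null &
  echo $! > "$run_dir/$name.pid"   # remember the PID so the service can be stopped later
}

# Usage (not run here):
# start_bg /tmp metastore   hive --service metastore
# start_bg /tmp hiveserver2 hiveserver2
```

The service can later be stopped with kill "$(cat /tmp/hiveserver2.pid)".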