Background:

     As a newcomer, I recently started getting into Hive in the Hadoop ecosystem, and so began my rambling Hive learning journey.

Creating Tables:

     Hive tables come in two kinds: managed (internal) tables, created without the EXTERNAL keyword, and external tables, created with it.

Data Preparation:

  1. Create a data file with vi /home/hive/mytest01.txt, with the following content:

1,xiaoming,book-TV-code,beijing:chaoyang-shanghai:pudong

  2. Upload the data prepared above to HDFS with hdfs dfs -put /home/hive/mytest01.txt /tmp/hive/


  3. You can view the uploaded file in HDFS with hdfs dfs -cat /tmp/hive/mytest01.txt

Creating the Tables:

 1. You can use the beeline tool to work with Hive (Beeline has shipped with Hive since HiveServer2 was introduced in Hive 0.11):

Usage: java org.apache.hive.cli.beeline.BeeLine 
   -u <database url>               the JDBC URL to connect to
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --headerInterval=ROWS;          the interval between which headers are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side
   --addlocaldrivername=DRIVERNAME Add driver name that needs to be supported in the beeline client side
   --help                          display this message

2. Use the command beeline -r -n hive -p hive to enter the Hive CLI. (Since the JDBC URL is tedious to remember, you can run beeline with the full URL once and use this shorter command afterwards: the -r flag reconnects with the URL from the last connection.) A first connection might look like the sketch below.
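
For reference, a first-time connection usually passes the HiveServer2 JDBC URL explicitly (the host, port, and database below are the common defaults and an assumption; adjust them for your cluster):

    beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p hive

Once this has succeeded, beeline -r can reuse the saved URL.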


Creating the Managed (Internal) Table:

1. Create the table:

    create table mytable(
      id int,
      name string,
      hobby array<string>,
      add map<string,string>
    )
    row format delimited
      fields terminated by ','
      collection items terminated by '-'
      map keys terminated by ':'
    STORED AS TEXTFILE;


2. Query it with select * from mytable; (at this point the table is still empty)


3. Load the data with load data inpath '/tmp/hive/mytest01.txt' OVERWRITE INTO TABLE mytable;

4. After the load succeeds, query again as in step 2 (individual array and map elements can be queried too; see the sketch after this list)


 5. View the table details with desc formatted mytable;
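
Since hobby is an array and add is a map, individual elements can be pulled out with index and key syntax (a quick sketch; the key 'beijing' comes from the sample row prepared earlier):

    select name, hobby[0], `add`['beijing'] from mytable;
    -- expected from the sample row: xiaoming  book  chaoyang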


Creating the External Table:

1. Create the table:

    create external table mytable4(
      id int,
      name string,
      hobby array<string>,
      add map<string,string>
    )
    row format delimited
      fields terminated by ','
      collection items terminated by '-'
      map keys terminated by ':'
    location '/tmp/hive/mytabl';

2. Load the data with load data inpath '/tmp/hive/mytest01.txt' OVERWRITE INTO TABLE mytable4; (if the earlier load already moved this file, re-upload it first; see Problem 2 below)


3. The external table's data is stored in HDFS at the location given in the DDL; see the listing sketch after these steps.


4. View the details with desc formatted mytable4;
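
To confirm where the data lives, list the external location (a quick sketch; the path is the one from the location clause above):

    hdfs dfs -ls /tmp/hive/mytabl
    # the loaded mytest01.txt should show up here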

 


Differences:

A table without the EXTERNAL modifier is a managed (internal) table; one with EXTERNAL is an external table. The differences:
Managed table data is managed by Hive itself; external table data is managed by HDFS.
Managed table data is stored under hive.metastore.warehouse.dir (default: /user/hive/warehouse); the storage location of external table data is specified by you.
Dropping a managed table deletes both the metadata and the stored data; dropping an external table deletes only the metadata, and the files on HDFS are left untouched.
Changes to a managed table are synced to the metastore directly, while after changing an external table's structure or partitions you need to run a repair (MSCK REPAIR TABLE table_name;), as sketched below.
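
For instance, if a partition directory is added to an external partitioned table's location by hand, the metastore won't see it until a repair runs (a hypothetical sketch; the table and paths are made up for illustration):

    -- hypothetical partitioned external table
    create external table logs(line string)
      partitioned by (dt string)
      location '/tmp/hive/logs';

    -- a partition directory created outside of Hive:
    --   hdfs dfs -mkdir -p /tmp/hive/logs/dt=2019-01-01
    --   hdfs dfs -put app.log /tmp/hive/logs/dt=2019-01-01/

    -- register the new partition(s) in the metastore
    MSCK REPAIR TABLE logs;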

Comparison:

1. After drop table mytable, a managed table is deleted for good; an external table is not.

2. Looking at the output of desc formatted <table name>; you will see that Table Type differs (MANAGED_TABLE vs. EXTERNAL_TABLE).

3. If an external table is dropped and recreated, its data comes back automatically (because the files on HDFS were never deleted), as sketched below.
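
A minimal sketch of point 3, reusing the external table from earlier (assuming its files are still under the location from the DDL):

    drop table mytable4;
    -- the files under /tmp/hive/mytabl survive the drop

    create external table mytable4(
      id int,
      name string,
      hobby array<string>,
      add map<string,string>
    )
    row format delimited
      fields terminated by ','
      collection items terminated by '-'
      map keys terminated by ':'
    location '/tmp/hive/mytabl';

    select * from mytable4;  -- the previously loaded rows reappear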

Creating a Table from a Query:

1. Create a table with AS (the query's result becomes the new table's data). First inspect the source table:

select * from mytable5;


create table mytable6 as select id,name from mytable5;


Creating a Table with LIKE:

This creates an empty table with an identical structure (see the comparison sketch below).

create table mytable7 like mytable6;
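
A quick way to see the difference between the two (a sketch; the counts assume mytable5 held rows when mytable6 was created):

    select count(*) from mytable6;  -- rows were copied by CREATE TABLE ... AS
    select count(*) from mytable7;  -- 0: LIKE copies only the table structure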


Problems Encountered:

Problem 1:

Loading a file into a table stored as SequenceFile fails with:

    FAILED: SemanticException Unable to load data to destination table. Error: The file that you are trying to load does not match the file format of the destination table

Cause:

A SequenceFile table cannot be populated with load data from a plain text file; load only accepts data that is already in SequenceFile format. You can check what format a table expects, as sketched below.
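
The expected format is visible in the desc formatted output (a sketch; mytable_seq is a hypothetical SequenceFile table name, and the class names are what Hive typically reports):

    desc formatted mytable_seq;
    -- InputFormat:  org.apache.hadoop.mapred.SequenceFileInputFormat
    -- OutputFormat: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat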

Solution:

Create a staging table stored as TEXTFILE that matches the file's layout, load the file into it, and then move the rows into the SequenceFile (or ORC) table with an insert, as sketched after the DDL:

    create table mytable3(
      id int,
      name string,
      hobby array<string>,
      add map<string,string>
    )
    row format delimited
      fields terminated by ','
      collection items terminated by '-'
      map keys terminated by ':'
    STORED AS TEXTFILE;
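
A minimal sketch of the second step (mytable_seq stands in for the SequenceFile destination table; the name is an assumption):

    -- load the plain text file into the textfile staging table
    load data inpath '/tmp/hive/mytest01.txt' overwrite into table mytable3;

    -- copy into the SequenceFile table; Hive rewrites the rows in the target format
    insert overwrite table mytable_seq select * from mytable3;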

Problem 2:

After one successful load data run, running it again fails.

Cause:

load data ... inpath moves the source file into the table's directory rather than copying it, so after the first run the file no longer exists at its original HDFS path. This can be verified, as sketched below.
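
A quick check (the path is the upload directory used earlier):

    hdfs dfs -ls /tmp/hive/
    # after the first load, mytest01.txt no longer appears in the listing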

Fix:

Re-upload the file with hdfs dfs -put /home/hive/....txt /tmp/hive/ and then run the load again.
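
Alternatively, load data local inpath copies from the local filesystem instead of moving an HDFS file, so it can be re-run without a fresh upload (assuming the file exists on the host where the Hive service runs):

    load data local inpath '/home/hive/mytest01.txt' overwrite into table mytable;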