Hive's built-in data types fall into two broad categories:
- primitive data types;
- complex data types.
1. Primitive data types include:
| Data type | Size and range | Supported since |
|---|---|---|
| TINYINT | 1 byte, -128 to 127 | |
| SMALLINT | 2 bytes, -32,768 to 32,767 | |
| INT | 4 bytes, -2,147,483,648 to 2,147,483,647 | |
| BIGINT | 8 bytes, -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | |
| BOOLEAN | | |
| FLOAT | 4 bytes, single precision | |
| DOUBLE | 8 bytes, double precision | |
| STRING | | |
| BINARY | | Hive 0.8.0 |
| TIMESTAMP | | Hive 0.8.0 |
| DECIMAL | | Hive 0.11.0 |
| CHAR | | Hive 0.13.0 |
| VARCHAR | | Hive 0.12.0 |
| DATE | | Hive 0.12.0 |
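The integer ranges listed for TINYINT, SMALLINT, INT, and BIGINT follow from their byte widths: a signed two's-complement integer of n bytes covers -2^(8n-1) to 2^(8n-1)-1. The Python sketch below reproduces those bounds as a sanity check. The names `HIVE_INT_BYTES` and `signed_range` are illustrative only and are not part of Hive.

```python
# Byte widths of Hive's integer types, as given in the table of primitive types.
HIVE_INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a signed two's-complement integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for type_name, num_bytes in HIVE_INT_BYTES.items():
    low, high = signed_range(num_bytes)
    print(f"{type_name}: {low:,} to {high:,}")
```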
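2. Complex data types:
- ARRAY: an ARRAY is a sequence of elements that all share the same data type, and each element is accessed by its index. For example, if fruits is an ARRAY holding ['apple','orange','mango'], then fruits[1] returns orange, because ARRAY indexes start at 0;
- MAP: a MAP holds key->value pairs, and an element is accessed by its key. For example, if userlist is a MAP whose keys are user names and whose values are passwords, then userlist['username'] returns the password stored for that user name;
- STRUCT: a STRUCT can contain elements of different data types. Each element is accessed with dot syntax; for example, if user is a STRUCT, then user.address returns that user's address;
- UNION: UNIONTYPE, supported since Hive 0.7.0.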
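For readers who think in a general-purpose language, the access patterns for ARRAY, MAP, and STRUCT can be loosely mirrored in Python. This is only an analogy: the `User` type, its fields, and the sample values are hypothetical, and Python containers do not enforce Hive's typing rules.

```python
from typing import NamedTuple

# ARRAY: ordered elements of one type, indexed from 0.
fruits = ["apple", "orange", "mango"]
second_fruit = fruits[1]

# MAP: look up a value by its key (here: user name -> password).
userlist = {"alice": "secret1", "bob": "secret2"}  # hypothetical sample data
alice_password = userlist["alice"]


# STRUCT: named fields that may have different types, read with dot syntax.
class User(NamedTuple):
    name: str
    age: int
    address: str


user = User(name="alice", age=30, address="Beijing")  # hypothetical sample data
user_address = user.address

print(second_fruit, alice_password, user_address)
```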
3. Using Hive ARRAY:
1) Create the table:
hive> create table test(name string,col1 array<string>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> COLLECTION ITEMS TERMINATED BY ','
> STORED AS TEXTFILE;
2) Load the data:
$ cat test
test 20,21
liuxiao 30,60,90
xiaoli 29,69,89
Load:
hive> load data local inpath '/home/qytt/test' overwrite into table test;
Copying data from file:/home/qytt/test
Copying file: file:/home/qytt/test
Loading data to table qytt.test
OK
Time taken: 0.305 seconds
3) Query:
hive> select * from test;
OK
test ["20","21"]
liuxiao ["30","60","90"]
xiaoli ["29","69","89"]
Time taken: 0.06 seconds, Fetched: 3 row(s)
hive> select name,col1[0] from test;
Total jobs = 1
...
OK
test 20
liuxiao 30
xiaoli 29
hive> select name,col1[2] from test;
Total jobs = 1
...
OK
test NULL
liuxiao 90
xiaoli 89
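The query results above follow from how the row format splits each line: the tab separates name from col1, and the comma separates the array elements. The Python sketch below imitates that splitting for illustration only. `parse_array_row` and `element_at` are hypothetical helpers, not Hive APIs, and the sketch assumes the sample file has a real tab character between the two fields. It also mirrors what the last query shows: an index past the end of the array yields NULL rather than an error.

```python
FIELD_DELIM = "\t"       # FIELDS TERMINATED BY '\t'
COLLECTION_DELIM = ","   # COLLECTION ITEMS TERMINATED BY ','

# Sample rows, assuming a tab between the name and the array column.
LINES = ["test\t20,21", "liuxiao\t30,60,90", "xiaoli\t29,69,89"]


def parse_array_row(line):
    """Split one text line into (name, list of string elements)."""
    name, raw_items = line.split(FIELD_DELIM, 1)
    return name, raw_items.split(COLLECTION_DELIM)


def element_at(items, index):
    """Return items[index], or None (standing in for NULL) when out of range."""
    return items[index] if 0 <= index < len(items) else None


rows = [parse_array_row(line) for line in LINES]
for name, col1 in rows:
    print(name, element_at(col1, 0), element_at(col1, 2))
```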
4. Using Hive MAP:
1) Create the table:
hive> create table test(name string,col1 map<string,string>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> COLLECTION ITEMS TERMINATED BY ','
> MAP KEYS TERMINATED BY ':'
> STORED AS TEXTFILE;
2) Load the data:
$ cat test
test age:20,sex:1,addredd:beijingshi
liuxiao age:30,job:abc
xiaoli age:28,sex:0,intrest:happy
hive> load data local inpath '/home/qytt/test' overwrite into table test;
Copying data from file:/home/qytt/test
Copying file: file:/home/qytt/test
Loading data to table qytt.test
OK
Time taken: 0.745 seconds
3) Query:
hive> select * from test;
OK
test {"age":"20","sex":"1","addredd":"beijingshi"}
liuxiao {"age":"30","job":"abc"}
xiaoli {"age":"28","sex":"0","intrest":"happy"}
Time taken: 0.045 seconds, Fetched: 3 row(s)
hive> select name,col1['age'],col1['sex'] from test;
Total jobs = 1
...
OK
test 20 1
liuxiao 30 NULL
xiaoli 28 0
Time taken: 26.358 seconds, Fetched: 3 row(s)
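The MAP table adds one more delimiter: entries are separated by a comma, and within each entry a colon separates the key from the value. The Python sketch below imitates that parsing for illustration only. `parse_map_row` is a hypothetical helper, not a Hive API, the tab between the two fields is assumed, and the misspelled keys in the sample data are kept exactly as they appear. Looking up a key that a row does not contain gives None here, standing in for the NULL that the query returns for liuxiao's sex.

```python
FIELD_DELIM = "\t"       # FIELDS TERMINATED BY '\t'
COLLECTION_DELIM = ","   # COLLECTION ITEMS TERMINATED BY ','
MAP_KEY_DELIM = ":"      # MAP KEYS TERMINATED BY ':'

# Sample rows, assuming a tab between the name and the map column.
LINES = [
    "test\tage:20,sex:1,addredd:beijingshi",
    "liuxiao\tage:30,job:abc",
    "xiaoli\tage:28,sex:0,intrest:happy",
]


def parse_map_row(line):
    """Split one text line into (name, dict of string keys to string values)."""
    name, raw_entries = line.split(FIELD_DELIM, 1)
    col1 = {}
    for entry in raw_entries.split(COLLECTION_DELIM):
        key, value = entry.split(MAP_KEY_DELIM, 1)
        col1[key] = value
    return name, col1


rows = [parse_map_row(line) for line in LINES]
for name, col1 in rows:
    # dict.get returns None for a missing key, standing in for NULL.
    print(name, col1.get("age"), col1.get("sex"))
```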
5. Using Hive STRUCT:
1) Create the table:
hive> create table test(name string,col1 struct<age:int,sex:int,address:string>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> COLLECTION ITEMS TERMINATED BY ','
> STORED AS TEXTFILE;
OK
Time taken: 0.043 seconds
2) Load the data:
$ cat test
test 20,1,beijingshi
liuxiao 30,0,abc
xiaoli 28,0,happy
hive> load data local inpath '/home/qytt/test' overwrite into table test;
Copying data from file:/home/qytt/test
Copying file: file:/home/qytt/test
Loading data to table qytt.test
OK
Time taken: 0.186 seconds
3) Query:
hive> select * from test;
OK
test {"age":20,"sex":1,"address":"beijingshi"}
liuxiao {"age":30,"sex":0,"address":"abc"}
xiaoli {"age":28,"sex":0,"address":"happy"}
Time taken: 0.028 seconds, Fetched: 3 row(s)
hive> select name,col1.age,col1.address from test;
Total jobs = 1
...
OK
test 20 beijingshi
liuxiao 30 abc
xiaoli 28 happy
Time taken: 22.054 seconds, Fetched: 3 row(s)
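For the STRUCT table, the comma-separated values are assigned to the declared fields by position, and each value takes the declared type (age and sex as int, address as string). That is why the output shows 20 rather than "20". The Python sketch below imitates this for illustration only. `Col1` and `parse_struct_row` are hypothetical names, not Hive APIs, and the tab between the two fields is assumed.

```python
from typing import NamedTuple

FIELD_DELIM = "\t"       # FIELDS TERMINATED BY '\t'
COLLECTION_DELIM = ","   # COLLECTION ITEMS TERMINATED BY ','


class Col1(NamedTuple):
    """Mirrors struct<age:int,sex:int,address:string>."""
    age: int
    sex: int
    address: str


# Sample rows, assuming a tab between the name and the struct column.
LINES = ["test\t20,1,beijingshi", "liuxiao\t30,0,abc", "xiaoli\t28,0,happy"]


def parse_struct_row(line):
    """Split one text line into (name, Col1), assigning values by position."""
    name, raw_fields = line.split(FIELD_DELIM, 1)
    age, sex, address = raw_fields.split(COLLECTION_DELIM)
    return name, Col1(age=int(age), sex=int(sex), address=address)


rows = [parse_struct_row(line) for line in LINES]
for name, col1 in rows:
    print(name, col1.age, col1.address)
```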