Hive的内置数据类型可以分为两大类:

  • 基础数据类型;
  • 复杂数据类型;

1、基础数据类型包括:


数据类型



所占字节



开始支持版本



TINYINT



1byte,-128 ~ 127



 



SMALLINT



2byte,-32,768 ~ 32,767



 



INT



4byte,-2,147,483,648 ~ 2,147,483,647



 



BIGINT



8byte,-9,223,372,036,854,775,808 ~ 9,223,372,036,854,775,807



 



BOOLEAN



 



 



FLOAT



4byte单精度



 



DOUBLE



8byte双精度



 



STRING



 



 



BINARY



 



从Hive0.8.0开始支持



TIMESTAMP



 



从Hive0.8.0开始支持



DECIMAL



 



从Hive0.11.0开始支持



CHAR



 



从Hive0.13.0开始支持



VARCHAR



 



从Hive0.12.0开始支持



DATE



 



从Hive0.12.0开始支持


 

2、复杂数据类型:

  1. ARRAY:ARRAY类型是由一系列相同数据类型的元素组成,这些元素可以通过下标来访问。比如有一个ARRAY类型的变量fruits,它是由['apple','orange','mango']组成,那么我们可以通过fruits[1]来访问元素orange,因为ARRAY类型的下标是从0开始的;
  2. MAP:MAP包含key->value键值对,可以通过key来访问元素。比如”userlist”是一个map类型,其中username是key,password是value;那么我们可以通过userlist['username']来得到这个用户对应的password;
  3. STRUCT:STRUCT可以包含不同数据类型的元素。这些元素可以通过”点语法”的方式来得到所需要的元素,比如user是一个STRUCT类型,那么可以通过user.address得到这个用户的地址。
  4. UNION: UNIONTYPE,他是从Hive 0.7.0开始支持的。

 

3、hive array使用

1)创建表:

hive> createtable test(name string,col1 array<string>)

    > ROW FORMAT DELIMITED

    > FIELDS TERMINATED BY '\t'

    > COLLECTIONITEMS TERMINATED BY ','

    > STORED AS TEXTFILE;

 

2)导入数据:

$ cat test

test        20,21

liuxiao        30,60,90

xiaoli        29,69,89

导入:

hive> load datalocal inpath '/home/qytt/test' overwrite into table test;

Copying data from file:/home/qytt/test

Copying file: file:/home/qytt/test

Loading data totable qytt.test

OK

Time taken: 0.305seconds

 

3)查询:

hive> select *from test;

OK

test        ["20","21"]

liuxiao        ["30","60","90"]

xiaoli        ["29","69","89"]

Time taken: 0.06seconds, Fetched: 3 row(s)

 

hive> selectname,col1[0] from test;

Total jobs = 1

...

OK

test        20

liuxiao        30

xiaoli        29

 

hive> selectname,col1[2] from test;

Total jobs = 1

...

OK

test        NULL

liuxiao        90

xiaoli        89

 

4、hive  map使用:

1)创建表:

hive> createtable test(name string,col1 map<string,string>)

    > ROW FORMAT DELIMITED

    > FIELDS TERMINATED BY '\t'

    > COLLECTIONITEMS TERMINATED BY ','

    > MAP KEYS TERMINATED BY ':'

    > STORED AS TEXTFILE;

2)导数据:

$ cat test

test        age:20,sex:1,addredd:beijingshi

liuxiao        age:30,job:abc

xiaoli        age:28,sex:0,intrest:happy

 

hive> load datalocal inpath '/home/qytt/test' overwrite into table test;

Copying data from file:/home/qytt/test

Copying file: file:/home/qytt/test

Loading data totable qytt.test

OK

Time taken: 0.745seconds

 

3)查询:

hive> select *from test;

OK

test        {"age":"20","sex":"1","addredd":"beijingshi"}

liuxiao        {"age":"30","job":"abc"}

xiaoli        {"age":"28","sex":"0","intrest":"happy"}

Time taken: 0.045seconds, Fetched: 3 row(s)

 

hive> selectname,col1['age'],col1['sex'] from test;

Total jobs = 1

...

OK

test        20        1

liuxiao        30        NULL

xiaoli        28        0

Time taken: 26.358seconds, Fetched: 3 row(s)

 

5、hive struct使用:

1)创建表:

hive> createtable test(name string,col1 struct<age:int,sex:int,address:string>)

    > ROW FORMAT DELIMITED

    > FIELDS TERMINATED BY '\t'

    > COLLECTION ITEMS TERMINATED BY ','

    > STORED AS TEXTFILE;

OK

Time taken: 0.043seconds

 

2)导入数据:

$ cat test

test        20,1,beijingshi

liuxiao        30,0,abc

xiaoli        28,0,happy

 

hive> load datalocal inpath '/home/qytt/test' overwrite into table test;      

Copying data from file:/home/qytt/test

Copying file: file:/home/qytt/test

Loading data totable qytt.test

OK

Time taken: 0.186seconds

 

3)查询:

hive> select *from test;

OK

test        {"age":20,"sex":1,"address":"beijingshi"}

liuxiao        {"age":30,"sex":0,"address":"abc"}

xiaoli        {"age":28,"sex":0,"address":"happy"}

Time taken: 0.028seconds, Fetched: 3 row(s)

 

hive> selectname,col1.age,col1.address from test;

Total jobs = 1

...

OK

test        20        beijingshi

liuxiao        30        abc

xiaoli        28        happy

Time taken: 22.054seconds, Fetched: 3 row(s)