一、 静态分区
- 单个分区
a. 创建分区表 : `create table par_tab(name string,nation string) partitioned by(gender string) row format delimited fields terminated by ',';`
b. 加载数据:`load data local inpath '/home/hduser/software/hive_1.2.1/hive_data/test1.txt' into table par_tab partition(gender='boy');`
c. 查询数据:`select * from par_tab;`
d. 按分区查 : `select * from par_tab where gender='boy';`
e. 联合查询操作:`select * from par_tab where gender='boy' union select * from par_tab where gender='girl';` *用了mapReduce,结果也是自动排序了*
- 多个分区
a. 创建多分区的表:`create table 表名(name string,nation string) partitioned by(gender string,year string)row format delimited fields terminated by ',';`
b. 查看分区表信息: `desc par_mul_tab;`
c. 加载数据:`load data local inpath '/home/hduser/software/hive_1.2.1/hive_data/test1.txt' into table par_mul_tab partition(gender='girl',year='2018');`
hdfs上存储数据的位置
d. 查询数据:
select * from par_mul_tab where gender='girl' and year='2018';
二、 动态分区
- 单分区
a. 创建普通表作为临时表:`create table par_dyn_tab as select * from par_tab;`(克隆一个静态分区表,不会把分区的信息克隆,只是单纯的把字段克隆)然后把数据删掉:`truncate table xxx`
克隆前后的表信息对照:
b. 给临时表加载数据:`load data local inpath '/home/hduser/software/hive_1.2.1/hive_data/test1.txt' into table par_dyn_tab;`
c. 为分区表动态加载数据:`insert overwrite table par_tab partition(gender) select name,nation,gender from par_dyn_tab;` //动态分区往静态分区表中加载数据,分区的字段是查询的最后的字段
此时的错误信息提示可以更改:set hive.exec.dynamic.partition.mode=nonstrict;
- 多分区
a. 创建临时表: `create table par_mul_dyn (name string,nation string,gender string,year string) row format delimited fields terminated by ',';`
b. 加载数据:`load data local inpath'/home/hduser/software/hive_1.2.1/hive_data/test3' into table par_mul_dyn;`
c. 像分区表中加载数据:insert overwrite table par_mul_tab partition(gender='woman',year) select name,nation,year from par_mul_dyn;
//分区的字段顺序:静态分区必须放在动态分区前面,都是动态分区无要求
三、 分区的修改
- 创建分区表:
create table dept_partition(dept_id int,dept_name string) partitioned by(year int)row format delimited fields terminated by '\t';
- 加载数据:
load data local inpath'/home/hduser/software/hive_1.2.1/hive_data/dept.txt' into table dept_partition partition(year='2020');
- 动态添加分区:
alter table dept_partition add partition(year='2019');
添加多个分区:
中间以空格分割,新添加的分区不能存在
- 删除分区:
alter table dept_partition drop partition(year='2018');
删除多个分区:alter table dept_partition drop partition(year='2018'),partition(year='2021');
中间以逗号分割,删除不存在的分区不会报错
- 查看分区表结构:
desc formatted dept_partition;
- 查看分区表有多少个分区:
show partitions dept_partition;