笼统的说,Hive中的Join可分为Common Join(Reduce阶段完成join)和Map Join(Map阶段完成join)。本文简单介绍一下两种join的原理和机制。
一 .Hive Common Join
如果不指定MapJoin或者不符合MapJoin的条件,那么Hive解析器会将Join操作转换成Common Join,即:在Reduce阶段完成join.
整个过程包含Map、Shuffle、Reduce阶段。
Map阶段
读取源表的数据,Map输出时候以Join on条件中的列为key,如果Join有多个关联键,则以这些关联键的组合作为key;
Map输出的value为join之后所关心的(select或者where中需要用到的)列;同时在value中还会包含表的Tag信息,用于标明此value对应哪个表;
按照key进行排序
Shuffle阶段
根据key的值进行hash,并将key/value按照hash值推送至不同的reduce中,这样确保两个表中相同的key位于同一个reduce中
Reduce阶段
根据key的值完成join操作,期间通过Tag来识别不同表中的数据。
以下面的HQL为例,图解其过程:
SELECT
a.id,a.dept,b.age
FROM a join b
ON (a.id = b.id);
二. Hive Map Join
MapJoin通常用于一个很小的表和一个大表进行join的场景,具体小表有多小,由参数 hive.mapjoin.smalltable.filesize来决定,该参数表示小表的总大小,默认值为25000000字节,即25M.
Hive0.7之前,需要使用hint提示* /+ mapjoin(table) */才会执行MapJoin,否则执行Common Join,但在0.7版本之后,默认自动会转换Map Join,由数 hive.auto.convert.join来控制,默认为true.
假设a表为一张大表,b为小表,并且hive.auto.convert.join=true,那么Hive在执行时候会自动转化为MapJoin。
执行流程如下:
1.通过MapReduce Local Task,将小表读入内存,生成HashTableFiles上传至Distributed Cache中,这里会对HashTableFiles进行压缩。
2.MapReduce Job在Map阶段,每个Mapper从Distributed Cache读取HashTableFiles到内存中,顺序扫描大表,在Map阶段直接进行Join,将数据传递给下一个MapReduce任务。
三.测试
准备数据
hive (test)> select * from stu_190802;
stu_190802.id stu_190802.name stu_190802.sex stu_190802.department stu_190802.age
16 xm m 2 25
16 xm m 2 25
16 xm m 2 25
1 zs m 1 18
2 ls m 1 19
3 ww m 1 20
4 zq f 1 18
5 ll f 1 21
6 hl f 1 19
7 xh f 1 20
8 cl f 1 22
9 fj m 1 19
10 wb m 2 23
11 wf f 2 24
12 jj m 2 21
13 yy m 2 20
14 ld f 2 18
15 ch f 2 22
1 zs m 1 17
1 zs m 1 19
hive (test)> select * from department;
department.id department.leader
1 zs
2 wl
默认是ture 会开启小表mapjoin
hive (test)> set hive.auto.convert.join;
hive.auto.convert.join=true
hive (test)> set hive.mapjoin.smalltable.filesize;
hive.mapjoin.smalltable.filesize=25000000
两个表进行关联默认会执行mapjoin
select
t1.*
,t2.leader
from stu_190802 t1
left join department t2
on t1.department =t2.id;
Query ID = finup_20191225144340_00ff1ec6-8d8a-44ca-b92f-08e7b2a1074e
Total jobs = 1
SLF4J: Found binding in [jar:file:/Users/finup/opt/hive3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
============将小表读入内存,生成HashTableFiles====================================
2019-12-25 14:43:54 Dump the side-table for tag: 1 with group count: 2 into file: file:/Users/finup/opt/hive3.1.1/iotmp/4f121d43-3a97-4a8f-aff7-5cc8e02eb80b/hive_2019-12-25_14-43-40_204_201189786632654257-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
2019-12-25 14:43:54 End of local task; Time Taken: 1.186 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
==================只有map没有reduce跟shuffle===============
Job running in-process (local Hadoop)
2019-12-25 14:43:58,652 Stage-3 map = 100%, reduce = 0%
Ended Job = job_local166338284_0073
MapReduce Jobs Launched:
Stage-Stage-3: HDFS Read: 25833140 HDFS Write: 7705 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
t1.id t1.name t1.sex t1.department t1.age t2.leader
16 xm m 2 25 wl
16 xm m 2 25 wl
16 xm m 2 25 wl
1 zs m 1 18 zs
2 ls m 1 19 zs
3 ww m 1 20 zs
4 zq f 1 18 zs
5 ll f 1 21 zs
6 hl f 1 19 zs
7 xh f 1 20 zs
8 cl f 1 22 zs
9 fj m 1 19 zs
10 wb m 2 23 wl
11 wf f 2 24 wl
12 jj m 2 21 wl
13 yy m 2 20 wl
14 ld f 2 18 wl
15 ch f 2 22 wl
1 zs m 1 17 zs
1 zs m 1 19 zs
Time taken: 18.452 seconds, Fetched: 20 row(s)
explian查看执行计划
explain
select t1.*,t2.leader
from stu_190802 t1
left join department t2
on t1.department =t2.id;
hive (test)> explain select t1.*,t2.leader from stu_190802 t1 left join department t2 on t1.department =t2.id;
OK
==================生成了3个stage======================
Explain
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-3 depends on stages: Stage-4
Stage-0 depends on stages: Stage-3
STAGE PLANS:
Stage: Stage-4
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_1:t2
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_1:t2
TableScan
alias: t2
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: int), leader (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE
HashTable Sink Operator
keys:
0 UDFToDouble(_col3) (type: double)
1 UDFToDouble(_col0) (type: double)
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 20 Data size: 229 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: bigint), name (type: string), sex (type: string), department (type: string), age (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 20 Data size: 229 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
Left Outer Join 0 to 1
keys:
0 UDFToDouble(_col3) (type: double)
1 UDFToDouble(_col0) (type: double)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col6
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col0 (type: bigint), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: bigint), _col6 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: vectorized
Local Work:
Map Reduce Local Work
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
Time taken: 0.329 seconds, Fetched: 65 row(s)
关闭自动mapjoin
set hive.auto.convert.join=false;
hive (test)> select t1.*,t2.leader from stu_190802 t1 left join department t2 on t1.department =t2.id;
Query ID = finup_20191225144559_4682d858-ef03-4189-b048-031b4659f7ac
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
===================有map 也有reduce操作======================
2019-12-25 14:46:01,259 Stage-1 map = 100%, reduce = 100%
Ended Job = job_local751812336_0074
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 129166985 HDFS Write: 38525 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
t1.id t1.name t1.sex t1.department t1.age t2.leader
1 zs m 1 19 zs
1 zs m 1 17 zs
9 fj m 1 19 zs
8 cl f 1 22 zs
7 xh f 1 20 zs
6 hl f 1 19 zs
5 ll f 1 21 zs
4 zq f 1 18 zs
3 ww m 1 20 zs
2 ls m 1 19 zs
1 zs m 1 18 zs
15 ch f 2 22 wl
14 ld f 2 18 wl
13 yy m 2 20 wl
12 jj m 2 21 wl
11 wf f 2 24 wl
10 wb m 2 23 wl
16 xm m 2 25 wl
16 xm m 2 25 wl
16 xm m 2 25 wl
Time taken: 1.718 seconds, Fetched: 20 row(s)
查看关闭mapjoin后的执行计划
hive (test)> explain select t1.*,t2.leader from stu_190802 t1 left join department t2 on t1.department =t2.id;
OK
==================生成了2个stage======================
Explain
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 20 Data size: 229 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: bigint), name (type: string), sex (type: string), department (type: string), age (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 20 Data size: 229 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: UDFToDouble(_col3) (type: double)
sort order: +
Map-reduce partition columns: UDFToDouble(_col3) (type: double)
Statistics: Num rows: 20 Data size: 229 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: bigint)
TableScan
alias: t2
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: id (type: int), leader (type: string)
outputColumnNames: _col0, _col1
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: UDFToDouble(_col0) (type: double)
sort order: +
Map-reduce partition columns: UDFToDouble(_col0) (type: double)
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string)
Reduce Operator Tree:
Join Operator
condition map:
Left Outer Join 0 to 1
keys:
0 UDFToDouble(_col3) (type: double)
1 UDFToDouble(_col0) (type: double)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col6
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: _col0 (type: bigint), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: bigint), _col6 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 22 Data size: 251 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
commen join 时map段会有sort order: + 排序的功能,但是在mapjoin 时map 段不会sort
总结
1.mapjoin 可以看做 boardcast join 就是将小表的数据加载到内存中并且没有shuffle过程,加快处理效率,但是这样如果数据量过大,加载到内存有可能会引起OOM
2.普通join 会产生shuffle,会影响效率(数据传输);也可能产生数据倾斜(一个key太多,那任务处理就会很慢)