Hive SQL: partition and filter-condition optimization

2/22/2017

Partition filtering

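If a query has no partition filter, it scans all the data in the table by default.
To list a table's partitions: show partitions databaseName.tableName
To confirm that a partition filter takes effect: explain dependency sql (where sql is the query you want to check)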
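The two commands above can be tried on a small table. The following is a minimal sketch, not a definitive setup: the database demo_db, the table orders and the partition column dt are hypothetical names chosen for illustration, and the partitions are added empty just so there is something to list.

```sql
-- Hypothetical database, table and partition column (dt); not from the original article.
CREATE DATABASE IF NOT EXISTS demo_db;
USE demo_db;

CREATE TABLE IF NOT EXISTS orders (
  id         BIGINT,
  order_type STRING
)
PARTITIONED BY (dt STRING);

-- Register three empty partitions so that there is something to prune.
ALTER TABLE orders ADD IF NOT EXISTS PARTITION (dt = '20151205');
ALTER TABLE orders ADD IF NOT EXISTS PARTITION (dt = '20151206');
ALTER TABLE orders ADD IF NOT EXISTS PARTITION (dt = '20151207');

-- List the partitions of the table.
SHOW PARTITIONS demo_db.orders;

-- Report which tables and partitions the query would read.
EXPLAIN DEPENDENCY
SELECT * FROM orders WHERE dt BETWEEN '20151205' AND '20151206';
```

If the filter is effective, the dependency report should name only the two partitions inside the range; running the same statement without the where clause should name all three.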

Where to place the partition filter

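Plain query: put the partition filter in the where clause, e.g.
select * from table1 t1 where t1.date between '20151205' and '20151206'
Note: var between 'a' and 'b' means var >= 'a' and var <= 'b'.

Inner join: put the partition filters of both tables in the where clause, e.g.
select * from table1 t1 join table2 t2 on (t1.id=t2.id) where t1.date between '20151205' and '20151206' and t2.date between '20151205' and '20151206'

Left join: put the left table's partition filter in the where clause and the right table's partition filter in the on clause, e.g.
select * from table1 t1 left join table2 t2 on (t1.id=t2.id and t2.date between '20151205' and '20151206') where t1.date between '20151205' and '20151206'
Note: for a right join it is the reverse. The right table's filter goes in the where clause and the left table's filter goes in the on clause.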

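The same rule applies to conditions other than partition filters. For example, if t2.order_type='3' is placed in the where clause, it filters the rows after the join; if it is placed in the on clause, it filters t2 before the join. With a left join this changes the result: a where condition on t2 also discards the t1 rows that found no match, because their t2 columns are NULL.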
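The difference between the two placements is easiest to see on a tiny data set. The sketch below uses hypothetical tables demo_t1 and demo_t2, left unpartitioned to keep it short, and assumes a Hive version that supports insert ... values (0.14 or later).

```sql
-- Hypothetical demo tables; names and data are illustrative only.
DROP TABLE IF EXISTS demo_t1;
DROP TABLE IF EXISTS demo_t2;
CREATE TABLE demo_t1 (id INT);
CREATE TABLE demo_t2 (id INT, order_type STRING);

INSERT INTO TABLE demo_t1 VALUES (1), (2);
INSERT INTO TABLE demo_t2 VALUES (1, '3'), (2, '5');

-- Condition in ON: t2 is restricted before matching, so every t1 row is kept.
-- Returns two rows: (1, '3') and (2, NULL).
SELECT t1.id, t2.order_type
FROM demo_t1 t1
LEFT JOIN demo_t2 t2
  ON (t1.id = t2.id AND t2.order_type = '3');

-- Condition in WHERE: applied to the joined rows, so the row for id = 2
-- (whose order_type is '5') is dropped. Returns one row: (1, '3').
SELECT t1.id, t2.order_type
FROM demo_t1 t1
LEFT JOIN demo_t2 t2
  ON (t1.id = t2.id)
WHERE t2.order_type = '3';
```

The second query behaves like an inner join, which is why the filter on the right table of a left join belongs in the on clause when the unmatched left rows must be preserved.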