Hive多维分析函数——With cube、Grouping sets、With rollup
创始人
2024-12-01 07:04:50

有些指标涉及【多维度】的聚合,大的汇总维度,小的明细维度,需要精细化的下钻。

  • grouping sets: 多维度组合,组合维度自定义;
  • with cube: 多维度组合,程序自由组合,组合为各个维度的笛卡尔积;
  • with rollup:是 with cube的子集,以左侧维度为主,即不允许左侧为NULL,右侧为非NULL的情况出现

一、grouping sets

 0 数据准备

with test1 as      (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv       union all       select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv       union all       select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv       union all       select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv       union all       select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv       union all       select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)       select          month_date,         day_date,         sum(pv) as pv      from test1       group by month_date,day_date       grouping sets        (           (),           (month_date),           (month_date,day_date),           (day_date)         )       order by month_date,day_date; 

 1 结果分析

二、with cube

 0 数据准备

with test1 as (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv  union all  select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv  union all  select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv  union all  select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)    select     month_date,    day_date,    sum(pv) as pv  from test1  group by month_date,day_date  with cube  order by month_date,day_date;

 1 结果分析

2 总结

  • 从with cube和grouping sets的案例可以看出,两个结果是一样的;
  • with cube的维度组合(各个维度的笛卡尔积)就是groupingsets里面手动添加的维度,即为month_date,day_date两个维度的笛卡尔积。维度组合即为:()、 (month_date)、 (month_date,day_date)、 (day_date)

三、 with rollup

 0 数据准备

with test1 as (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv  union all  select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv  union all  select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv  union all  select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)  select    month_date,   day_date,   sum(pv) as pv from test1 group by month_date,day_date with rollup order by month_date,day_date;

 1 结果分析

 2 总结

   从结果上可以看出,with rollup 和with cube的区别是,少了day_date这个单独维度的聚合,因为with rollup是以左侧维度为主,当左侧month_date维度为NULL时,右侧day_date维度必须为NULL。

参考文章 :

https://zhuanlan.zhihu.com/p/631268351

https://blog.51cto.com/u_14555/6696007

相关内容

热门资讯

四部门印发行动方案,促进人工智... 近日,国家能源局会同国家发展改革委、工业和信息化部、国家数据局印发《关于促进人工智能与能源双向赋能的...
瞄准8时13分!一起看发射场准... 我国将于5月11日8时13分发射天舟十号货运飞船。目前,长征七号遥十一运载火箭已完成推进剂加注。 距...
天舟十号货运飞船点火发射 今天上午,搭载天舟十号货运飞船的长征七号遥十一运载火箭,在我国文昌航天发射场点火发射。
原创 微... 前段时间 iOS 微信发布了 8.0.73 正式版,安卓微信发布了 8.0.72 测试版,这次的更新...
任正非罕见出镜:华为芯片基础技... 感谢IT之家网友 的线索投递! 5 月 10 日消息,在 5 月 8 日播出的《新闻联播》节目中,...