hivehiveqlcuberollupgrouping-sets

How to reshape data after GROUPING SETS in Hive?


I would like to aggregate a column over many different dimensions. I think GOUPING SETS would be appropriate to my problem, but I cannot figure out how to transform/reshape the resulting table from GROUPING SETS.

This is my query using GROUPING SETS:

select date, dim1, dim2, dim3, sum(value) as sum_value
from table
grouping by date, dim1, dim2, dim3
grouping sets ((date, dim1), (date, dim2), (date, dim3))

The query would result in a table like this:

date        dim1    dim2    dim3    sum_value
2017-01-01  A       NULL    NULL    [value_A]
2017-01-01  B       NULL    NULL    [value_B]
2017-01-01  NULL    C       NULL    [value_C]
2017-01-01  NULL    D       NULL    [value_D]
2017-01-01  NULL    NULL    E       [value_E]
2017-01-01  NULL    NULL    F       [value_F]

But what I really need is a table like this:

date        dim     factor  sum_value
2017-01-01  dim1     A      [value_A]
2017-01-01  dim1     B      [value_B]
2017-01-01  dim2     C      [value_C]
2017-01-01  dim2     D      [value_D]
2017-01-01  dim3     E      [value_E]
2017-01-01  dim3     F      [value_F]

The actual number of dimensions is far more than 3, so it wouldn't be a good idea to hard-code the query. Is there a way to reshape the table from grouping sets or other aggregation methods to get the desired table?

Thanks!


Solution

  • select    `date`
             ,elt(log2(GROUPING__ID - 1),'dim1','dim2','dim3')      as dim
             ,coalesce (dim1,dim2,dim3)                             as factor
             ,sum(value)                                            as sum_value
    
    from      `table`
    
    group by  `date`,dim1,dim2,dim3
              grouping sets ((`date`,dim1),(`date`,dim2),(`date`,dim3))