sqlhive

coalesce first non-null, varying null positions in hierarchical table sql


I have a table with varying depths of nulll values :

dim col1 col2 col3 col4
v c b a NULL
x a NULL NULL NULL
y b a NULL NULL
z d c b a

Does anyone know how I can write a select statement to reverse & collapse the cols by NULL values? Take the first rightmost value as the new leftmost value, achieving this transformation :

dim col1 col2 col3 col4
v a b c NULL
x a NULL NULL NULL
y a b NULL NULL
z a b c d

Solution

  • You can concatenate all columns with some delimiter, substituting NULLs with some constant like 'null', extract null part, reverse not null part, concatenate again reversed not null with nulls, split to get an array, and now you can get arrays elements as columns, do not forget to check for nulls and reverse each element to get values longer than 1 character correctly:

    with 
    test_data as (
    select 'v' as dim, 'c' as col1, 'b' as col2,    'a' as col3,    NULL as col4 union all
    select 'x', 'a', NULL,  NULL ,  NULL  union all
    select 'y', 'b', 'a',   NULL ,  NULL  union all
    select 'z', 'd', 'c',   'b',    'a'
    )
    
    
    select dim,
           case when reversed_array[0] = 'null' then null else reverse(reversed_array[0]) end as col1,
           case when reversed_array[1] = 'null' then null else reverse(reversed_array[1]) end as col2,
           case when reversed_array[2] = 'null' then null else reverse(reversed_array[2]) end as col3,
           case when reversed_array[3] = 'null' then null else reverse(reversed_array[3]) end as col4     
    from
    (
    select dim,
    split(
    concat(
    --all before |null
    reverse(regexp_replace(concat_ws('|',nvl(col1,'null'), nvl(col2,'null'), nvl(col3,'null'), nvl(col4,'null')),'(\\|null)*','')),
    --NULLs
    regexp_extract(concat_ws('|',nvl(col1,'null'), nvl(col2,'null'), nvl(col3,'null'), nvl(col4,'null')),'((\\|null)*)$',1)
    )
    ,'\\|') as reversed_array
    from test_data
    ) s
    

    Result:

    dim     col1    col2    col3    col4    
    v       a       b       c       NULL
    x       a       NULL    NULL    NULL
    y       a       b       NULL    NULL
    z       a       b       c       d