Let's assume that I have a table containing four columns
ID_list array(int)
name_list array(string)
profile_scores map(int, array(map(int, float)))
profile_scores2 map(int, array(array(int)))
For instance
ID_list name_list profile_scores profile_scores2
[1, 2] ['A', 'B'] {11: [{0:10.5}, {1:10.2}]} {7: [[4, 5], [7, 8, 9]]}
the key of profile_scores, profile_scores2 columns are unique. There's only one key for those columns. However, the array contains as many elements as ID_list.
Basically in the above case, the below four are equal
Length(ID_list)
Length(name_list)
Length(profile_scores[11])
Length(profile_scores2[7])
I want to split it, so that the each element of ID_list is a separate row. For instance, in the above case
ID_list name_list profile_scores profile_scores2
1 'A' {11: {0:10.5}} {7: [4, 5]}
2 'B' {11: {1:10.2}} {7: [7, 8, 9]}
so the new data types would be
ID_list int
name_list string
profile_scores map(int, map(int, float))
profile_scores2 map(int, array(int))
I feel like CROSS JOIN / UNNEST could work, but I'm not sure how to preserve the key for profile_scores, profile_scores2 columns and just split the array in the value.
Is this operation doable in SQL? If so, how should I apply CROSS JOIN / UNNEST (or some other operations) to complete this?
Presto/Trino allows passing multiple columns to the unnest
:
UNNEST
can also be used with multiple arguments, in which case they are expanded into multiple columns, with as many rows as the highest cardinality argument (the other columns are padded with nulls)
select id, name, profile_scores, profile_scores2
from table
, unnest(id_list, name_list) as t(id, name);
which basically result in the unnested columns flattened based on the index (what seems exactly what you need).
UPD
Presto allows also flattening maps into key-value pairs (tested with Trino):
select id, name, k, v, k1, v1
from table
, unnest(id_list, name_list, profile_scores, profile_scores2) as t(id, name, k, v, k1, v1);