I have a HIVE
table as follows:
select id, id_2, val from test order by id;
234 974 0.5
234 457 0.7
234 236 0.5
234 859 0.6
123 859 0.7
123 236 0.6
123 974 0.5
123 457 0.5
I'm trying to collect
the data based on the id
value. I need the collected data to be following the same order for each row. My expected output is as follows: (any order is fine as long it's the same for all the rows):
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [974,457,236,859] [0.5,0.5,0.6,0.7]
I used the collect
UDF from Brickhouse.
select tmp.id, collect(id_2), collect(tmp.val) from
(select id, id_2, val from test
order by id) tmp
group by tmp.id
;
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [859,236,974,457] [0.7,0.6,0.5,0.5]
As you can see, the order of the columns is not being kept. Is there any way to keep the ordering constant throughout the output? Any tips will be appreciated.
Use this query
select tmp.id, collect(id_2), collect(tmp.val) from
(select id, id_2, val from test
order by id desc, id_2 desc) tmp
group by tmp.id
;
Output as below,
234 [974,457,236,859] [0.5,0.7,0.5,0.6]
123 [974,457,236,859] [0.5,0.5,0.6,0.7]
Basically modified
order by id
to
order by id desc, id_2 desc