sqlarraysdistinctprestotrino

Presto: Aggregate the arrays for all columns


I have the following table:

name   pets
Alex   [dog, cat, bird]
Tom    [rabbit, cat]
Mary   [snake, dog, fish, cow]
Dave   [dog]

I am wondering would it be possible to create a query to find all pets. That is, I am looking for the output like:

[dog, cat, bird, rabbit, snake, fish, cow]

Current approaches is export the table to csv and use python to solve it. I am wondering if this can be done directly in Presto query? Thanks!


Solution

  • Presto has powerful array functions. This should work:

    select array_distinct(flatten(array_agg(pets))) all_pets
    from mytable
    

    Basically this aggregates all arrays to together, as an array of arrays, then concatenates all elements together in a single array, and finally removes duplicates.