sqlpostgresqlaggregate-functionshstore

How to find pairwise all combination in a Postgres hstore column?


I have a table in Postgres with an hstore column like:

"surface => concrete, ramp => yes, incline => up, highway => footway"
"surface => asphalt, lit => no, source => survey, incline => 12.6%, highway => footway"
"bicycle => yes, surface => asphalt, highway => footway, segregated => no"

Now I would like to analyse the keys among themselves. Therefore I want to know for the whole dataset how often the keys occur in combination pairwise.

The result should look something like this:

|      Key 1          |     Key 2        |       Count      |
|---------------------|------------------|------------------|
|      surface        |    source        |         1        |
|      surface        |    highway       |         3        |
|      surface        |    incline       |         2        |
|      highway        |    bicycle       |         1        |
.....

Solution

  • Use the function akeys() to get all keys of the hstore column as an array, generate all distinct pairs of the arrays elements with generate_subscripts() and group the result by the pairs:

    select akeys[i] as key1, akeys[j] as key2, count(*)
    from my_table
    cross join akeys(hstore_col)
    cross join generate_subscripts(akeys, 1) as i
    cross join generate_subscripts(akeys, 1) as j
    where akeys[i] > akeys[j]
    group by 1, 2
    order by 1 desc, 2 desc
    

    Test it in db<>fidlle.