So imagine I have a table with a jsonb column data
(for the sake of example, not because it's a good column name) that contains an array of json objects with a key states
that is an array of USA state abbreviations:
[ { states: ['AZ', 'CA'], ... }, { states: ['NY','CO'], ... }, ... ]
I'd like to aggregate all the states in the JSON objects into a list of distinct states and make it into a generated stored column states
. (e.g. ['AZ', 'CA', 'NY', 'CO', ...]
)
One not great way to select
such a list is:
select jsonb_path_query_array(
( select jsonb_agg(b) from
(select distinct jsonb_array_elements(a) as state from jsonb_array_elements(
jsonb_path_query_array(data, '$[*].states')
) as a) as b
), '$[*].state') as states
from myTable
This (a) seems overly complicated for something as simple as mapping an array by the states
key and calling whatever the PG equivalent of flatten
is, and (b) has select
in it, and so can't be used for generated columns.
The other ways I can imagine doing it all involve lateral joins.
Given that all the data is in a column, it seems like it should be possible to do some kind of transformation on it without selects or joins for creating a generated stored column. But I can't seem to figure out how to do this without using select
and/or join lateral
.
Is there a way to do this for a generated stored column in PostgreSQL?
(Currently on PG 16.x)
This seems overly complicated for something as simple as mapping an array by the states key and calling whatever the PG equivalent of flatten is
It is. You don't need jsonb_path_query_array
, just use jsonb_path_query
which returns setof jsonb
and a path that leads to each string itself:
SELECT jsonb_agg(DISTINCT state) FROM jsonb_path_query(data, '$[*].states[*]')
If you prefer jsonb_array_elements
over json paths, it would be a lateral function call:
SELECT jsonb_agg(DISTINCT state)
FROM jsonb_array_elements(data) obj,
jsonb_array_elements(obj->'states') state
This has
select
in it, and so can't be used for generated columns.
I think that is unavoidable (at least if you need the distinct
- otherwise jsonb_path_query_array(data, '$[*].states[*]')
would do). However you can always write a custom function, do the json manipulation inside, mark it as immutable
, and use it for your generated column.