I am (still) new to postgresql and jsonb. I am trying to select some records from a subquery and am stuck. My data column looks like this (jsonb):
{"people": [{"age": "50", "name": "Bob"}], "another_key": "no"}
{"people": [{"age": "73", "name": "Bob"}], "another_key": "yes"}
And here is my query. I want to select all names that are "Bob" whose age is greater than 30:
SELECT * FROM mytable
WHERE (SELECT (a->>'age')::float
FROM (SELECT jsonb_array_elements(data->'people') as a
FROM mytable) as b
WHERE a @> json_object(ARRAY['name', 'Bob'])::jsonb
) > 30;
I get the error:
more than one row returned by a subquery used as an expression
I don't quite understand. If I do some simple substitution (just for testing) I can do this:
SELECT * FROM mytable
WHERE (50) > 30 -- 50 is the age of the youngest Bob
and that returns both rows.
The error means just what it says:
more than one row returned by a subquery used as an expression
The expression in the WHERE
clause expects a single value (just like you substituted in your added test), but your subquery returns multiple rows. jsonb_array_elements()
is a set-returning function.
Assuming this table definition:
CREATE TABLE mytable (
id serial PRIMARY KEY
, data jsonb
);
The JSON array for "people"
wouldn't make sense if there couldn't be multiple persons inside. Your examples with only a single person are misleading. Some more revealing test data:
INSERT INTO mytable (data)
VALUES
('{"people": [{"age": "55", "name": "Bill"}], "another_key": "yes"}')
, ('{"people": [{"age": "73", "name": "Bob"}], "another_key": "yes"}')
, ('{"people": [{"age": "73", "name": "Bob"}
,{"age": "77", "name": "Udo"}], "another_key": "yes"}');
The third row has two people.
I suggest a query with a LATERAL
join:
SELECT t.id, p.person
FROM mytable t
, jsonb_array_elements(t.data->'people') p(person) -- implicit LATERAL
WHERE (t.data->'people') @> '[{"name": "Bob"}]'
AND p.person->>'name' = 'Bob'
AND (p.person->>'age')::int > 30;
The first WHERE
condition WHERE (t.data->'people') @> '[{"name": "Bob"}]'
is logically redundant, but it helps performance by eliminating irrelevant rows early: don't even unnest JSON arrays without a "Bob"
in it.
For big tables, this is much more efficient with a matching index. If you run this kind of query regularly, you should have one:
CREATE INDEX mytable_people_gin_idx ON mytable
USING gin ((data->'people') jsonb_path_ops);
Related, with more explanation:
In Postgres 12 or later consider using SQL/JSON path expressions instead. See: