sqlpostgresqlpostgresql-9.4jsonbset-returning-functions

JSONB: more than one row returned by a subquery used as an expression


I am (still) new to postgresql and jsonb. I am trying to select some records from a subquery and am stuck. My data column looks like this (jsonb):

{"people": [{"age": "50", "name": "Bob"}], "another_key": "no"}
{"people": [{"age": "73", "name": "Bob"}], "another_key": "yes"}

And here is my query. I want to select all names that are "Bob" whose age is greater than 30:

SELECT * FROM mytable
WHERE (SELECT (a->>'age')::float
       FROM (SELECT jsonb_array_elements(data->'people') as a
             FROM mytable) as b 
       WHERE a @> json_object(ARRAY['name', 'Bob'])::jsonb
      ) > 30;

I get the error:

more than one row returned by a subquery used as an expression

I don't quite understand. If I do some simple substitution (just for testing) I can do this:

SELECT * FROM mytable
WHERE  (50) > 30  -- 50 is the age of the youngest Bob

and that returns both rows.


Solution

  • The error means just what it says:

    more than one row returned by a subquery used as an expression

    The expression in the WHERE clause expects a single value (just like you substituted in your added test), but your subquery returns multiple rows. jsonb_array_elements() is a set-returning function.

    Assuming this table definition:

    CREATE TABLE mytable (
      id   serial PRIMARY KEY
    , data jsonb
    );
    

    The JSON array for "people" wouldn't make sense if there couldn't be multiple persons inside. Your examples with only a single person are misleading. Some more revealing test data:

    INSERT INTO mytable (data)
    VALUES
      ('{"people": [{"age": "55", "name": "Bill"}], "another_key": "yes"}')
    , ('{"people": [{"age": "73", "name": "Bob"}], "another_key": "yes"}')
    , ('{"people": [{"age": "73", "name": "Bob"}
                   ,{"age": "77", "name": "Udo"}], "another_key": "yes"}');
    

    The third row has two people.

    I suggest a query with a LATERAL join:

    SELECT t.id, p.person
    FROM   mytable t 
         , jsonb_array_elements(t.data->'people') p(person)  -- implicit LATERAL
    WHERE (t.data->'people') @> '[{"name": "Bob"}]'
    AND    p.person->>'name' = 'Bob'
    AND   (p.person->>'age')::int > 30;
    

    fiddle

    The first WHERE condition WHERE (t.data->'people') @> '[{"name": "Bob"}]' is logically redundant, but it helps performance by eliminating irrelevant rows early: don't even unnest JSON arrays without a "Bob" in it.

    For big tables, this is much more efficient with a matching index. If you run this kind of query regularly, you should have one:

    CREATE INDEX mytable_people_gin_idx ON mytable
    USING gin ((data->'people') jsonb_path_ops);
    

    Related, with more explanation:

    In Postgres 12 or later consider using SQL/JSON path expressions instead. See: