I have 2 tables with a 1-to-many relationship. Table A contains 'OBJECTS' and table B contains 'COMPONENTS' have id's that are related and I wrote a query to gather all OBJECTS from table A and associated COMPONENTS from table B that looks like this:
SELECT A.*, ARRAY_AGG(B.*)
FROM A INNER JOIN B USING(id)
GROUP BY A.*
This gives me what I expect.
Next I have a field on B called 'type' that can be 'X', 'Y', or 'Z'. I want to do the same query above, but filter the results such that only OBJECTS that have associated COMPONENTS where B.type = 'X'. Furthermore, I still want all associated COMPONENTS that aren't B.type = 'X' for each returned OBJECT.
My first attempt was to write the following query:
SELECT A.*, ARRAY_AGG(B.*)
FROM A INNER JOIN B USING(id)
WHERE B.type = 'X'
GROUP BY A.*
But this query strips COMPONENTS that aren't type = 'X'.
Is there a way to get all OBJECTS that have associated COMPONENTS with type = 'X' but also the result set returns all associated COMPONENTS for each object regardless of type?
Here's an example dataset with the expected result set:
A:
id |
---|
1 |
2 |
3 |
4 |
B:
id | type |
---|---|
1 | X |
1 | Y |
1 | Z |
2 | Y |
2 | Z |
3 | X |
4 | Z |
Result:
(1, [(1,X),(1,Y),(1,Z)]),
(3, [(3,X)])
These queries are part of an SQLX Rust project, if that makes a difference.
While referential integrity is enforced, and you query for all IDs, you don't even have to include table A in the query:
SELECT id, array_agg(B.*)
FROM B
GROUP BY id
HAVING bool_or(true) FILTER (WHERE type = 'X');
About the aggregate FILTER
clause:
Should be as fast as it gets - unless type = 'X'
is rare.
In which case - assuming B(type, id)
is unique:
SELECT id, array_agg(B.*)
FROM (SELECT id FROM B WHERE type = 'X') b1
JOIN B USING (id)
GROUP BY 1;
With one index on (type, id)
- could be the UNIQUE
constraint with leading type
.
(Better yet, a partial index on (id) WHERE type = 'X'
, but that may be too specialized.)
And another one on just (id)
, which should be a given.
If B(type, id)
is not unique, add DISTINCT
to the subquery or switch to IN
or EXISTS
, like:
SELECT id, array_agg(B.*)
FROM B
WHERE EXISTS (SELECT FROM B b1 WHERE b1.id = b.id AND b1.type = 'X')
GROUP BY 1;