sqlruby-on-railspostgresql

Rows matching conditions in a directly related table or another, related indirectly


Suppose three tables:

create table table_a(  id int generated by default as identity primary key);
create table table_c(  id int generated by default as identity primary key
                     , active boolean);
create table table_b(  id int generated by default as identity primary key
                     , active boolean
                     , table_a_id int references table_a(id)
                     , table_c_id int references table_c(id));

Rows in table_a can be related to those in table_c through table_b.
The table_c and table_b have each a boolean attribute called active.

How do I write a SQL statement or Active record expression where I get every table_a record

Example input:
demo at db<>fiddle

with populated_a as 
  (insert into table_a values
     (1) --YES, doesn't have a `b` or `c` at all
    ,(2) --NO, has an active `b`
    ,(3) --YES, has a `b` but it's inactive
    ,(4) --NO, has both an active `b` and active `c`
    ,(5) --YES, has an active `b` but it links to an inactive `c`
    ,(6) --YES, has an inactive `b` and it links to an inactive `c`
   returning *)
,populated_c as 
  (insert into table_c values(2,false)
                            ,(3,true)
                            ,(4,true)
                            ,(5,false)
                            ,(6,false)
                            ,(7,true)--no associated `a` or `b`
   returning *)
insert into table_b(active,table_c_id,table_a_id)
values (true,2,2)
      ,(false,3,3)
      ,(true,4,4)--also links to active `c`
      ,(true,5,5)--links to inactive `c`
      ,(false,6,6);

My attempt:

SELECT "table_a".* FROM "table_a"
LEFT OUTER JOIN "table b"
ON "table_b"."table_a_id" = "table_a"."id"
AND "table_b"."active" = true
LEFT OUTER JOIN "table_c"
ON "table_b"."table_c_id" = "table_c"."id" 
AND "table_c"."active" = true
WHERE "table_b"."id" IS NULL OR "table_c"."id" IS NULL

The expected output is rows (1,3,5,6). My code returns them all, but it also incorrectly returns 2, which has an active b.

Edit: I think I didn't express myself well when writing the question, since I said I wanted any with active true but in reality what I wanted is either without that table b or any table b related with a table c which had NOT a an active as true (which means, could be false or not exist)!

Thanks for Zegarek which made me realize by mistake, I added a new fiddle which corrects the mistake:

https://dbfiddle.uk/zUnOnkHg

The answers are 1 (no related table b),2 and 5 (both have table b with false table c)


Solution

  • Basically I did a two LEFT OUTER JOIN deep inside each other, which ended up working as I intended:

    SELECT "table_a".* FROM "table_a"
    LEFT OUTER JOIN "table b"
    ON "table_b"."table_a_id" = "table_a"."id"
    AND "table_b"."active" = true
    LEFT OUTER JOIN "table_c"
    ON "table_b"."table_c_id" = "table_c"."id" 
    AND "table_c"."active" = true
    WHERE "table_b"."id" IS NULL OR "table_c"."id" IS NULL
    

    Edit: Indeed, on thta case still had a flaw where I would retrieve Table A recors that were associated with both active and false table C records, where I just wanted if it had or a false one or none at all. This code here indeed solves the issue:

    select tr.id
    from (SELECT ta.id, max(tc.active::int) as result
    from table_a ta
    left join table_b tb on tb.table_a_id = ta.id
    left join table_c tc on tb.table_c_id = tc.id
    group by ta.id) tr where result is NULL OR result = 0
    

    https://dbfiddle.uk/y9OpmLDl