sqlpostgresqldatabase-designforeign-keyspostgresql-9.0

Restrict foreign key relationship to rows of related subtypes


Overview: I am trying to represent several types of entities in a database, which have a number of basic fields in common, and then each has some additional fields that are not shared with the other types of entities. Workflow would frequently involve listing the entities together, so I have decided to have a table with their common fields, and then each entity will have its own table with its additional fields.

To implement: There is a common field, “status”, which all entities have; however, some entities will only support a subset of all possible statuses. I also want each type of entity to enforce the use of its subset of statuses. Finally, I will also want to include this field when listing the entities together, so excluding it from the set of common fields seems incorrect, as this would require a union of the specific type tables and the lack of “implements interface” in SQL means that inclusion of that field would be by-convention.

Why I’m here: Below is an solution that is functional, but I am interested if there is a better or more common way to solve the problem. In particular, the fact that this solution requires me to make a redundant unique constraint and a redundant status field feels inelegant.

create schema test;

create table test.statuses(
    id      integer     primary key
);
create table test.entities(
    id      integer     primary key,
    status  integer,
    unique(id, status),
    foreign key (status) references test.statuses(id)
);

create table test.statuses_subset1(
    id      integer     primary key,
    foreign key (id) references test.statuses(id)
);
create table test.entites_subtype(
    id integer primary key,
    status integer,
    foreign key (id) references test.entities(id),
    foreign key (status) references test.statuses_subset1(id),
    foreign key (id, status) references test.entities(id, status) initially deferred
);

Some data:

insert into test.statuses(id) values
    (1),
    (2),
    (3);
insert into test.entities(id, status) values
    (11, 1),
    (13, 3);
insert into test.statuses_subset1(id) values
    (1), (2);
insert into test.entites_subtype(id, status) values
    (11, 1);

-- Test updating subtype first
update test.entites_subtype
    set status = 2
    where id = 11;
update test.entities
    set status = 2
    where id = 11;

-- Test updating base type first
update test.entities
    set status = 1
    where id = 11;
update test.entites_subtype
    set status = 1
    where id = 11;

/* -- This will fail
insert into test.entites_subtype(id, status) values
    (12, 3);
*/

Solution

  • Simplify building on MATCH SIMPLE behavior of fk constraints

    If at least one column of multicolumn foreign constraint with default MATCH SIMPLE behaviour is NULL, the constraint is not enforced. You can build on that to largely simplify your design.

    CREATE SCHEMA test;
    
    CREATE TABLE test.status(
       status_id  integer PRIMARY KEY
      ,sub        bool NOT NULL DEFAULT FALSE  -- TRUE .. *can* be sub-status
      ,UNIQUE (sub, status_id)
    );
    
    CREATE TABLE test.entity(
       entity_id  integer PRIMARY KEY
      ,status_id  integer REFERENCES test.status  -- can reference all statuses
      ,sub        bool      -- see examples below
      ,additional_col1 text -- should be NULL for main entities
      ,additional_col2 text -- should be NULL for main entities
      ,FOREIGN KEY (sub, status_id) REFERENCES test.status(sub, status_id)
         MATCH SIMPLE ON UPDATE CASCADE  -- optionally enforce sub-status
    );
    

    It is very cheap to store some additional NULL columns (for main entities):

    BTW, per documentation:

    If the refcolumn list is omitted, the primary key of the reftable is used.

    Demo-data:

    INSERT INTO test.status VALUES
      (1, TRUE)
    , (2, TRUE)
    , (3, FALSE);     -- not valid for sub-entities
    
    INSERT INTO test.entity(entity_id, status_id, sub) VALUES
      (11, 1, TRUE)   -- sub-entity (can be main, UPDATES to status.sub cascaded)
    , (13, 3, FALSE)  -- entity  (cannot be sub,  UPDATES to status.sub cascaded)
    , (14, 2, NULL)   -- entity  (can    be sub,  UPDATES to status.sub NOT cascaded)
    , (15, 3, NULL)   -- entity  (cannot be sub,  UPDATES to status.sub NOT cascaded)
    

    SQL Fiddle (including your tests).

    Alternative with single FK

    Another option would be to enter all combinations of (status_id, sub) into the status table (there can only be 2 per status_id) and only have a single fk constraint:

    CREATE TABLE test.status(
       status_id  integer
      ,sub        bool DEFAULT FALSE
      ,PRIMARY KEY (status_id, sub)
    );
    
    CREATE TABLE test.entity(
       entity_id  integer PRIMARY KEY
      ,status_id  integer NOT NULL  -- cannot be NULL in this case
      ,sub        bool NOT NULL     -- cannot be NULL in this case
      ,additional_col1 text
      ,additional_col2 text
      ,FOREIGN KEY (status_id, sub) REFERENCES test.status
         MATCH SIMPLE ON UPDATE CASCADE  -- optionally enforce sub-status
    );
    
    INSERT INTO test.status VALUES
      (1, TRUE)       -- can be sub ...
      (1, FALSE)      -- ... and main
    , (2, TRUE)
    , (2, FALSE)
    , (3, FALSE);     -- only main
    

    Etc.

    Related answers:

    Keep all tables

    If you need all four tables for some reason not in the question consider this detailed solution to a very similar question on dba.SE:

    Inheritance

    ... might be another option for what you describe. If you can live with some major limitations. Related answer: