databasepostgresqlsentinel

How could I implement a generic 'not set' sentinel value for a column in a PostgreSQL table?


I'm using PostgreSQL 14 to maintain a table which acts as an append-only ledger of changes done to a students table:

CREATE TABLE students (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  city TEXT NOT NULL,
  mentor_of INTEGER REFERENCES students(id)
);

CREATE TABLE edits (
  id SERIAL PRIMARY KEY,
  student_id INTEGER REFERENCES students(id) NOT NULL,
  name TEXT,
  city TEXT,
  mentor_of INTEGER REFERENCES students(id)
);

Each row in the edits table references some student, the columns name, city and mentor_of give the new value to be applied to the respective column in the student table. A NULL value indicates that the field was not touched.

This works beautifully, but I lack a way to express that a column should be set to NULL. I.e. there is no way to have an edit row which indicates that the duplicate_of column of some student is to be set to NULL.

I considered adjusting the edits table such that each column is actually an array of one element. Thus, NULL indicates that e.g. the name was not touched, {NULL} indicates that the name was set to NULL and {Jill} indicates that the name was set to Jill. Unfortunately, arrays don't support foreign key constraints (yet?), so using this for the mentor_of column means weakening referential integrity.

I also looked into defining a composite type (like maybe_student_id which features one boolean field to indicate set/unset and the other field gives the new value to use). Alas, composite types don't seem to support foreign keys either.

Is there a way to have a (preferably generic, i.e. not type-specific) way to get an additional sentinel value on top of NULL?


Solution

  • You can add flags (as boolean columns) to indicate the changes. This way nulls will be applied as necessary. For example:

    CREATE TABLE edits (
      id int not null PRIMARY KEY generated by default as identity,
      student_id INTEGER not null REFERENCES students(id),
      name varchar(100),
      city varchar(50),
      mentor_of INTEGER REFERENCES students(id),
      name_modified boolean, -- added
      city_modified boolean, -- added
      mentor_of_modified boolean -- added
    );
    

    Of course that pollutes the design. Other option is to consolitate all flags into a boolean array, as in:

    CREATE TABLE edits (
      id int not null PRIMARY KEY generated by default as identity,
      student_id INTEGER not null REFERENCES students(id),
      name varchar(100),
      city varchar(50),
      mentor_of INTEGER REFERENCES students(id),
      modified boolean[] -- position 0 is for name, 1 for city, etc.
    );
    

    Another option is to consolidate everything into a single JSON column that indicates the changes. For example:

    CREATE TABLE edits (
      id int not null PRIMARY KEY generated by default as identity,
      student_id INTEGER not null REFERENCES students(id),
      changes json
    );
    

    In this case if changes has the value {"name":"abc","city":null} that means that name and city are to be modified while mentor_of (not mentioned) should not be touched.