I'm using PostgreSQL 14 to maintain a table which acts as an append-only ledger of changes done to a students
table:
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
city TEXT NOT NULL,
mentor_of INTEGER REFERENCES students(id)
);
CREATE TABLE edits (
id SERIAL PRIMARY KEY,
student_id INTEGER REFERENCES students(id) NOT NULL,
name TEXT,
city TEXT,
mentor_of INTEGER REFERENCES students(id)
);
Each row in the edits
table references some student, the columns name
, city
and mentor_of
give the new value to be applied to the respective column in the student
table. A NULL
value indicates that the field was not touched.
This works beautifully, but I lack a way to express that a column should be set to NULL. I.e. there is no way to have an edit
row which indicates that the duplicate_of
column of some student is to be set to NULL
.
I considered adjusting the edits
table such that each column is actually an array of one element. Thus, NULL
indicates that e.g. the name was not touched, {NULL}
indicates that the name was set to NULL
and {Jill}
indicates that the name was set to Jill
. Unfortunately, arrays don't support foreign key constraints (yet?), so using this for the mentor_of
column means weakening referential integrity.
I also looked into defining a composite type (like maybe_student_id
which features one boolean field to indicate set/unset and the other field gives the new value to use). Alas, composite types don't seem to support foreign keys either.
Is there a way to have a (preferably generic, i.e. not type-specific) way to get an additional sentinel value on top of NULL
?
You can add flags (as boolean columns) to indicate the changes. This way nulls will be applied as necessary. For example:
CREATE TABLE edits (
id int not null PRIMARY KEY generated by default as identity,
student_id INTEGER not null REFERENCES students(id),
name varchar(100),
city varchar(50),
mentor_of INTEGER REFERENCES students(id),
name_modified boolean, -- added
city_modified boolean, -- added
mentor_of_modified boolean -- added
);
Of course that pollutes the design. Other option is to consolitate all flags into a boolean array, as in:
CREATE TABLE edits (
id int not null PRIMARY KEY generated by default as identity,
student_id INTEGER not null REFERENCES students(id),
name varchar(100),
city varchar(50),
mentor_of INTEGER REFERENCES students(id),
modified boolean[] -- position 0 is for name, 1 for city, etc.
);
Another option is to consolidate everything into a single JSON column that indicates the changes. For example:
CREATE TABLE edits (
id int not null PRIMARY KEY generated by default as identity,
student_id INTEGER not null REFERENCES students(id),
changes json
);
In this case if changes
has the value {"name":"abc","city":null}
that means that name
and city
are to be modified while mentor_of
(not mentioned) should not be touched.