postgresqldatabase-designprimary-keysurrogate-key

What are the disadvantages in using IDs suffixed by a table identifying number?


I'm playing with:

DROP TABLE users CASCADE;
CREATE SEQUENCE users_id_seq MINVALUE 100;
CREATE TABLE users (
    id INTEGER NOT NULL PRIMARY KEY DEFAULT nextval('users_id_seq') * 100 + 1,
    ...
);
ALTER SEQUENCE users_id_seq OWNED BY users.id;

DROP TABLE txns CASCADE;
CREATE SEQUENCE txns_seq MINVALUE 100;
CREATE TABLE txns (
    id INTEGER NOT NULL PRIMARY KEY DEFAULT nextval('txns_seq') * 100 + 2,
    ...
);
ALTER SEQUENCE txns_seq OWNED BY txns.id;
...

To my mind, this has the following benefits:

Disadvantages:

What other disadvantages haven't I considered?


Solution

  • As has been commented, there are good reasons not to go there. Canonically, a surrogate PK is a unique identifier for rows within its table without carrying additional information.

    I have not had cases of "wrong-table errors" or "confusion with iteration counters", yet. Reducing the key space of an integer (int4) PK by 99% may come around to bite you later, as well as limiting yourself to a maximum of 100 tables. If you start relying on the meaning of the number (which I would advise against), you better make sure that new tables stick to the rules. And nobody inserts manual ID values or alters ID values ...

    That said, if you go there, consider:

    CREATE SEQUENCE users_id_seq MINVALUE 1 INCREMENT 100;  -- ①
    CREATE TABLE users (
        users_id bigint PRIMARY KEY DEFAULT nextval('users_id_seq'), -- ②
        ...
    );
    ALTER SEQUENCE users_id_seq OWNED BY users.id;
    
    CREATE SEQUENCE txns_id_seq MINVALUE 2 INCREMENT 100;
    CREATE TABLE txns (
        txns_id bigint PRIMARY KEY DEFAULT nextval('txns_id_seq'),
        ...
    );
    ALTER SEQUENCE txns_id_seq OWNED BY txns.id;
    

    ① Use MINVALUE 1 INCREMENT 100, MINVALUE 2 INCREMENT 100 etc. instead of manual calculation.

    ② Use bigint to more than make up for the reduced key space.
    Also, it's odd that you'd put so much effort into unique IDs across the DB, and then use non-descriptive, duplicative column names like id. txns_id is a better naming convention.

    Downside: more susceptible to setval() or ALTER SEQUENCE breaking the regime. Revoke the UPDATE privilege on sequences from all plain users.