postgresqlsequence

In-order sequence generation


Is there a way to generate some kind of in-order identifier for a table records?

Suppose that we have two threads doing queries:

Thread 1:

begin;
insert into table1(id, value) values (nextval('table1_seq'), 'hello');
commit;

Thread 2:

begin;
insert into table1(id, value) values (nextval('table1_seq'), 'world');
commit;

It's entirely possible (depending on timing) that an external observer would see the (2, 'world') record appear before the (1, 'hello').

That's fine, but I want a way to get all the records in the 'table1' that appeared since the last time the external observer checked it.

So, is there any way to get the records in the order they were inserted? Maybe OIDs can help?


Solution

  • No. Since there is no natural order of rows in a database table, all you have to work with is the values in your table.

    Well, there are the Postgres specific system columns cmin and ctid you could abuse to some degree.

    The tuple ID (ctid) contains the file block number and position in the block for the row. So this represents the current physical ordering on disk. Later additions will have a bigger ctid, normally. Your SELECT statement could look like this

    SELECT *, ctid   -- save ctid from last row in last_ctid
    FROM   tbl
    WHERE  ctid > last_ctid
    ORDER  BY ctid
    

    ctid has the data type tid. Example: '(0,9)'::tid

    However it is not stable as long-term identifier, since VACUUM or any concurrent UPDATE or some other operations can change the physical location of a tuple at any time. For the duration of a transaction it is stable, though. And if you are just inserting and nothing else, it should work locally for your purpose.

    Careful with table inheritance or partitioning. Then there can be multiple physical tables involved and ctid is not unique within the scope. See:

    I would add a timestamp column with default now() in addition to the serial column ...

    I would also let a column default populate your id column (a serial or IDENTITY column). That retrieves the number from the sequence at a later stage than explicitly fetching and then inserting it, thereby minimizing (but not eliminating) the window for a race condition - the chance that a lower id would be inserted at a later time. Detailed instructions: