postgresqllibpqxx

PostgreSQL: Return auto-generated ids from COPY FROM insertion


I have a non-empty PostgreSQL table with a GENERATED ALWAYS AS IDENTITY column id. I do a bulk insert with the C++ binding pqxx::stream_to, which I'm assuming uses COPY FROM. My problem is that I want to know the ids of the newly created rows, but COPY FROM has no RETURNING clause. I see several possible solutions, but I'm not sure if any of them is good, or which one is the least bad:

I would assume that this is a common situation, yet I don't see an obviously correct solution. What do you recommend?


Solution

  • You can find out which rows have been affected by your current transaction using the system columns. The xmin column contains the ID of the inserting transaction, so to return the id values you just copied, you could:

    BEGIN;
    
    COPY foo(col1,col2) FROM STDIN;
    
    SELECT id FROM foo
    WHERE xmin::text = (txid_current() % (2^32)::bigint)::text
    ORDER BY id;
    
    COMMIT;
    

    The WHERE clause comes from this answer, which explains the reasoning behind it.

    I don't think there's any way to optimise this with an index, so it might be too slow on a large table. If so, I think your second option would be the way to go, i.e. stream into a temp table and INSERT ... RETURNING.