I am using psql with a PostgreSQL database and the following copy command:
\COPY isa (np1, np2, sentence) FROM 'c:\Downloads\isa.txt' WITH DELIMITER '|'
I get:
ERROR: extra data after last expected column
How can I skip the lines with errors?
\copy is just a wrapper around SQL COPY that channels results through psql.
ON_ERROR ignore now allows to skip error rows. Changes are summed up in the release notes of Postgres 17.
The Postgres 17 manual on COPY
By default,
COPYwill fail if it encounters an error during processing. For use cases where a best-effort attempt at loading the entire file is desired, theON_ERRORclause can be used to specify some other behavior.
And:
ON_ERRORSpecifies how to behave when encountering an error converting a column's input value into its data type. An
error_actionvalue ofstopmeans fail the command, whileignoremeans discard the input row and continue with the next one. The default isstop.The
ignoreoption is applicable only forCOPY FROMwhen theFORMATistextorcsv.A
NOTICEmessage containing the ignored row count is emitted at the end of theCOPY FROMif at least one row was discarded. WhenLOG_VERBOSITYoption is set to verbose, aNOTICEmessage containing the line of the input file and the column name whose input conversion has failed is emitted for each discarded row.
To run \copy in psql (on your Windows server) discarding error rows:
\copy isa (np1, np2, sentence) FROM 'c:\Downloads\isa.txt' (DELIMITER '|', ON_ERROR ignore);
You cannot skip the errors without skipping the whole command. There is no more sophisticated error handling.
The Postgres 16 manual for COPY:
COPYstops operation at the first error. This should not lead to problems in the event of aCOPY TO, but the target table will already have received earlier rows in aCOPY FROM. These rows will not be visible or accessible, but they still occupy disk space. This might amount to a considerable amount of wasted disk space if the failure happened well into a large copy operation. You might wish to invokeVACUUMto recover the wasted space.
Bold emphasis mine. And:
COPY FROMwill raise an error if any line of the input file contains more or fewer columns than are expected.
There was an attempt to add error logging to COPY in Postgres 9.0 but it was never committed.
Fix your input file instead.
If you have one or more additional columns in your input file and the file is otherwise consistent, you might add dummy columns to your table isa and drop those afterwards. Or (cleaner with production tables) import to a temporary staging table and INSERT selected columns (or expressions) to your target table isa from there.
Related answers with detailed instructions: