I have to import data without time zone information in it (however, I know the specific time zone of the data I want to import), but I need the timestamp with time zone
format in the database. Once I import it and set the timestamp data type to timestamp with time zone
, Postgres will automatically assume that the data in the table is from my time zone and assign my time zone to it. Unfortunately the data I want to import is not from my time frame, so this does not work.
The database also contains data with different time zones. However, the time zone within one table is always the same.
Now, I could set the time zone of the database to the time zone of the data I want to import before importing the data (using SET time zone
command) and change it back to my time zone once the import is done, and I am pretty sure already stored data will not be affected by the time zone change of the database. But this seems to be a pretty dirty approach and may cause problems later on.
I wonder if there is a more elegant way to specify the time zone for the import without having the time zone data in the data itself?
Also, I have not found a way to edit time zone information after import. Is there a way not to convert, but simply to edit the time zone for a whole table, assuming that the whole table has the same time zone offset (i.e. if a wrong one has been assigned upon data entry/import)?
Edit:
I managed to specify a time zone upon import, the whole command being:
set session time zone 'UTC';
COPY tbl FROM 'c:\Users\Public\Downloads\test.csv' DELIMITERS ',' CSV;
set session time zone 'CET';
The data then gets imported using the session time zone. I assume this has no effect on any other queries on the database at the same time from other connections?
Edit 2:
I found out how to change the time zone of a table afterwards:
PostgreSQL update time zone offset
I suppose it is more elegant to change the time zone of the table after import then to use session to change the local time zone temporary. Assuming the whole table has the same time zone of course.
So the code would be now something along the line of:
COPY tbl FROM 'c:\Users\Public\Downloads\test.csv' DELIMITERS ',' CSV;
UPDATE tbl SET <tstz_field> = <tstz_field> AT TIME ZONE '<correct_time_zone>';
It is a lot more efficient to set the time zone for your import session than to update the values later.
I get the impression that you think of the time zone like a setting that applies to otherwise unchanged values in the tables. But it's not like that at all. Think of it as an input / output modifier. Actual timestamp
values (with or without time zone) are always stored as UTC timestamps internally (number of seconds since '2000-01-01 00:00'
). See:
The UPDATE
in your second example doubles the size of the table, as every single row is invalidated and a new version added (that's how UPDATE
works with MVCC in Postgres). In addition to the expensive operation, VACUUM
will have to do more work later to clean up table bloat. Very inefficient.
It is perfectly safe to SET
the local time zone for the session. This doesn't affect concurrent operations in any way. BTW, SET SESSION
is the same as plain SET
- SESSION
is the default anyway.
If you want to be absolutely sure, you can limit the setting to the current transaction with SET LOCAL
. The manual
The effects of
SET LOCAL
last only till the end of the current transaction, whether committed or not. A special case isSET
followed bySET LOCAL
within a single transaction: theSET LOCAL
value will be seen until the end of the transaction, but afterwards (if the transaction is committed) theSET
value will take effect.
Put together:
BEGIN;
SET LOCAL timezone = 'UTC';
COPY tabledata FROM 'c:\Users\Public\Downloads\test.csv' DELIMITERS ',' CSV;
COMMIT;
Check:
SHOW timezone;