I have a pandas dataframe containing sensor data with timestamp columns
id time temp humi
min max mean std mean std
0 1.0 2023-03-29 14:24:12-03:00 2023-03-29 14:24:25-03:00 19.9 0.0 84.73 0.06
Where the columns ('time', 'min')
and ('time', 'max')
are timezone-aware, e.g. 2023-03-29 14:24:12-03:00
When I load this dataframe to a PostgreSQL database using the to_sql
method like this:
dataframe.to_sql(table, engine, if_exists='append', index=False)
I noticed that the timezone information is not being preserved in the database. For instance, the value 2023-03-29 14:24:12-03:00
is being stored as 2023-03-29 17:24:12+00
The table in the database is created with the following SQL:
CREATE TABLE IF NOT EXISTS public.sensors
(
"('id', '')" smallint,
"('time', 'min')" timestamp with time zone,
"('time', 'max')" timestamp with time zone,
"('temp', 'mean')" float4,
"('temp', 'std')" float4,
"('humi', 'mean')" float4,
"('humi', 'std')" float4
)
I suspect that the problem is related to the use of to_sql
method rather than Postgres or the dataframe, but I'm not sure how to fix it.
How can I ensure that the timezone information is properly stored in the database?
I try:
dataframe.to_sql(table, engine, if_exists='append', index=False)
But:
I noticed that the timezone information is not being preserved in the database. For instance, the value 2023-03-29 14:24:12-03:00
is being stored as 2023-03-29 17:24:12+00
As Belayer commented on my question he/she is right!
Postgres is storing the time into UTC, and both hours I posted originaly are the same hour one is local and the other is UTC, so there is actualy no problem at all, I was the one that didn't know that postgres doesn't store the timezone and didn't realize that everything was perfectly fine.
Thank you Belayer!