I am having some issues updating a column within my dataframe. The data I receive is data/time so I want to format it according to pd.to_datetime. But the change to DST started to create some issues.
My data looks as follow:
0 2023-10-08T19:44:29.862-04:00
1 2023-10-09T01:06:42.762-04:00
2 2023-10-12T18:35:09.060-04:00
3 2023-10-12T18:55:02.459-04:00
4 2023-10-17T23:56:28.330-04:00
5 2023-10-18T02:51:33.481-04:00
6 2023-10-24T15:59:30.074-04:00
7 2023-10-24T18:00:55.081-04:00
8 2023-11-03T15:59:30.021-04:00
9 2023-11-06T15:59:30.824-05:00
10 2023-11-07T15:59:32.386-05:00
11 2023-11-09T15:59:30.032-05:00
12 2023-11-10T15:59:57.509-05:00
14 2023-11-13T15:59:31.065-05:00
Name: Time, dtype: object
I want to have it as a dtype:datetime so I can manipulate it (change timezone) easily.
The operations I want to do are as follow:
df.Time = pd.to_datetime(df.Time, errors='coerce',utc=True)
df.Time = df.Time.dt.tz_convert("Australia/Sydney")
df.Time = pd.to_datetime(df.Time) - pd.Timedelta(8, "H")
The error Warning I get from this is: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
But if changing df.Time to df.loc[:, "Time"] The code does not compute and has this error for the second line: AttributeError: Can only use .dt accessor with datetimelike values
I also tried to change df.Time to df.["Time"] or adding .copy() at the end of sentences but still had the same error.
Is there something I am missing? The original code does exactly what I want but throws at me a warning and when trying to solve the warning, it creates an error.
Works perfectly fine with pandas v2; minimal reproducible example below. Note that you can simplify marginally by removing the second call to pd.to_datetime.
import pandas as pd
df = pd.DataFrame(
{
"Time": [
"2023-10-24T18:00:55.081-04:00",
"2023-11-03T15:59:30.021-04:00",
"2023-11-06T15:59:30.824-05:00",
"2023-11-07T15:59:32.386-05:00",
"2023-11-09T15:59:30.032-05:00",
],
}
)
df["Time"] = pd.to_datetime(df["Time"], errors="coerce", utc=True)
df["Time"] = df.Time.dt.tz_convert("Australia/Sydney")
df["Time"] -= pd.Timedelta(8, "H")
df["Time"]
0 2023-10-25 01:00:55.081000+11:00
1 2023-11-03 22:59:30.021000+11:00
2 2023-11-06 23:59:30.824000+11:00
3 2023-11-07 23:59:32.386000+11:00
4 2023-11-09 23:59:30.032000+11:00
Name: Time, dtype: datetime64[ns, Australia/Sydney]
pd.__version__
'2.1.2'