pythonpandasdatetimepython-datetimepandas-settingwithcopy-warning

Issues updating a column with pd.to_datetime()


I am having some issues updating a column within my dataframe. The data I receive is data/time so I want to format it according to pd.to_datetime. But the change to DST started to create some issues.

My data looks as follow:

0     2023-10-08T19:44:29.862-04:00
1     2023-10-09T01:06:42.762-04:00
2     2023-10-12T18:35:09.060-04:00
3     2023-10-12T18:55:02.459-04:00
4     2023-10-17T23:56:28.330-04:00
5     2023-10-18T02:51:33.481-04:00
6     2023-10-24T15:59:30.074-04:00
7     2023-10-24T18:00:55.081-04:00
8     2023-11-03T15:59:30.021-04:00
9     2023-11-06T15:59:30.824-05:00
10    2023-11-07T15:59:32.386-05:00
11    2023-11-09T15:59:30.032-05:00
12    2023-11-10T15:59:57.509-05:00
14    2023-11-13T15:59:31.065-05:00
Name: Time, dtype: object

I want to have it as a dtype:datetime so I can manipulate it (change timezone) easily.

The operations I want to do are as follow:

df.Time = pd.to_datetime(df.Time, errors='coerce',utc=True)
df.Time = df.Time.dt.tz_convert("Australia/Sydney")
df.Time = pd.to_datetime(df.Time) - pd.Timedelta(8, "H")

The error Warning I get from this is: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

But if changing df.Time to df.loc[:, "Time"] The code does not compute and has this error for the second line: AttributeError: Can only use .dt accessor with datetimelike values

I also tried to change df.Time to df.["Time"] or adding .copy() at the end of sentences but still had the same error.

Is there something I am missing? The original code does exactly what I want but throws at me a warning and when trying to solve the warning, it creates an error.


Solution

  • Works perfectly fine with pandas v2; minimal reproducible example below. Note that you can simplify marginally by removing the second call to pd.to_datetime.

    import pandas as pd
    
    df = pd.DataFrame(
        {
            "Time": [
                "2023-10-24T18:00:55.081-04:00",
                "2023-11-03T15:59:30.021-04:00",
                "2023-11-06T15:59:30.824-05:00",
                "2023-11-07T15:59:32.386-05:00",
                "2023-11-09T15:59:30.032-05:00",
            ],
        }
    )
    
    df["Time"] = pd.to_datetime(df["Time"], errors="coerce", utc=True)
    df["Time"] = df.Time.dt.tz_convert("Australia/Sydney")
    df["Time"] -= pd.Timedelta(8, "H")
    
    df["Time"]
    0   2023-10-25 01:00:55.081000+11:00
    1   2023-11-03 22:59:30.021000+11:00
    2   2023-11-06 23:59:30.824000+11:00
    3   2023-11-07 23:59:32.386000+11:00
    4   2023-11-09 23:59:30.032000+11:00
    Name: Time, dtype: datetime64[ns, Australia/Sydney]
    
    pd.__version__
    '2.1.2'