pythonpandasdatetimeutcpython-datetime

Converting UTC Column into datetime in python pandas


I would like to ask for little support. I have here a python frame containing data giving in UTC format. I would like to transform the column into date-format.

Order Date
15-Feb-2024 UTC
17-Feb-2024 UTC
18-Feb-2024 UTC
02-Apr-2024 UTC
05-Mar-2024 UTC
04-Mar-2024 UTC
11-Apr-2024 UTC
12-Apr-2024 UTC
16-Mar-2024 UTC
04-Apr-2024 UTC
05-Feb-2024 UTC
05-Mar-2024 UTC
14-Apr-2024 UTC


df["Order Date"]=pd.to_datetime(df["Order Date"],utc=True,format='%d-%b-%Y')

Applying the line above gives me the following error

time data "15-Feb-2024 UTC" doesn't match format "%d-%b-%Y", at position 0. You might want to try:
    - passing `format` if your strings have a consistent format;
    - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
    - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.

I tried all the options, without success. Can someone tell me where the issue is? All I need is to transform the column to a date column. I'd be grateful for any support


Solution

  • Example Code

    import pandas as pd
    data = {'Order Date': ['15-Feb-2024 UTC', '17-Feb-2024 UTC', '18-Feb-2024 UTC', '02-Apr-2024 UTC', '05-Mar-2024 UTC']}
    df = pd.DataFrame(data)
    

    df

        Order Date
    0   15-Feb-2024 UTC
    1   17-Feb-2024 UTC
    2   18-Feb-2024 UTC
    3   02-Apr-2024 UTC
    4   05-Mar-2024 UTC
    

    Code

    if you want convert to datetime64[ns, UTC]

    out = pd.to_datetime(df["Order Date"], utc=True, format='%d-%b-%Y UTC')
    

    out:

    0   2024-02-15 00:00:00+00:00
    1   2024-02-17 00:00:00+00:00
    2   2024-02-18 00:00:00+00:00
    3   2024-04-02 00:00:00+00:00
    4   2024-03-05 00:00:00+00:00
    Name: Order Date, dtype: datetime64[ns, UTC]
    

    or if you want convert to datetime[ns]

    out = pd.to_datetime(df['Order Date'].str.replace(' UTC', ''))
    

    out:

    0   2024-02-15
    1   2024-02-17
    2   2024-02-18
    3   2024-04-02
    4   2024-03-05
    Name: Order Date, dtype: datetime64[ns]