pythonpandas

pandas conert date failed - pd.to_datetime(df['xxx'], format='%Y-%m-%d').dt.date


I am facing one little problem. I am storing some date time data and the data is

#secCode,secName,announcementTitle,announcementId,announcementTime

003816,xxx name,2024report,1222913141,1743004800000

the date time column is clearly string - 1743004800000, so when I try to convert it

df['announcementTime'] = pd.to_datetime(df['announcementTime'], format='%Y-%m-%d').dt.date

I got this error

  File "/root/miniconda3/lib/python3.12/site-packages/pandas/core/tools/datetimes.py", line 1072, in to_datetime
    values = convert_listlike(arg._values, format)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/pandas/core/tools/datetimes.py", line 435, in _convert_listlike_datetimes
    return _array_strptime_with_fallback(arg, name, utc, format, exact, errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: time data "1743004800000" doesn't match format "%Y-%m-%d", at position 0. You might want to try:
    - passing `format` if your strings have a consistent format;
    - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
    - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.
    df['announcementTime'] = pd.to_datetime(df['announcementTime'], format='%Y-%m-%d')
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/pandas/core/tools/datetimes.py", line 1072, in to_datetime

Solution

  • Example

    example for reproducing

    import pandas as pd
    df = pd.DataFrame({
        "announcementTime": [1743004800000, 1753004800000]
    })
    

    df

       announcementTime
    0     1743004800000
    1     1753004800000
    

    Answer

    Think of your input timestamp data as a number in ms, not '%Y-%m-%d' format

    # use unit='ms' because the timestamp is in milliseconds
    df["announcementTime"] = pd.to_datetime(df["announcementTime"], unit="ms")
    
    # If you just want the date, use .dt.date
    df["announcementDate"] = df["announcementTime"].dt.date
    

    df

         announcementTime announcementDate
    0 2025-03-26 16:00:00       2025-03-26
    1 2025-07-20 09:46:40       2025-07-20
    

    If it's not in ms, modify it to the appropriate unit.