pythonpandasdataframedatetimestring-to-datetime

How a find an error occurring in rows in dataframe in Python?


 df["Dt_Customer"] = pd.to_datetime(df["Dt_Customer"],format='%d-%m-%y')

I have tried to convert the date column, the dataset contains more than 1 million rows... I have to find the date rows which are not getting converted.

TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-124-d701d963ff8c> in <module>
 ----> 1 df["Dt_Customer"] = pd.to_datetime(df["Dt_Customer"],format='%d-%m-%y')

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\tools\datetimes.py 
in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, 
origin, cache)
803             result = arg.map(cache_array)
804         else:
--> 805             values = convert_listlike(arg._values, format)
806             result = arg._constructor(values, index=arg.index, name=arg.name)
807     elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\tools\datetimes.py 
in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, 
yearfirst, exact)
458                 return DatetimeIndex._simple_new(dta, name=name)
459             except (ValueError, TypeError):
--> 460                 raise e
461 
462     if result is None:

c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\tools\datetimes.py 
in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, 
yearfirst, exact)
421             if result is None:
422                 try:
--> 423                     result, timezones = array_strptime(
424                         arg, format, exact=exact, errors=errors
425                     )

pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()

ValueError: unconverted data remains: 12

Solution

  • you could try a loop with try and except:

    causing_error_list = []
    for x in df["Dt_Customer"].values:
        try:
            pd.to_datetime(x,format='%d-%m-%y')
        except:
            causing_error_list.append(x)