I get this error when trying to append one Pandas DataFrame to another in a for loop:
`Aggdata=Aggdata.append(Newdata)`
This is the full error:
File "pandas\tslib.pyx", line 4096, in pandas.tslib.tz_localize_to_utc (pandas\tslib.c:69713)
pytz.exceptions.NonExistentTimeError: 2017-03-12 02:01:24
However, my files contain no such timestamp. They do contain ones like 03/12/17 00:45:26 or 03/12/17 00:01:24, which fall two hours before the daylight saving time transition. If I manually delete the offending row, I get the same error for the next row with a time between midnight and 1 am on the 12th of March.
My original date/time column has no TZ info, but before the concatenation I calculate another column and localize it to US/Eastern, so that it carries TZ information:
`data['EST_DateTimeStamp']=pd.DatetimeIndex(pd.to_datetime(data['myDate'])).tz_localize('US/Eastern').tz_convert('US/Eastern')`
From some research here, I understand that times between 2 and 3 am on the 12th should raise this error, but why times between midnight and 1 am? Am I localizing incorrectly? And why is the error raised on the append line rather than on the localization line?
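For comparison, here is a minimal sketch of how localization would be expected to behave around that date. It assumes a pandas version that supports the `nonexistent` argument of `tz_localize` (0.24 or later). The two midnight-hour strings are the ones quoted above; the variable names are only illustrative. The exact exception class can differ between pandas versions (older ones raise `pytz.exceptions.NonExistentTimeError`, as in the traceback), so the sketch catches broadly.

```python
import pandas as pd

# Rows like the ones in my files: between midnight and 1 am on March 12, 2017.
naive = pd.to_datetime(
    pd.Series(["03/12/17 00:01:24", "03/12/17 00:45:26"]),
    format="%m/%d/%y %H:%M:%S",
)

# These wall-clock times exist in US/Eastern, so localizing should succeed
# and both rows get the standard-time offset (-05:00).
aware = naive.dt.tz_localize("US/Eastern")
print(aware)

# The timestamp from the error message falls in the skipped hour (2 to 3 am).
gap = pd.Series(pd.to_datetime(["2017-03-12 02:01:24"]))
try:
    gap.dt.tz_localize("US/Eastern")
    raised = False
except Exception as exc:  # exception class depends on the pandas version
    raised = True
    print(type(exc).__name__, exc)

# One way to handle a genuinely nonexistent time: move it to the first
# valid instant after the gap instead of raising.
shifted = gap.dt.tz_localize("US/Eastern", nonexistent="shift_forward")
print(shifted)
```

If the first `tz_localize` call succeeds on its own like this, the localization step is probably not where the 02:01:24 value comes from.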
I was able to reproduce this behavior in a very simple MCVE, saved here: https://codeshare.io/GLjrLe
It absolutely boggles my mind that the error is raised on the third append, and only if the next three appends follow it. In other words, if I comment out the last three appends, it works fine. I can't imagine what is happening.
Thank you for reading.
In case someone else still finds this helpful:
After talking it over with @hashcode55, the solution was to upgrade Pandas on my server; this was most likely a bug in the older version I had installed.
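For anyone trying this on a current install, a rough sketch of the same pattern follows. `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the sketch uses `pd.concat` instead. The `load_chunk` helper and the sample rows are hypothetical stand-ins for my files, not my original code.

```python
import pandas as pd

def load_chunk(stamps):
    """Hypothetical stand-in for reading one file: builds the naive column
    and the localized US/Eastern column the same way as in the question."""
    data = pd.DataFrame({"myDate": stamps})
    data["EST_DateTimeStamp"] = pd.to_datetime(
        data["myDate"], format="%m/%d/%y %H:%M:%S"
    ).dt.tz_localize("US/Eastern")
    return data

# Illustrative rows on either side of the DST transition (none inside the gap).
chunks = [
    load_chunk(["03/11/17 23:45:26", "03/12/17 00:01:24"]),
    load_chunk(["03/12/17 00:45:26", "03/12/17 03:15:00"]),
]

# Collect the pieces in a list and concatenate once, instead of appending
# inside the loop.
Aggdata = pd.concat(chunks, ignore_index=True)

print(pd.__version__)                      # check which version is installed
print(Aggdata["EST_DateTimeStamp"].dt.tz)  # the time zone survives the concat
```

Building the list first and calling `pd.concat` once also avoids copying the growing frame on every loop iteration.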