I have a dataframe with a column in it -
date_col
2024-05-12T17:46:50.746922-07:00
2024-05-12T17:31:35.438304-07:00
2024-05-12T17:46:50.642095-07:00
2024-05-12T17:02:02.299320-07:00
I tried the code below -
df['updated'] = datetime.fromisoformat(str(df['date_col'])).astimezone(timezone.utc).isoformat(timespec="milliseconds")
But it's giving an error -
TypeError: fromisoformat: argument must be str
print(type(df['date_col'])) gives <class 'pandas.core.series.Series'>
print(df.dtypes) gives date_col object
Expected output is in form of - 2024-05-13T00:46:50.746Z
Any help is appreciated.
I'd try something like this:
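import pandas as pd
# Parse the strings and convert them to UTC in one step
df['date_col'] = pd.to_datetime(df['date_col'], utc=True)
# Format as ISO 8601; %f prints microseconds (six digits), so trim the last three to get milliseconds
df['updated'] = df['date_col'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f').str[:-3] + 'Z'
Some explanation:
pd.to_datetime with utc=True parses the offset-aware strings and converts them to UTC in one step. strftime's %f always prints six digits, which is why the last three are sliced off before appending Z. After running this code, the updated column in your DataFrame should contain the expected output format, e.g. 2024-05-13T00:46:50.746Z.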
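Here is a small self-contained sketch of the same idea that you can run to check the result. The sample DataFrame is built from two of the values in the question, and it writes to a separate Series instead of overwriting date_col; both choices are just for illustration.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "date_col": [
        "2024-05-12T17:46:50.746922-07:00",
        "2024-05-12T17:31:35.438304-07:00",
    ]
})

# Parse the offset-aware strings and normalize them to UTC
utc = pd.to_datetime(df["date_col"], utc=True)

# %f prints six digits (microseconds); drop the last three to keep milliseconds
df["updated"] = utc.dt.strftime("%Y-%m-%dT%H:%M:%S.%f").str[:-3] + "Z"

print(df["updated"].tolist())
# ['2024-05-13T00:46:50.746Z', '2024-05-13T00:31:35.438Z']
```

Note that the slice truncates the fractional seconds rather than rounding them, which matches the expected output in the question.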
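Note: utc=True converts strings that carry an offset (like -07:00) to UTC. If the strings in your date_col have no offset at all, they are treated as already being in UTC, so you'll need to localize them to the correct timezone first (for example with .dt.tz_localize) before converting.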
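If you'd rather stay close to the datetime.fromisoformat attempt from the question, it has to be called on one string at a time, not on the whole Series. A minimal sketch, where to_utc_millis is a hypothetical helper name and the input is assumed to always carry an offset:

```python
from datetime import datetime, timezone


def to_utc_millis(value: str) -> str:
    """Convert one offset-aware ISO 8601 string to UTC with millisecond precision."""
    # Assumption: value includes an offset; a naive value would be read as local time
    dt = datetime.fromisoformat(value).astimezone(timezone.utc)
    # isoformat() writes the UTC offset as "+00:00"; swap it for the "Z" suffix
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_utc_millis("2024-05-12T17:46:50.746922-07:00"))
# 2024-05-13T00:46:50.746Z
```

On the DataFrame this would be applied per row with df['date_col'].apply(to_utc_millis). It is likely slower than the vectorized pd.to_datetime version on large frames, but it avoids the TypeError because each call receives a plain str.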