python pandas dataframe datetime

Read a CSV into a pandas DataFrame with a timestamp column and convert it to ISO format


I have a CSV file that I read into a pandas DataFrame. The time column is stored as days since 01-01-1900. The aim is to get an ISO-formatted column with the "T" between date and time.

The csv-file looks like:

DateTime,column1,column2,column3,column4,column5
43621.6210327662,1.234,2.665,0.888,3.891,3.762
43621.6210445023,2.232,1.233,0.888,3.789,3.524

My code to read in is:

import pandas as pd
df = pd.read_csv('myfile.csv')
df

and the dataframe contains days since 01-01-1900:

    DateTime    column1     column2     column3     column4     column5
0   43621.621033    1.234   2.665   0.888   3.891   3.762
1   43621.621045    2.232   1.233   0.888   3.789   3.524

Now I convert the days-formatted date to a more common look:

from datetime import datetime, timedelta
df['DateTime'] = pd.to_datetime('1900-01-01') + pd.to_timedelta(df['DateTime'], unit='D')
df
    DateTime    column1     column2     column3     column4     column5
0   2019-06-07 14:54:17.230999464   1.234   2.665   0.888   3.891   3.762
1   2019-06-07 14:54:18.244998936   2.232   1.233   0.888   3.789   3.524

My aim is to have an ISO-formatted DateTime column that has the "T" between date and time, like:

    DateTime    column1     column2     column3     column4     column5
0   2019-06-07T14:54:17.230999464   1.234   2.665   0.888   3.891   3.762
1   2019-06-07T14:54:18.244998936   2.232   1.233   0.888   3.789   3.524

But the DateTime column seems to be a Series rather than a single date? I can't apply something like:

df['DateTime'].isoformat()

How would I proceed?


Solution

  • That is correct: the DateTime column is a Series, not a date. You are calling isoformat() on the entire column df['DateTime'], and a Series has no such method.

    Each column of a DataFrame is a Series. Series have methods to deal directly with dates that you can access with .dt. Alternatively, you can apply a function to every element of the series with apply or map.

    It is not clear whether you want to store the DateTime values as strings in ISO format, or whether you only want to display the dataframe in this format.

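    If you want to hold a column with the dates converted to strings, you could do one of these two. Note that %f in strftime emits microseconds (six digits), while isoformat() on a pandas Timestamp keeps the nanoseconds when they are nonzero: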

    df['DateTime1'] = df['DateTime'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')
    
    # OR
    df['DateTime2'] = df['DateTime'].apply(lambda d: d.isoformat())
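
To illustrate how the two options differ in precision, here is a minimal sketch. It assumes a single hand-written timestamp (copied from the desired output in the question) instead of the real CSV, and the column names iso_strftime and iso_isoformat are hypothetical.

```python
import pandas as pd

# One hand-written value with nanosecond precision (from the question's desired output).
df = pd.DataFrame({'DateTime': pd.to_datetime(['2019-06-07 14:54:17.230999464'])})

# Vectorised formatting via the .dt accessor; %f emits six digits (microseconds).
df['iso_strftime'] = df['DateTime'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')

# Element-wise call of Timestamp.isoformat(); nonzero nanoseconds are kept.
df['iso_isoformat'] = df['DateTime'].apply(lambda d: d.isoformat())

print(df['iso_strftime'].iloc[0])   # 2019-06-07T14:54:17.230999
print(df['iso_isoformat'].iloc[0])  # 2019-06-07T14:54:17.230999464
```

If the full nanosecond digits matter, as in the desired output of the question, the isoformat() variant is probably the one to pick.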
    

    If you only want to display them in this format (for example in a Jupyter notebook) without changing the stored values, you can use the Styler:

    df.style.format('{0:%Y-%m-%dT%H:%M:%S.%f}', subset=['DateTime'])
    
    # OR
    df.style.format(lambda d: d.isoformat(), subset=['DateTime'])
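
Putting the pieces together, the whole workflow from the question could be sketched as follows. This assumes the two sample rows from the question are inlined through io.StringIO instead of being read from myfile.csv, and it overwrites the DateTime column with strings, which may or may not be what you want.

```python
import io
import pandas as pd

# The two sample rows from the question, inlined so the sketch is self-contained.
csv_text = """DateTime,column1,column2,column3,column4,column5
43621.6210327662,1.234,2.665,0.888,3.891,3.762
43621.6210445023,2.232,1.233,0.888,3.789,3.524
"""

df = pd.read_csv(io.StringIO(csv_text))

# Same conversion as in the question: fractional days counted from 1900-01-01.
timestamps = pd.to_datetime('1900-01-01') + pd.to_timedelta(df['DateTime'], unit='D')

# Replace the column with ISO 8601 strings ("T" between date and time).
df['DateTime'] = timestamps.apply(lambda d: d.isoformat())

print(df['DateTime'].tolist())
```

After this step the column holds plain strings, so datetime operations such as .dt no longer work on it. If you still need those, keep the timestamps Series (or a second column) alongside the formatted one.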