pythonpandas

Flatten Data By ID


I have a data table that looks like this (note this is condensed, there are some ID with 2+ TIMEs):

ID        TIME
1         10:12
1         4:43
2         9:12
3         1:43
3         16:47
4         2:55

I would like to flatten it so it looks like this:

ID  Time_1    Time_2
1     10:12    4:43
2     4:43
3     1:43     16:47
4     2:55

I'm reviewing 'flattening' options but have not yet found the correct answer.

https://pandas.pydata.org/pandas-docs/version/0.14.1/generated/pandas.Index.flatten.html


Solution

  • To format to exactly the required layout, you can use .pivot() as follows:

    (df.assign(serial='Time_'+df.groupby('ID').cumcount().add(1).astype(str))
       .pivot(index='ID', columns='serial', values='TIME')
       .rename_axis(columns=None)
       .reset_index()
    )
    

    Result:

       ID Time_1 Time_2
    0   1  10:12   4:43
    1   2   9:12    NaN
    2   3   1:43  16:47
    3   4   2:55    NaN
    

    If you want the missing values to show as blank, you can use:

    (df.assign(serial='Time_'+df.groupby('ID').cumcount().add(1).astype(str))
       .pivot(index='ID', columns='serial', values='TIME').fillna('')
       .rename_axis(columns=None)
       .reset_index()
    )
    

    Result:

       ID Time_1 Time_2
    0   1  10:12   4:43
    1   2   9:12       
    2   3   1:43  16:47
    3   4   2:55