I have dataframes (stored in Excel files), each holding the data for a single participant, which look like this:
df1 = pd.DataFrame([['15:05', '15:06', '15:07', '15:08'], [7.333879016553067, 8.066897471204006, 7.070168678977272, 6.501888904228463], [64.16712081101915, 65.08486717007806, 67.22483766233766, 64.40328265521458],
[114.21879259980525, 116.49792952572476, 113.26931818181818, 108.35424424108551]]).T
df1.columns = ['Start', 'CO', 'Dia', 'Sys']
| | Start | CO | Dia | Sys |
|---|---|---|---|---|
| 0 | 15:05 | 7.33388 | 64.1671 | 114.219 |
| 1 | 15:06 | 8.0669 | 65.0849 | 116.498 |
| 2 | 15:07 | 7.07017 | 67.2248 | 113.269 |
| 3 | 15:08 | 6.50189 | 64.4033 | 108.354 |
and I need to unstack it into one row so that I can then read all the different participants into a single dataframe. I have tried using the answer to this question, and the answer to this question, to get something like this (a multi-indexed dataframe):
| | Time 1 | | | Time 2 | | |
|---|---|---|---|---|---|---|
| | CO | Dia | Sys | CO | Dia | Sys |
| 0 | 7.33388 | 64.1671 | 114.219 | 8.0669 | 65.0849 | 116.498 |
But what I'm ending up with is
| | ('15:05', 'CO') | ('15:05', 'Dia') | ('15:05', 'Sys') | ('15:06', 'CO') | ('15:06', 'Dia') | ('15:06', 'Sys') |
|---|---|---|---|---|---|---|
| 0 | 7.33388 | 64.1671 | 114.219 | nan | nan | nan |
| 1 | nan | nan | nan | 8.0669 | 65.0849 | 116.498 |
So as you can see, each minute is still a new row but now they are arranged in an even less useful way.
Can anyone offer advice?
Assuming that each row is Time 0, Time 1, etc., we can use the index as the top level of the MultiIndex:
# convert index to string and add "Time "
df1.index = "Time " + df1.index.astype(str)
Then group by the index, take the max (or any other aggregate that keeps the original values, since each group holds a single row) of every column except "Start" (the 0th column), stack, convert back to a frame, and transpose:
out = df1.groupby(df1.index)[df1.columns[1:]].max().stack().to_frame().T
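For reference, the steps above can be sketched end to end, together with the follow-on step of combining several participants into one dataframe. This is a minimal sketch, not the only way to do it: the helper name `to_single_row`, the participant labels `P1`/`P2`, and the reuse of `df1` as a stand-in second participant are all assumptions made for illustration. It also skips the `groupby`, on the assumption that the row labels are already unique, so stacking alone is enough.

```python
import pandas as pd

# Sample data taken from the question (values as displayed in the table).
df1 = pd.DataFrame({
    "Start": ["15:05", "15:06", "15:07", "15:08"],
    "CO": [7.33388, 8.0669, 7.07017, 6.50189],
    "Dia": [64.1671, 65.0849, 67.2248, 64.4033],
    "Sys": [114.219, 116.498, 113.269, 108.354],
})

def to_single_row(df, drop="Start"):
    """Hypothetical helper: flatten one participant's frame into a single row."""
    values = df.drop(columns=drop).reset_index(drop=True)
    # Label each row "Time 0", "Time 1", ... so it becomes the top column level.
    values.index = "Time " + values.index.astype(str)
    # stack() yields a Series indexed by (Time n, measure);
    # to_frame().T turns that Series into one row with MultiIndex columns.
    return values.stack().to_frame().T

# Placeholder participants: P2 is just a copy of df1, not real data.
participants = {"P1": df1, "P2": df1.copy()}

rows = []
for name, df in participants.items():
    row = to_single_row(df)
    row.index = [name]  # one row per participant, labelled by name
    rows.append(row)

combined = pd.concat(rows)
print(combined)
```

Each participant ends up as one row, and the columns form a two-level MultiIndex of time slot and measure, so `combined.loc["P1", ("Time 1", "CO")]` picks out a single value.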