I'm working with a pandas MultiIndex that consists of three levels:
[Verbundzuordnung, ProjektIndex, Datum].
I would like to resample the dataframe hourly on Datum, but doing so drops the rightmost column, TagDesAbdichtens.
I would like to keep that column, as it is static.
Verbundzuordnung  ProjektIndex  Datum                      TagDesAbdichtens
1                 81679         2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
...               ...           ...                        ...
2                 94574         2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
285192 rows × 1 columns
There are additional columns that I left out here to keep the example readable.
I am currently using the following to resample the dataframe:
all_merged = all_merged.groupby([
    pd.Grouper(level='Verbundzuordnung'),
    pd.Grouper(level='ProjektIndex'),
    pd.Grouper(level='Datum', freq='H'),
])
all_merged.mean() gives me the desired output, except that TagDesAbdichtens is missing.
This value is unique and static for each combination of Verbundzuordnung and ProjektIndex, and I would like to have it back in the resampled version.
Is there a way to do it with native pandas functions?
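Instead of mean(), you can call agg() with a dictionary that maps each column to its own aggregation:
agg({'TagDesAbdichtens': 'first', 'another_col': 'mean', 'another_col2': 'mean', ... })
That is, you can specify a different aggregate function for each column. Since TagDesAbdichtens is constant within each Verbundzuordnung and ProjektIndex, 'first' simply returns that constant value for every hourly bin.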
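Here is a minimal, self-contained sketch of the per-column agg() approach. The sample data and the `temperature` column are made up for illustration (they stand in for the columns left out of the question), and the hourly alias is written as 'h', which is the spelling recent pandas versions prefer over 'H'.

```python
import pandas as pd

# Hypothetical sample data: two projects with readings every 30 minutes.
dates = pd.date_range("2021-11-10", periods=4, freq="30min", tz="UTC")
frames = []
for verbund, projekt, tag in [(1, 81679, "2021-12-08"), (2, 94574, "2022-01-31")]:
    frames.append(pd.DataFrame({
        "Verbundzuordnung": verbund,
        "ProjektIndex": projekt,
        "Datum": dates,
        "TagDesAbdichtens": pd.Timestamp(tag),   # static per project
        "temperature": [1.0, 3.0, 5.0, 7.0],     # made-up numeric column
    }))
df = pd.concat(frames).set_index(["Verbundzuordnung", "ProjektIndex", "Datum"])

# Same grouping as in the question: keep the first two levels, bin Datum hourly.
grouped = df.groupby([
    pd.Grouper(level="Verbundzuordnung"),
    pd.Grouper(level="ProjektIndex"),
    pd.Grouper(level="Datum", freq="h"),
])

# 'first' keeps the static date, 'mean' averages the numeric readings.
result = grouped.agg({"TagDesAbdichtens": "first", "temperature": "mean"})
print(result)
```

Columns that are not listed in the dictionary are dropped from the result, so every column you want to keep needs an entry.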