python pandas datetime group-by pandas-resample

Pandas resample drops (static) datetime column, how do I keep it?


I'm working with a pandas DataFrame whose MultiIndex has three levels:
[Verbundzuordnung, ProjektIndex, Datum]

I would like to resample the dataframe hourly on Datum. Doing so drops the rightmost column, TagDesAbdichtens, which I would like to keep because it is static.

            
Verbundzuordnung    ProjektIndex    Datum                           TagDesAbdichtens
1                   81679           2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
...     ...     ...     ...
2                   94574           2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31

285192 rows × 1 columns

There are additional columns that I left out here for easier comprehension.

I am currently applying this to resample the dataframe:

all_merged = all_merged.groupby([
    pd.Grouper(level='Verbundzuordnung'),
    pd.Grouper(level='ProjektIndex'),
    pd.Grouper(level='Datum', freq='H'),
])

all_merged.mean() gives me the desired output, but with TagDesAbdichtens missing. This value is unique and static for each combination of Verbundzuordnung and ProjektIndex, and I would like to have it back in the resampled version.

Is there a way to do it with native pandas functions?


Solution

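  • Instead of mean(), call agg() on the grouped object with a dictionary that maps each column to its own aggregation function:

    agg({'TagDesAbdichtens': 'first', 'another_col': 'mean', 'another_col2': 'mean', ... })

    Since TagDesAbdichtens is constant within each group, 'first' returns that value unchanged, while the other columns are still averaged.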
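As a minimal sketch of per-column aggregation, the example below builds a small stand-in frame with the same three index levels as in the question. The Temperatur column and all values are made up for illustration. freq='60min' is used as an equivalent of the hourly frequency in the question, since the 'H' alias is deprecated in newer pandas releases. The dictionary is built from the column list, so the remaining columns do not have to be typed out by hand.

```python
import pandas as pd

# Toy stand-in for the frame in the question. "Temperatur" is a hypothetical
# measurement column; the values are invented for this example.
index = pd.MultiIndex.from_arrays(
    [
        [1, 1, 1, 2, 2],
        [81679, 81679, 81679, 94574, 94574],
        pd.to_datetime(
            [
                "2021-11-10 00:00",
                "2021-11-10 00:30",
                "2021-11-10 01:15",
                "2022-02-28 23:00",
                "2022-02-28 23:45",
            ],
            utc=True,
        ),
    ],
    names=["Verbundzuordnung", "ProjektIndex", "Datum"],
)
all_merged = pd.DataFrame(
    {
        "TagDesAbdichtens": pd.to_datetime(["2021-12-08"] * 3 + ["2022-01-31"] * 2),
        "Temperatur": [10.0, 12.0, 20.0, 5.0, 7.0],
    },
    index=index,
)

# Same grouping as in the question, with an hourly bin on the Datum level.
grouped = all_merged.groupby(
    [
        pd.Grouper(level="Verbundzuordnung"),
        pd.Grouper(level="ProjektIndex"),
        pd.Grouper(level="Datum", freq="60min"),
    ]
)

# Average every column except the static one, which is carried over with 'first'.
static_col = "TagDesAbdichtens"
agg_map = {col: "mean" for col in all_merged.columns if col != static_col}
agg_map[static_col] = "first"

# Aggregate, then restore the original column order.
resampled = grouped.agg(agg_map)[list(all_merged.columns)]
print(resampled)
```

Using 'first' assumes the column really is constant within each group. If that is not guaranteed, an aggregation such as 'nunique' on the same column can be used beforehand to check it.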