[SOLVED] Window of full weeks in pandas

Window of full weeks in pandas

I am looking for a special window function in pandas: sort of a combination of rolling and expanding. For calculating (for instance) the mean and standard deviating, I want to regard all past data, but ignore the first few records to make sure I have a multiple of 7 (days in my case). That's because I know the data has a strong weekly pattern.

Example:

s = pd.Series([1, 3, 4, 5, 4, 3, 1, 2, 4, 5, 4, 5, 4, 2, 1, 3, 4, 5, 4, 3, 1, 3],
              pd.date_range('2020-01-01', '2020-01-22'))
s.rolling(7, 7).mean()   # Use last 7 days.
s.expanding(7).mean()    # Use all past days.
s.mywindowing(7).mean()  # Use last past multiple of 7 days. How?

The effect should be like this:

Of course I can do things manually using for loops and such, but I imagine the existing pandas machinery can be used to do this...?

Solution

Pandas custom window rolling

another usage here

import pandas as pd
import numpy as np
from pandas.api.indexers import BaseIndexer
from typing import Optional, Tuple


class CustomIndexer(BaseIndexer):
    def get_window_bounds(self,
                          num_values: int = 0,
                          min_periods: Optional[int] = None,
                          center: Optional[bool] = None,
                          closed: Optional[str] = None
                          ) -> Tuple[np.ndarray, np.ndarray]:

        end = np.arange(1, num_values+1, dtype=np.int64)
        start = end % 7
        return start, end

indexer = CustomIndexer(num_values=len(s))
s.rolling(indexer).mean().round(2)

Outputs:

2020-01-01     NaN
2020-01-02     NaN
2020-01-03     NaN
2020-01-04     NaN
2020-01-05     NaN
2020-01-06     NaN
2020-01-07    3.00
2020-01-08    3.14
2020-01-09    3.29
2020-01-10    3.43
2020-01-11    3.29
2020-01-12    3.43
2020-01-13    3.57
2020-01-14    3.36
2020-01-15    3.36
2020-01-16    3.36
2020-01-17    3.36
2020-01-18    3.36
2020-01-19    3.36
2020-01-20    3.36
2020-01-21    3.24
2020-01-22    3.33
Freq: D, dtype: float64