python pandas resample

How to exclude non-working days in df.resample?


I have a df like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(100, 2),columns=['A', 'B'])
df['Date'] = pd.date_range("1/1/2000", periods=100)
df.set_index('Date', inplace = True)

I want to resample it by week and take the last value, but this statement returns results based on all seven days of the week:

>>> df.resample('W').last()
                   A         B
Date                          
2000-01-02 -0.233055  0.215712
2000-01-09 -0.031690 -1.194929
2000-01-16 -1.441544 -0.206924
2000-01-23 -0.225403 -0.058323
2000-01-30 -1.564966 -1.409034
2000-02-06  0.800451 -0.730578
2000-02-13 -0.265631 -0.161049
2000-02-20  0.252658 -0.458502
2000-02-27  1.982499  3.208221
2000-03-05 -0.391827  0.927733
2000-03-12 -0.723863 -0.076955
2000-03-19 -1.379905  0.259892
2000-03-26 -0.983180  1.734662
2000-04-02  0.139668 -0.834987
2000-04-09  0.854117 -0.421875

I only want results based on 5 days a week (excluding Saturday and Sunday). That is, each weekly interval should cover 5 days, not 7. Can pandas do this, or do I have to resort to a third-party calendar library?


Solution

  • To select only the weekdays you can use:

    df = df[df.index.weekday.isin(list(range(5)))]
    

    This gives you a DataFrame containing only Monday to Friday. The rest of the code can stay the same.

    Comment: Calling resample('W') will create the missing index entries. I believe you want to do something else.
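A minimal sketch of the filter combined with the resample step, assuming the goal is the last Monday-to-Friday value of each week. The `W-FRI` anchor and the names `weekdays` and `weekly_last` are assumptions for illustration, not part of the original answer. Plain `'W'` is an alias for `'W-SUN'`, so it labels each bin with a Sunday; `'W-FRI'` labels each bin with its Friday instead, which may be closer to what the question asks for.

```python
import numpy as np
import pandas as pd

# Data shaped like the question's example (seeded so runs are repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((100, 2)),
    columns=["A", "B"],
    index=pd.date_range("2000-01-01", periods=100, name="Date"),
)

# Keep Monday (0) through Friday (4); Saturday (5) and Sunday (6) are dropped.
weekdays = df[df.index.weekday < 5]

# Weekly bins that end on Friday, so each label is a business day.
weekly_last = weekdays.resample("W-FRI").last()

print(weekly_last.head())
```

If a week contains no weekday rows at all (for example, after removing holidays), `last()` returns NaN for that bin; such rows can be dropped with `dropna()` if they are not wanted.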