I have a df like this:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 2),columns=['A', 'B'])
df['Date'] = pd.date_range("1/1/2000", periods=100)
df.set_index('Date', inplace = True)
I want to resample it by week and take the last value of each week, but the statement below returns weekly bins that span all seven days of the week:
>>> df.resample('W').last()
                   A         B
Date
2000-01-02 -0.233055  0.215712
2000-01-09 -0.031690 -1.194929
2000-01-16 -1.441544 -0.206924
2000-01-23 -0.225403 -0.058323
2000-01-30 -1.564966 -1.409034
2000-02-06  0.800451 -0.730578
2000-02-13 -0.265631 -0.161049
2000-02-20  0.252658 -0.458502
2000-02-27  1.982499  3.208221
2000-03-05 -0.391827  0.927733
2000-03-12 -0.723863 -0.076955
2000-03-19 -1.379905  0.259892
2000-03-26 -0.983180  1.734662
2000-04-02  0.139668 -0.834987
2000-04-09  0.854117 -0.421875
I only want results for the 5 weekdays (excluding Saturday and Sunday). That is, each returned interval should cover 5 days, not 7. Can pandas do this, or do I have to resort to a third-party calendar library?
To select only the weekdays you can use:
df = df[df.index.weekday < 5]  # weekday: Monday=0 ... Sunday=6
This leaves only the Monday-to-Friday rows in your DataFrame. The rest of your code can stay the same.
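As a rough sketch of how the filter interacts with the resample step (the seeded generator and the names `weekdays`, `sunday_labels`, and `friday_labels` are my own, not from the question): `'W'` is shorthand for `'W-SUN'`, so the weekly labels stay on Sundays even after the weekend rows are gone. If you would rather have each week labelled with its Friday, anchoring the frequency with `'W-FRI'` is one option.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, built with a seeded generator
# (the seed is an assumption, used only to make the sketch repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=["A", "B"])
df.index = pd.date_range("2000-01-01", periods=100, name="Date")

# Keep Monday (0) through Friday (4) only.
weekdays = df[df.index.weekday < 5]

# 'W' means 'W-SUN': each bin is labelled with its closing Sunday,
# even though no Sunday row is left in the data.
sunday_labels = weekdays.resample("W").last()

# 'W-FRI' closes each weekly bin on Friday, so every label is a date
# that is actually present in the filtered data.
friday_labels = weekdays.resample("W-FRI").last()

print(sunday_labels.index.day_name().unique())
print(friday_labels.index.day_name().unique())
```

For this sample the two results hold the same values (the last weekday row of each week); only the date used as the label differs.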
Comment
Calling resample('W') will create the missing index entries again. I believe you want to do something else.