I have a pandas Series, y, whose index is:
Index(['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14', '2020-01-17',
...
'2023-07-25', '2023-07-28', '2023-08-01', '2023-08-04', '2023-08-08'],
dtype='object', name='Date', length=376)
As you can see, pandas cannot infer the frequency of the series: the y.index object shows no freq attribute, only dtype, name, and length.
Question:
How can I force a frequency such as freq='C', weekmask='1001000' onto the series, so that y.index.freq returns the frequency of the series?
What I have tried:
Here is where I am creating the series to begin with:
def ensureSeriesHasAttributes(y: object, series_name: str) -> pd.Series:
    if not isinstance(y, pd.Series):
        y = pd.Series(y, name=series_name)
    else:
        y.name = series_name
    return y
When I change the above function to the following, I get an AssertionError:
def ensureSeriesHasAttributes(y: object, series_name: str) -> pd.Series:
    if not isinstance(y, pd.Series):
        y = pd.Series(y, name=series_name, freq='C', weekmask='1001000')
    else:
        y.name = series_name
        y.index.freq = 'C'
        y.index.weekmask = '1001000'
    return y
How can I force a frequency attribute such as freq='C', weekmask='1001000' to the series?
You can try:
# Custom frequency with a valid weekmask related to your index
cfreq = pd.offsets.CustomBusinessDay(weekmask='0100100')
# Convert Index to DatetimeIndex and apply the custom frequency
y = y.set_axis(pd.to_datetime(y.index)).asfreq(cfreq)
Output:
>>> y
2020-01-03 0
2020-01-07 1
2020-01-10 2
2020-01-14 3
2020-01-17 4
Freq: C, Name: TS, dtype: int64
>>> y.index
DatetimeIndex(['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14',
'2020-01-17'],
dtype='datetime64[ns]', freq='C')
>>> y.index.freq
<CustomBusinessDay>
>>> y.index.freq.weekmask
'0100100'
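The weekmask has to match the weekdays that actually occur in the index. The sample dates fall on Tuesdays and Fridays, so '0100100' fits, while the '1001000' from the question would select Mondays and Thursdays. As a rough sketch (the variable names here are illustrative, not part of the original code), the mask can be derived from the dates themselves:

```python
import pandas as pd

idx = ['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14', '2020-01-17']
dates = pd.to_datetime(idx)

# dayofweek counts Monday=0 ... Sunday=6, the same order as the weekmask string
days_present = set(dates.dayofweek.tolist())
weekmask = ''.join('1' if d in days_present else '0' for d in range(7))

print(sorted(days_present))  # [1, 4] -> Tuesday and Friday
print(weekmask)              # 0100100
```

Keep in mind that asfreq reindexes the series onto the full Tuesday/Friday schedule between the first and last date, so any scheduled date missing from your data will show up as a NaN row.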
Minimal Working Example:
import pandas as pd
idx = ['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14', '2020-01-17']
y = pd.Series(data=range(len(idx)), index=idx, name='TS')
>>> y
2020-01-03 0
2020-01-07 1
2020-01-10 2
2020-01-14 3
2020-01-17 4
Name: TS, dtype: int64
>>> y.index
Index(['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14', '2020-01-17'],
dtype='object')
There are some errors and ambiguities in your code, especially when y is not a Series.
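One possible way to restructure the helper is sketched below. The function name ensure_series_has_freq and the index and weekmask parameters are assumptions made for illustration; they are not in your original code. pd.Series does not accept freq or weekmask keywords, so the sketch builds the series first and applies the frequency to its index afterwards. It also asks for explicit dates when y is not a Series, since a plain list would otherwise get an integer index that cannot be meaningfully converted to dates.

```python
import pandas as pd

def ensure_series_has_freq(y, series_name, index=None, weekmask='0100100'):
    """Hypothetical helper: return a named Series on a CustomBusinessDay index."""
    if not isinstance(y, pd.Series):
        if index is None:
            # Without dates there is nothing to attach a frequency to
            raise ValueError("non-Series input needs an explicit date index")
        y = pd.Series(y, index=index)
    y = y.rename(series_name)

    # Convert the index to a DatetimeIndex and conform it to the custom frequency
    cfreq = pd.offsets.CustomBusinessDay(weekmask=weekmask)
    return y.set_axis(pd.to_datetime(y.index)).asfreq(cfreq)

idx = ['2020-01-03', '2020-01-07', '2020-01-10', '2020-01-14', '2020-01-17']
out = ensure_series_has_freq(list(range(len(idx))), 'TS', index=idx)
print(out.index.freq.weekmask)
```

The same call works when y is already a Series with date strings as its index; in that case the index argument is simply ignored.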