I would like to construct an extension of pandas.DataFrame (let's call it SPDF) that can do things above and beyond what a plain DataFrame can:

    import pandas as pd
    import numpy as np


    def to_spdf(func):
        """Transform generic output of `func` to SPDF.

        Returns
        -------
        wrapper : callable
        """
        def wrapper(*args, **kwargs):
            res = func(*args, **kwargs)
            return SPDF(res)

        return wrapper


    class SPDF:
        """Special-purpose dataframe.

        Parameters
        ----------
        df : pandas.DataFrame
        """

        def __init__(self, df):
            self.df = df

        def __repr__(self):
            return repr(self.df)

        def __getattr__(self, item):
            # delegate to the wrapped DataFrame; wrap callables so their
            # output is converted to SPDF
            res = getattr(self.df, item)

            if callable(res):
                res = to_spdf(res)

            return res


    if __name__ == "__main__":
        # construct a generic SPDF
        df = pd.DataFrame(np.eye(4))
        an_spdf = SPDF(df)

        # call .diff() to obtain another SPDF
        print(an_spdf.diff())

Right now, methods of DataFrame that return another DataFrame, such as .diff() in the MWE above, return another SPDF, which is great. However, I would also like chained methods such as .resample('M').last() or .rolling(2).mean() to produce an SPDF at the very end. I have failed so far because .rolling() and the like are themselves callable, so my wrapper to_spdf tries to construct an SPDF from their output without 'waiting' for .mean() or whatever the last part of the expression is. Any ideas how to tackle this problem?

Thanks.

You should properly subclass DataFrame. For copy-constructor methods to work, the pandas documentation says you must set the _constructor property (along with other information).

You could do something like the following:
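
    import numpy as np
    import pandas as pd

    class SPDF(pd.DataFrame):
        @property
        def _constructor(self):
            # tell pandas which class to build when a method returns a new frame
            return SPDF
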
If you need to preserve custom attributes (not methods; those will be there anyway) across copy-constructor methods (like diff), then you can do something like the following:

    class SPDF(pd.DataFrame):
        # attribute names listed here are carried over to derived frames
        _metadata = ['prop']
        prop = 1

        @property
        def _constructor(self):
            return SPDF

Notice the output is as desired:

    df = SPDF(np.eye(4))
    print(type(df))   # <class '__main__.SPDF'>
    new = df.diff()
    print(type(new))  # <class '__main__.SPDF'>
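
To tie this back to the chained calls in the question, here is a minimal sketch that runs .rolling(2).mean() and a resample on the subclass and prints the resulting types, so you can check what your pandas version returns rather than take it on faith. The daily index, the "2D" resampling rule and the value 42 are illustrative assumptions, not part of the original code.

```python
import numpy as np
import pandas as pd


class SPDF(pd.DataFrame):
    """Special-purpose dataframe, subclassed as suggested above."""

    # Names listed here are copied onto derived frames by pandas.
    _metadata = ["prop"]
    prop = 1

    @property
    def _constructor(self):
        return SPDF


# Hypothetical sample data: a 4x4 identity matrix on a daily index.
idx = pd.date_range("2020-01-01", periods=4, freq="D")
df = SPDF(np.eye(4), index=idx)
df.prop = 42  # per-instance value of the custom attribute

# Chained calls like the ones in the question.
rolled = df.rolling(2).mean()
resampled = df.resample("2D").last()

# A copy should carry the custom attribute along.
copied = df.copy()

print(type(rolled).__name__)
print(type(resampled).__name__)
print(copied.prop)
```

If both printed type names are SPDF, the chained methods need no wrapper at all; if one of them prints DataFrame on your version, wrapping that result with SPDF(...) is a simple fallback.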