I have written a small Python package which extends a pandas DataFrame with a few additional methods.
At the moment, I have this code in my package:
    import pandas as pd

    def init():
        # Registering inside a function defers the side effect until init() is called
        @pd.api.extensions.register_dataframe_accessor("test")
        class _:
            def __init__(self, pandas_obj):
                self._obj = pandas_obj

            def myMethod(self):
                pass
I then do the following in python:
    import pandas as pd
    import mypackage as mp
    mp.init()
    test = pd.DataFrame(<define data frame>)
    # The accessor is reached through the name it was registered under ("test")
    test.test.myMethod()
My question is: is it possible to do the pandas import and register the accessor from within the __init__.py in mypackage, so that once mypackage is imported, I automatically have access to myMethod without the init() step? My current approach feels a bit clunky...
I might be missing something in your question, but I think you might be barking up the wrong tree. There's nothing special about __init__.py in this regard: anything you write in __init__.py is executed when you import the package, so I don't think you need that init() function at all. Suppose your package contains this file:
    # mypackage/__init__.py
    import pandas as pd

    @pd.api.extensions.register_dataframe_accessor("test")
    class _:
        def __init__(self, pandas_obj):
            self._obj = pandas_obj

        def myMethod(self):
            print(self._obj)
Now you can just use it by importing mypackage, like:
    >>> import pandas as pd
    >>> import mypackage
    >>> df = pd.DataFrame({'a': [1, 2, 3]})
    >>> df.test.myMethod()
       a
    0  1
    1  2
    2  3
As an aside, one reason you might explicitly want something like your init() function is the principle of least surprise: since register_dataframe_accessor adds an attribute to DataFrame itself, it affects every DataFrame in the process (including those used by other libraries). There is therefore a small possibility that registering your accessor, just by importing your package, might override some other package's DataFrame accessor if they happen to share the same name.

If the name is reasonably unique this may not be a problem though. It also may simply not be a problem for your package, depending on how it's intended to be used.
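If you are worried about the name clash described above, one option is to check for an existing attribute before registering. The following is only a rough sketch under my own assumptions: the accessor name mypackage_demo, the DemoAccessor class, its n_rows method, and the register helper are all hypothetical names made up for illustration, not part of pandas or of your package.

```python
import pandas as pd

# Hypothetical accessor name, chosen to be unlikely to clash with other packages
ACCESSOR_NAME = "mypackage_demo"


class DemoAccessor:
    """Hypothetical accessor used only for illustration."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def n_rows(self):
        # Number of rows in the wrapped DataFrame
        return len(self._obj)


def register(name=ACCESSOR_NAME):
    """Register DemoAccessor under `name`, refusing to shadow an existing attribute."""
    if hasattr(pd.DataFrame, name):
        raise AttributeError(f"DataFrame already has an attribute named {name!r}")
    # register_dataframe_accessor(name) returns a decorator; apply it to the class directly
    pd.api.extensions.register_dataframe_accessor(name)(DemoAccessor)


# In a real package this call could live at the bottom of __init__.py
register()

df = pd.DataFrame({"a": [1, 2, 3]})
n = df.mypackage_demo.n_rows()
```

Calling register() a second time raises AttributeError, because the name is now taken. As far as I know, pandas itself also emits a UserWarning when an accessor is registered over a preexisting attribute, so the explicit check mainly turns that warning into a hard failure.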