I am trying to use the Modin package to import a sparse matrix created with SciPy (specifically, a scipy.sparse.csr_matrix).
Invoking the method:
from modin import pandas as pd
pd.DataFrame.sparse.from_spmatrix(mat)
I am getting the following AttributeError:
AttributeError                            Traceback (most recent call last)
C:\Users\BERGAM~1\AppData\Local\Temp/ipykernel_37436/3032405809.py in <module>
----> 1 pd.DataFrame.sparse.from_spmatrix(mat)

C:\Miniconda3\envs\persolite_v0\lib\site-packages\modin\pandas\accessor.py in from_spmatrix(cls, data, index, columns)
    109     @classmethod
    110     def from_spmatrix(cls, data, index=None, columns=None):
--> 111         return cls._default_to_pandas(
    112             pandas.DataFrame.sparse.from_spmatrix, data, index=index, columns=columns
    113         )

C:\Miniconda3\envs\persolite_v0\lib\site-packages\modin\pandas\accessor.py in _default_to_pandas(self, op, *args, **kwargs)
     78             Result of operation.
     79         """
---> 80         return self._parent._default_to_pandas(
     81             lambda parent: op(parent.sparse, *args, **kwargs)
     82         )

AttributeError: 'function' object has no attribute '_parent'
With the original pandas API, the same call works.
Has anyone run into a similar problem? Thanks for the support.
This is a bug in Modin. A classmethod calls an instance method through the class, so the self
reference is not bound to an instance; it is instead bound to the first positional argument (which here is a function).
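To make the binding rule concrete, here is a minimal sketch that is independent of Modin. The class `Accessor` and its method `helper` are hypothetical names invented for this illustration; they only report what each parameter received.

```python
class Accessor:  # hypothetical class, not part of Modin
    def helper(self, op):
        # Report what each parameter actually received.
        return {"self": self, "op": op}


instance = Accessor()

# Looked up on an instance: a bound method, so `self` is supplied automatically.
bound_call = instance.helper(len)

# Looked up on the class: a plain function, so the first positional
# argument is consumed as `self` and the second one becomes `op`.
class_call = Accessor.helper(len, "data")

print(type(instance.helper).__name__)  # method
print(type(Accessor.helper).__name__)  # function
print(class_call["self"] is len)       # True
```

Inside a classmethod, `cls.helper(...)` is the second kind of lookup, so nothing is bound for you.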
This is the code that fails:
class BaseSparseAccessor:
    def _default_to_pandas(self, op, *args, **kwargs):
        return self._parent._default_to_pandas(
            lambda parent: op(parent.sparse, *args, **kwargs)
        )

class SparseFrameAccessor(BaseSparseAccessor):
    @classmethod
    def from_spmatrix(cls, data, index=None, columns=None):
        return cls._default_to_pandas(
            pandas.DataFrame.sparse.from_spmatrix, data, index=index, columns=columns
        )
A quick example of why this fails follows:
class A:
    _parent = 0

    def a_method(self, op, **args):
        self._parent = op(self._parent, **args)

class B(A):
    @classmethod
    def b_method(cls, data, **args):
        return cls.a_method(sum, data, **args)
When you call b_method (it doesn't matter whether you call it on B itself or on an instance of B), it fails, because self in a_method is the function sum rather than a class or instance reference.
>>> B.b_method(20)
AttributeError                            Traceback (most recent call last)
<ipython-input-17-3914ce57d001> in <module>
----> 1 B.b_method(20)

<ipython-input-11-a25ce2c0614c> in b_method(cls, data, **args)
     12     @classmethod
     13     def b_method(cls, data, **args):
---> 14         return cls.a_method(sum, data, **args)

<ipython-input-11-a25ce2c0614c> in a_method(self, op, **args)
      6 
      7     def a_method(self, op, **args):
----> 8         self._parent = op(self._parent, **args)
      9 
     10 class B(A):

AttributeError: 'builtin_function_or_method' object has no attribute '_parent'
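For the toy classes, one way to avoid the error is to make the helper a classmethod as well, so that both methods agree on what the first parameter is. The sketch below is only an illustration of that idea and not how Modin itself addresses the bug. It assumes the helper needs class-level state only, and it swaps in operator.add for sum and returns the result instead of storing it, so that the call produces a meaningful value.

```python
import operator


class A:
    _parent = 0

    @classmethod
    def a_method(cls, op, *args):
        # cls is bound automatically, so op really is the first explicit argument.
        return op(cls._parent, *args)


class B(A):
    @classmethod
    def b_method(cls, data):
        # operator.add replaces sum here (an assumption made for this sketch).
        return cls.a_method(operator.add, data)


result = B.b_method(20)
print(result)  # 20
```

If the helper really needs per-instance state, the classmethod would instead have to create or receive an instance and call the helper on that.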