I have a python snippet which works just fine to run GLMNET on np.array
X and y. However, when X is a column sparse matrix from scipy, the code fails as rpy2 is not able to convert X. Am I making an obvious mistake?
A MCVE is:
import numpy as np
from scipy import sparse
from rpy2 import robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects import numpy2ri
from rpy2.robjects import pandas2ri
if __name__ == "__main__":
X = sparse.rand(5, 20, density=0.1)
y = np.random.randn(5)
numpy2ri.activate()
pandas2ri.activate()
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
if not rpackages.isinstalled('glmnet'):
utils.install_packages("glmnet")
glmnet = rpackages.importr('glmnet')
glmnet = robjects.r['glmnet']
glmnet_fit = glmnet(X, y, intercept=False, standardize=False)
And when I run it I get a NotImplementedError
:
Conversion 'py2ri' not defined for objects of type '<class 'scipy.sparse.csc.csc_matrix'>'
Could I provide X in a different way? I'd be surprised if rpy2 could not handle sparse matrices.
You can create a sparse matrix with rpy2 as follows:
import numpy as np
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
from scipy import sparse
X = sparse.rand(5, 20, density=0.1).tocoo()
r_Matrix = importr("Matrix")
r_Matrix.sparseMatrix(
i=ro.IntVector(X.row + 1),
j=ro.IntVector(X.col + 1),
x=ro.FloatVector(X.data),
dims=ro.IntVector(X.shape))