python r analytics rpy2

How to import a function from an R package as if it were a native Python function and use all its outputs?


There is a function called dea(x, y, *args) in library(Benchmarking) which returns useful objects. I've described 3 key ones below:

crs = dea(my_data_matrix_x, my_data_matrix_y, RTS="crs", ORIENTATION="in") # both matrices have N rows

efficiency(crs) # a 'numeric' type object which looks like a 1xN vector
peers(crs) # A matrix: Nx2 (looks to me like a pandas dataframe when run in .ipynb file with R kernel)
lambda(crs) # A matrix: Nx2 of type dbl (also looks like a dataframe)

Now I would like to programmatically vary my_data_matrix_x, which represents my inputs. At first it will be an Nx10 matrix. I then intend to drop each column in turn, run dea() on the resulting Nx9 matrix, and graph the efficiency(crs) scores that come out. The issue is that I have no idea how to achieve this in R (amongst other things), so I would rather circumvent the problem by writing all my code in Python and somehow importing the dea() function from an R script.
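The drop-one-column-at-a-time loop described above does not depend on how dea() ends up being called, so it can be sketched in plain Python. This is only a minimal sketch: `leave_one_out`, `scores_per_dropped_input` and the `run_dea` callable are hypothetical names introduced for illustration, and `data` is assumed to be something indexable by a list of column names, such as a pandas DataFrame.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_out(columns: Sequence[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (dropped_column, remaining_columns) for each column in turn."""
    for dropped in columns:
        yield dropped, [c for c in columns if c != dropped]


def scores_per_dropped_input(data, inputs, outputs, run_dea):
    """Call run_dea once per dropped input column and collect the results.

    data    -- indexable by a list of column names (e.g. a pandas DataFrame)
    run_dea -- hypothetical callable run_dea(x, y) returning efficiency scores
    """
    results = {}
    for dropped, kept in leave_one_out(inputs):
        # x holds the remaining N x (k-1) inputs, y the unchanged outputs
        results[dropped] = run_dea(data[kept], data[outputs])
    return results
```

Whatever ends up wrapping the R function (rpy2, a file round-trip, or something else) can then be passed in as `run_dea`, and the returned dictionary maps each dropped input to the scores obtained without it.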

I believe the best solution available to me will be to read and write from files:

from Benchmarking_script.r import dea

def test_inputs(data, input):
    INPUTS = ['input1', 'input2', 'input3', 'input4', 'input5']
    OUTPUTS = ['output1', 'output2']
    # keep every input column except the one being tested
    data_inputs = data[INPUTS].drop(columns=[input])
    data_outputs = data[OUTPUTS]

    data_inputs.to_csv("my_inputs.csv")
    data_outputs.to_csv("my_outputs.csv")

    run Benchmarking.dea(data_inputs, data_outputs, RTS="crs", ORIENTATION="in")

Clearly this last line won't work. I am interested to hear flexible (and simple!) ways to run this dea() function idiomatically, as if it were a native Python function.
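For reference, the file-based route could be driven from Python by launching Rscript through the standard-library subprocess module. The sketch below is an assumption-laden illustration, not a tested solution: the R script itself is hypothetical and is assumed to read its three arguments with commandArgs(trailingOnly=TRUE), load the two CSV files, call dea(), and write the scores to the result file.

```python
import subprocess
from pathlib import Path
from typing import List


def build_rscript_command(script: str, inputs_csv: str, outputs_csv: str,
                          result_csv: str) -> List[str]:
    """Build the Rscript invocation; the CSV paths are passed as arguments."""
    return ["Rscript", script, inputs_csv, outputs_csv, result_csv]


def run_dea_via_files(script: str, inputs_csv: str, outputs_csv: str,
                      result_csv: str) -> Path:
    """Run the (hypothetical) R script and return the path of the CSV it wrote."""
    cmd = build_rscript_command(script, inputs_csv, outputs_csv, result_csv)
    # check=True raises CalledProcessError if R exits with a non-zero status
    subprocess.run(cmd, check=True)
    return Path(result_csv)
```

The result file can then be read back with pd.read_csv. This keeps Python and R fully decoupled, at the cost of a disk round-trip and a new R process on every call.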

Related SO questions

The closest answer I've found on SO is "Importing any function from an R package into python".

Adapting the code from that answer, I've written:

import pandas as pd
data = pd.read_csv("path/to_data.csv")


import rpy2
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)

packnames = ('Benchmarking',)  # trailing comma makes this a tuple rather than a plain string
utils.install_packages(StrVector(packnames))

Benchmarking = importr('Benchmarking')

crs = Benchmarking.dea(data['Age'], data['CO2'], RTS='crs', ORIENTATION='in')

--------------------------------------------------------------
NotImplementedError: Conversion 'py2rpy' not defined for objects of type '<class 'pandas.core.series.Series'>'

So importing the function through rpy2 and calling it on pandas objects hasn't worked either.


Solution

  • The second approach (rpy2) is the way to go. You need a conversion context so that Python and R objects are converted automatically. Specifically, try the pandas2ri submodule shipped with rpy2. Something like this:

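    import rpy2.robjects as robjects
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # combine the default converter with the pandas one for the duration of the block
    with localconverter(robjects.default_converter + pandas2ri.converter):
        crs = Benchmarking.dea(data['Age'], data['CO2'], RTS='crs', ORIENTATION='in')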
    

    If this doesn't work, update your post with the error.