r revolution-r

Referencing variable value as a column for the rowSelection argument in the rxDataStepXdf function


I have assigned a variable to take the column name of the dataset, say:

column_name <- "run_type"

Using the rxDataStepXdf function, I would like to filter my dataset to keep only the rows where run_type is "Prime":

rxDataStepXdf(inFile=datasetXDFPath, outFile=outputXDFPath,rowSelection=(run_type=="Prime"))

However, rather than naming the column explicitly, I need to pass the variable column_name:

rxDataStepXdf(inFile=datasetXDFPath,outFile=outputXDFPath,rowSelection=(column_name=="Prime"))

This does not work; I'm guessing the function looks for a column literally named "column_name". I've tried the following variations, and none of them worked:

rxDataStepXdf(inFile=datasetXDFPath,outFile=outputXDFPath,rowSelection=(quote(column_name)=="Prime"))

rxDataStepXdf(inFile=datasetXDFPath,outFile=outputXDFPath,rowSelection=(get("column_name")=="Prime"))

rxDataStepXdf(inFile=datasetXDFPath,outFile=outputXDFPath,rowSelection=(eval(column_name)=="Prime"))

rxDataStepXdf(inFile=datasetXDFPath,outFile=outputXDFPath,rowSelection=(eval(parse(text="column_name"))=="Prime"))

How do I pass the value of column_name into the rowSelection argument?


Solution

  • You can build the expression outside the call to rxDataStepXdf, which makes it a bit easier to read. One option is to use parse, as you tried, but with the value of column_name pasted into the text so that the parsed expression refers to the column itself rather than to the variable:

    rowExpr <- parse(text=paste(column_name,"=='Prime'"))
    rxDataStepXdf(inFile=datasetXDFPath, outFile=outputXDFPath, rowSelection= rowExpr)
    

    Another option is to use transformFunc and pass column_name in through transformObjects. Inside the function, the special .rxRowSelection variable marks which rows to keep.

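    # Transform function: flag the rows to keep via .rxRowSelection.
    rowXform <- function(dataList) {
      # selCol is made available through transformObjects below
      dataList$.rxRowSelection <- dataList[[selCol]] == 'Prime'
      return(dataList)
    }
    rxDataStepXdf(inFile=datasetXDFPath, outFile=outputXDFPath,
                  transformFunc = rowXform,
                  transformObjects = list(selCol = column_name))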
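To see why the attempts in the question fail while the pasted expression works, here is a minimal base-R sketch that needs no XDF file. The data frame `df` and its `value` column are made up for illustration, and `eval` against a data frame only stands in for how a row-selection expression is resolved against column names; it is not RevoScaleR's actual mechanism.

```r
# Toy stand-in for the XDF data (hypothetical values, not from the question).
df <- data.frame(run_type = c("Prime", "Other", "Prime"),
                 value    = c(1, 2, 3),
                 stringsAsFactors = FALSE)

column_name <- "run_type"

# get("column_name"), eval(column_name) and eval(parse(text = "column_name"))
# all return the string "run_type", so the comparison never touches a column.
naive <- column_name == "Prime"

# Pasting the *value* of column_name into the text gives run_type == 'Prime'.
rowExpr <- parse(text = paste(column_name, "== 'Prime'"))

# Evaluated with the data as the environment, run_type resolves to the column.
keep <- eval(rowExpr[[1]], envir = df)
filtered <- df[keep, ]

# The same call can be built without string pasting, using a symbol.
rowCall <- call("==", as.name(column_name), "Prime")
keep2 <- eval(rowCall, envir = df)
```

Here `naive` is a single `FALSE`, whereas `keep` and `keep2` are logical vectors with one entry per row. Whether a call built with `as.name` can be handed to rowSelection in the same way as the parsed expression is something I have not verified, so treat that part as a base-R illustration only.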