ruby-on-rails ruby r fastercsv

How to pass an unknown number of arguments to a function in R


I am parsing a CSV file with multiple columns. The number of columns is not fixed; it varies from 5 to 10. I need to recreate a data.frame from these columns inside a function. Does R have anything like Ruby's variable-length argument list (*args)? If not, how can I achieve this? I searched a bit and found that if I have variables named

col1
col2

I can use:

list <- ls(pat="^col\\d$")

and pass this list as an argument to a function, but that passes only the column names, as character strings, not the values those names refer to.

Any suggestions?

Edit: I am parsing the file in a Rails app and using the RinRuby gem to call R functions. The CSV is parsed in Ruby, and the contents of each column are passed to R as a separate variable, so the data is not a data.frame to begin with; I need to create one in R. In the method cal_norm below, I assign the variables in R in a loop, with names col1, col2, col3, and so on.

Here is the Rails code:

 class UploadsController < ApplicationController

  attr_accessor :calib_data, :calib_data_transpose, :inten_data, :pr_list

  def index
    @uploads = Upload.all

    @upload = Upload.new

  respond_to do |format|
  format.html 
  format.json { render json: @uploads }   
  end
 end

 def create
  @upload = Upload.new(params[:upload]) 

 directory = "public/"
 io_calib = params[:upload][:calib]
 io_inten = params[:upload][:inten]   

 name_calib = io_calib.original_filename
 name_inten = io_inten.original_filename
 calib_path = File.join(directory, "calibs", name_calib)
 inten_path = File.join(directory, "intens", name_inten)

respond_to do |format|
  if @upload.save
    @calib_data, @calib_data_transpose = import(calib_path)
    @inten_data = import_ori(inten_path)
    #probe list of the uploaded file
    @probe_list = calib_data_transpose[0]
    logger.debug @probe_list.to_s
    flash[:notice] = "Files were successfully uploaded!!"
    format.html
    #format.js #{ render json: @upload, status: :created, location: @upload }
  else
    flash[:notice] = "Error in uploading!!"
    format.html { render action: "index" }
    format.json { render json: @upload.errors, status: :unprocessable_entity }
    end
  end
 end

def cal_norm
   #ajax request
   data = params['data'].split(',') 

  for i in 0..@calib_data_transpose.length - 1
  R.assign "col#{i}", @calib_data_transpose[i] 
  end

  R.assign "cells", @inten_data
  R.assign "pr", data
  R.eval <<-EOF

# make sure to convert them in character and numeric vectors

#match the selected pr in the table

#convert the found row of values from data.frame to numeric

#divide each column of the table by the respective pr values and create a new table; repeat it with different pr.

#make a new table with the ce count and different probe normalization and calculate  for individual pr

#finally return a data.frame with pr names and cell counts

#return individual columns as an array not in the form of matrix/data.frame

EOF

end

def import(file_path)
  array = import_ori(file_path)
  array_splitted = array.map {|a| a.split(",")} 
  array_transpose = array_splitted.transpose
  return array_splitted, array_transpose
end

 def import_ori(file_path)
  string = IO.read(file_path)
  array = string.split("\n")
  array.shift
  return array
 end

end

Solution

  • Answer to the updated question:

    I am an utter newbie at Ruby, but I found this example here: col wise data

    Here the data for one column is read into col_data; the 0 is the column index (I have no Ruby available for testing):

    require 'csv'
    col_data = []
    CSV.foreach(filename) {|row| col_data << row[0]}
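
To collect every column rather than a single index, one option is to parse all rows and transpose them. The following is only a minimal sketch under stated assumptions: the inline `csv_text` sample and the `columns_from_csv` helper are hypothetical, the first row is treated as a header, and every row is assumed to have the same number of fields (Array#transpose raises IndexError otherwise).

```ruby
require 'csv'

# Hypothetical helper: parse CSV text and return its data column by column.
# Assumes the first row is a header and every row has the same field count.
def columns_from_csv(text)
  rows = CSV.parse(text)
  rows.shift       # drop the header row, as import_ori does in the question
  rows.transpose   # turn the list of rows into a list of columns
end

# Made-up sample data, for illustration only.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

columns = columns_from_csv(csv_text)
p columns  # prints [["1", "4"], ["2", "5"], ["3", "6"]]
```

Unlike splitting each line on ",", the CSV parser also copes with quoted fields that contain commas.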
    

    Assign each column's data to the R variables col1...coln, and store the number of columns in a counter (the syntax might not be 100% correct):

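    # Assign each column to an R variable named col1..coln.
    # The names are 1-based so that they match the R loop below.
    @calib_data_transpose.each_with_index do |column, i|
      R.assign "col#{i + 1}", column
    end

    # Tell R how many columns were assigned.
    R.assign "col_count", @calib_data_transpose.length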
    

    Once col1..coln exist in R, combine the columns one index at a time, starting at i = 1. The result will be a data.frame whose columns are in the order col1...coln.

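    R.eval <<-EOF

    # Build the data.frame one column at a time: col1 creates it,
    # and each later column is appended with cbind.
    for (i in seq_len(col_count)) {
      if (i == 1) {
        df <- data.frame(get(paste0("col", i)))
      } else {
        df <- cbind(df, get(paste0("col", i)))
      }
      # Name the column just added after its source variable.
      names(df)[i] <- paste0("col", i)
    }

    EOF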
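
Since Ruby already knows how many columns it assigned, an alternative to looping inside R is to build the data.frame call as a string on the Ruby side. R's data.frame accepts any number of vectors as arguments, so the variable-length part reduces to joining the names. This is only a sketch: `data_frame_expr` is a hypothetical helper, and the R.eval call is left as a comment because it needs RinRuby and a running R.

```ruby
# Hypothetical helper: build an R expression that combines col1..coln
# into a data.frame named df. n is the number of columns assigned to R.
def data_frame_expr(n)
  raise ArgumentError, "need at least one column" if n < 1
  names = (1..n).map { |i| "col#{i}" }
  "df <- data.frame(#{names.join(', ')}, stringsAsFactors = FALSE)"
end

expr = data_frame_expr(3)
puts expr  # prints df <- data.frame(col1, col2, col3, stringsAsFactors = FALSE)

# With RinRuby loaded and col1..col3 already assigned, the string
# could then be handed to R (assumption, not run here):
#   R.eval expr
```

When called this way, data.frame should name each column after the variable it came from, so no separate naming step is needed.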
    

    Let us know if this helps...


    No longer very relevant to the updated question, but retained for posterity.

    Subset data.frame for a given pattern

    As Roland stated above, read.csv will read the entire file. Since you wish to control which columns are retained in the data.frame, you could do the following:

    Using data(mtcars) as a sample data.frame

    Code:

    Read in the data:

    > data(mtcars)
    > head(mtcars)
                       mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
    

    Subset the data on some condition, say the columns whose names begin with the letter 'c':

    > head(mtcars[,grep("^c",colnames(mtcars))])
                       cyl carb
    Mazda RX4           6    4
    Mazda RX4 Wag       6    4
    Datsun 710          4    1
    Hornet 4 Drive      6    1
    Hornet Sportabout   8    2
    Valiant             6    1
    

    Here '^c' plays the same role as the pattern pat="^col\\d$" from your question. You could replace '^c' with any regular expression of your choice, e.g. '^col'. The '^' anchors the match at the beginning of the string, so '^c' matches any name beginning with the letter 'c'; to match at the end of the string, put '$' after the pattern instead, e.g. 'c$'.
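
The same anchoring can be tried on the Ruby side, for example to pick column names before handing anything to R. A small sketch, using the mtcars column names from the output above as sample input; note that in Ruby `^` and `$` match at line boundaries, while `\A` and `\z` anchor to the whole string, so the latter are used here.

```ruby
# Column names copied from the mtcars output shown earlier.
names = %w[mpg cyl disp hp drat wt qsec vs am gear carb]

starts_with_c = names.grep(/\Ac/)  # names beginning with "c"
ends_with_b   = names.grep(/b\z/)  # names ending with "b"

p starts_with_c  # prints ["cyl", "carb"]
p ends_with_b    # prints ["carb"]
```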