iteratively constructed dataframe in R


I'm relatively new to R and was wondering what the most efficient way is to construct a dataframe iteratively, one row at a time. The number of iterations "n" and the length of each row "l" are known beforehand. The options I see:

  1. Create empty dataframe, add a row each iteration
  2. Preallocate n x l dataframe, modify a row each iteration
  3. Preallocate n x l matrix, modify a row each iteration, make dataframe from matrix
  4. Something else

Solution

  • Pre-allocate!!!

    And use a matrix if the data are all the same type. It will be much faster than a data.frame.

    For example:

    > n <- 1000      # Number of rows
    > row <- 1:20*1  # one row
    > 
    > # Adding row, one-by-one
    > Data <- data.frame()
    > system.time(for(i in 1:n) Data <- rbind(Data,row))
       user  system elapsed 
       2.18    0.00    2.18 
    > 
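    > # Pre-allocated data.frame
    > # (Data already has n rows from the loop above, so each assignment
    > # overwrites an existing row instead of growing the object)
    > Data <- as.data.frame(Data)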
    > system.time(for(i in 1:n) Data[i,] <- row)
       user  system elapsed 
       0.94    0.00    0.93
    >
    > # Pre-allocated matrix (fast!)
    > # (as.matrix() turns the filled n x 20 data.frame into a numeric matrix of the same size)
    > Data <- as.matrix(Data)
    > system.time({ for(i in 1:n) Data[i,] <- row; Data <- as.data.frame(Data) })
       user  system elapsed 
          0       0       0
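
The timings above reuse the `Data` object left over from the previous step as the pre-allocated container. When starting from scratch, the same idea might look like the sketch below. It assumes every value is numeric; `make_row` is a hypothetical stand-in for whatever computes each row and is not part of the original answer.

```r
n <- 1000        # number of rows, known in advance
l <- 20          # length of each row, known in advance

# Hypothetical placeholder for the real per-iteration computation.
make_row <- function(i) i * seq_len(l)

# Allocate the full n x l numeric matrix once, then fill it row by row.
m <- matrix(NA_real_, nrow = n, ncol = l)
for (i in seq_len(n)) {
  m[i, ] <- make_row(i)
}

# Convert to a data.frame only once, after the loop has finished.
Data <- as.data.frame(m)
```

If the columns have different types, a matrix would coerce them all to one type. In that case one option is to pre-allocate a vector of length n per column, fill those in the loop, and combine them with `data.frame()` at the end.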