
Filtering numeric data within a for-loop in R


I have seven (non-sequential) numbers and a large dataset. I want to filter the data seven times, once per number, and save the results in a list. Each time, only the rows whose value is less than or equal to the corresponding number in vals should be kept.

The dataset looks something like this:

   subj <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
   var1 <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
   value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11) 

   df <- data.frame(subj, var1, value)

And the code I've been working with looks like this:

 vals <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)
 output <- vector("list", length(7))

 for (i in 1:length(vals)) {   
   new_data <- df %>%   
     filter(value <= i)        
   output[[i]] <- new_data     
 }

This runs without errors, but it doesn't filter the data, so I end up with a list where each of the 7 elements is exactly the same. I would like each list element to include only the rows where value <= vals[i], and to discard those where value > vals[i].

I'd also like the name of each element in the list to be the respective value from vals, if possible, though I can't figure out how to do this with numeric data.


Solution

  • Please try

    lapply(vals, \(i) subset(df, value <= i))
    

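    This returns a list of data frames, one per element of vals, each containing only the rows of df where value is less than or equal to that element. (\(i) is R 4.1+ shorthand for function(i).) Your loop returned seven identical data frames because filter(value <= i) compares value with the loop counter i (1 to 7), and every value in df is below 1, so no row is ever dropped.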

    If you prefer to keep your explicit for-loop, index vals inside the loop instead of comparing against the loop counter:

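    library(dplyr)
    # Pre-allocate one slot per threshold
    output <- vector("list", length = length(vals))
    # seq_along() is safe even when vals has length 0 or 1
    for (v in seq_along(vals)) {
      new_data <- df %>%
        filter(value <= vals[[v]])  # compare with the threshold, not the index
      output[[v]] <- new_data
    }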
    

    We can do this in base R as well:

    output <- vector("list", length = length(vals))
    for (v in seq_along(vals)) {
      # Use df$value so the column is used, not a global variable named value
      output[[v]] <- df[df$value <= vals[[v]], ]
    }
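
    The question also asks how to name each list element after its value in vals, which the snippets above do not cover. Here is a minimal base R sketch, assuming the df and vals from the question. List names are always character strings, so the numbers are converted with as.character(); this is one way to do it, not the only one.

    ```r
    # Example data and thresholds copied from the question
    subj  <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
    var1  <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
    value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
    df    <- data.frame(subj, var1, value)
    vals  <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)

    # One filtered data frame per threshold
    output <- lapply(vals, function(v) df[df$value <= v, ])

    # Names must be character, so convert the numeric thresholds
    names(output) <- as.character(vals)

    # Look up one result by its name (a string, not a number)
    output[["0.18"]]
    ```

    Note that as.character(0.50) gives "0.5", so the last element is named "0.5". If you want fixed two-decimal names such as "0.50", sprintf("%.2f", vals) could be used instead. The same names(output) <- ... line also works after either for-loop above.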