I've got a series of seven (non-sequential) numbers and a large dataset. I want to filter the data seven times, once per number, and save the results in a list. In each case, the value column should be less than or equal to the given number from vals.
The dataset looks something like this:
subj <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
var1 <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
df <- data.frame(subj, var1, value)
And the code I've been working with looks like this:
vals <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)
output <- vector("list", length(7))
for (i in 1:length(vals)) {
new_data <- df %>%
filter(value <= i)
output[[i]] <- new_data
}
This runs fine, but it doesn't filter the data, so I end up with a list where each of the 7 elements is exactly the same. I would like each of the 7 list elements to include only the rows where value <= i, and to discard those where value > i.
I'd also like the name of each element in the list to be the respective value from vals, if possible, though I can't figure out how to do this with numeric data.
Please try
lapply(vals, \(i) subset(df, value <= i))
This returns a list of data frames, one per element of vals. Each data frame contains only the rows where value is less than or equal to that element.
If you would rather keep your explicit for loop, compare value against the element of vals rather than the loop index, and size the list with length(vals):
library(dplyr)
output <- vector("list", length = length(vals))
for (v in seq_along(vals)) {
  new_data <- df %>%
    filter(value <= vals[[v]])  # compare against the threshold, not the index
  output[[v]] <- new_data
}
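To see why the original loop produced seven identical data frames, here is a small sketch using the same value and vals from the question. The names kept_by_index, kept_by_value and list_size are only illustrative. The loop index i runs from 1 to 7, and every entry in value is at most 1, so filter(value <= i) never drops a row.

```r
value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
vals  <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)

# Rows kept when comparing against the loop index (1, 2, ..., 7):
# every value is <= 1, so all 10 rows pass each time.
kept_by_index <- sapply(seq_along(vals), function(i) sum(value <= i))

# Rows kept when comparing against the actual thresholds in vals.
kept_by_value <- sapply(vals, function(v) sum(value <= v))

# length(7) is the length of the one-element vector 7, i.e. 1,
# so vector("list", length(7)) creates a list with a single slot.
list_size <- length(vector("list", length(7)))

print(kept_by_index)
print(kept_by_value)
print(list_size)
```

The original code still ended up with seven elements only because assigning to output[[i]] grows the list as needed.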
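We can do this in base R as well:
output <- vector("list", length = length(vals))
for (v in seq_along(vals)) {
  # df$value refers to the column; a bare `value` would only work if a global vector of that name exists
  output[[v]] <- df[df$value <= vals[[v]], ]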
}
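For the second part of the question (naming each list element after its threshold), one possible approach is setNames(), which coerces the numeric vals to character names. This is a minimal sketch using the example data from the question, not the only way to do it.

```r
subj  <- c("A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4", "A5", "A5")
var1  <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)
value <- c(.73, .2, .51, .45, .18, .43, .62, .02, 0, .11)
df    <- data.frame(subj, var1, value)

vals <- c(0.15, 0.18, 0.19, 0.21, 0.24, 0.33, 0.50)

# Filter once per threshold, then name each element after its threshold.
# The numeric names are converted to character, so 0.50 becomes "0.5".
output <- setNames(
  lapply(vals, function(v) df[df$value <= v, ]),
  vals
)

print(names(output))
print(output[["0.18"]])  # look an element up by its (character) name
```

Because the names are character strings, elements are accessed as output[["0.18"]] rather than output[[0.18]]. If you want names with a fixed number of decimals (for example "0.50"), passing format(vals, nsmall = 2) as the names instead of vals should work.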