I've searched extensively, but I can't find a solution that meets all these objectives at once:
I would like to import only the data.frames into a list (i.e., without the character objects or objects of any other format), keeping only columns 3, 4, 5, and 6 of each, and then rbind them all into a single new object in the environment (i.e., no longer a list).
Optional question: given the high number and large size of the data.frames, wouldn't it be better to convert the data.frames to data.tables first?
Thanks for your help.
Sorry, but given the complexity of the case, I don't see how to provide a concrete example to test.
Assuming
paths = list.files('<path_to_top_level_folder>', pattern="\\.RData$", recursive=TRUE, full.names=TRUE)
holds the file paths, that there is one data.frame in each .RData file, and that columns 3:6 exist in each, you might want to start developing something robust from the following:
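## explicit version:
# result =
lapply(paths, \(p) {
  e = new.env()
  load(p, envir = e)  # load into a private environment, not the global one
  # name of the (single) data.frame among the loaded objects
  nm = Filter(\(x) is.data.frame(get(x, envir = e)), ls(e))
  get(nm, envir = e)[3:6]  # keep columns 3 to 6
}) |> data.table::rbindlist() # |> do.call(what='rbind')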
## streamlined:
# result =
lapply(paths, \(i) {
load(i)
get(Filter(\(x) is.data.frame(get(x)), ls()))[3:6]
}) |> data.table::rbindlist()
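Both versions above expect exactly one data.frame per file. As a starting point for something more robust, here is a minimal sketch that tolerates files with zero or several data.frames and skips frames with fewer than six columns. The helper name `read_frames`, the temporary demo files, and the object names `a`, `b`, and `txt` are assumptions made for illustration only.

```r
# Hypothetical helper: return columns `cols` of every data.frame in one .RData file.
read_frames = function(path, cols = 3:6) {
  e = new.env()
  load(path, envir = e)                          # load into a private environment
  objs = mget(ls(e), envir = e)                  # all loaded objects as a named list
  dfs = Filter(is.data.frame, objs)              # drop character vectors etc.
  dfs = Filter(\(d) ncol(d) >= max(cols), dfs)   # skip frames that are too narrow
  lapply(dfs, \(d) d[cols])
}

# Demo data (assumption): one file with two data.frames, one file with none.
tmp = tempfile("rdata_demo_")
dir.create(tmp)
a = mtcars[1:2, ]
b = mtcars[3:5, ]
txt = "skip me"
save(a, b, txt, file = file.path(tmp, "one.RData"))
save(txt, file = file.path(tmp, "two.RData"))
paths = list.files(tmp, pattern = "\\.RData$", recursive = TRUE, full.names = TRUE)

# Flatten the per-file lists into one list of data.frames, then bind the rows.
frames = do.call(c, lapply(paths, read_frames))
result = do.call(rbind, unname(frames))
dim(result)
```

The final `do.call(rbind, ...)` can be swapped for `data.table::rbindlist(frames)` if the data.table package is available.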
If the name of the data.frame is always the same, this can be done even more concisely. Subsetting by column names is less error-prone, but requires that all data frames have the same column names (without typos).
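A minimal sketch of that shorter variant, assuming the data.frame is always stored under the name `df` and that every file uses the same four column names. The name `df`, the `cols` vector, and the temporary demo files are assumptions for illustration, not taken from the question.

```r
# Demo data (assumption): two .RData files, each holding a data.frame `df`
# plus an unrelated character object.
tmp = tempfile("rdata_demo_")
dir.create(tmp)
for (k in 1:2) {
  df = mtcars[1:3, ] * k
  note = "not a data.frame"
  save(df, note, file = file.path(tmp, paste0("file", k, ".RData")))
}
paths = list.files(tmp, pattern = "\\.RData$", recursive = TRUE, full.names = TRUE)

cols = c("disp", "hp", "drat", "wt")  # columns 3:6 of mtcars, selected by name

result = lapply(paths, \(p) {
  e = new.env()
  load(p, envir = e)
  e[["df"]][, cols]   # known object name, so no need to search the environment
}) |> do.call(what = "rbind")

dim(result)
```

Selecting by name fails with an error as soon as one file lacks a column, which is usually preferable to silently binding the wrong columns by position.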
Note that data.table::rbindlist() handles data.frame objects just fine, and is quite fast:
> class(mtcars[3:6])
[1] "data.frame"
> dim(mtcars[3:6])
[1] 32 4
> list(mtcars[3:6], mtcars[3:6] * 2) |> data.table::rbindlist() |> dim()
[1] 64 4
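On the optional question: there is no need to convert each data.frame first, since `rbindlist()` accepts plain data.frames and returns a data.table. The sketch below, which assumes the data.table package is installed, also uses the `idcol` argument to record which list element each row came from and `setDF()` to turn the result back into a plain data.frame if that is preferred. The list names `a` and `b` and the column name `source` are illustrative.

```r
library(data.table)  # assumption: the data.table package is installed

parts = list(a = mtcars[3:6], b = mtcars[3:6] * 2)

# idcol adds a column holding the name of the list element each row came from
res = rbindlist(parts, idcol = "source")
class(res)   # "data.table" "data.frame"
dim(res)     # 64 rows, 5 columns (4 data columns + source)

setDF(res)   # convert to a plain data.frame by reference, without copying
class(res)   # "data.frame"
```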