I have downloaded some Rdata files using the getMasterIndex function from the edgar package.
Now I am trying to load all of these files into RStudio using the following code:
paths <- list.files('Master Indexes', pattern = '[.]Rda$', full.names = TRUE)
files <- map(paths, load)
Printing files gives the following, but there should be data in it:
[1] "year.master"
[1] "year.master"
[1] "year.master"
The output of list_rbind(files) is:
Error in `list_rbind()`:
! Each element of `x` must be either a data frame or `NULL`.
ℹ Elements 1, 2, and 3 are not.
Run `rlang::last_trace()` to see where the error occurred.
However, the last Rda file does get loaded into RStudio, under the name year.master.
I have also tried a for loop, but the result is the same.
I tried to follow this page, but it does not work for me: Using purrr to load multiple rda files
My goal is to put all of the Rda files into a list and then convert it into a dataframe.
Use this:
map(paths, ~ {load(.x); year.master})
# or map_dfr, if you want a dataframe as an output instead of a list
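To see why this works, here is a minimal, self-contained sketch. It assumes every file holds a data frame saved under the name year.master; the toy files, the demo_dir folder and the column names are placeholders I made up, not real edgar output. Because load() is called inside the anonymous function, the object is created in that function's environment and returned from there, so nothing gets overwritten in your workspace.

```r
library(purrr)

# Toy stand-ins for the downloaded files: each .Rda holds a data frame
# saved under the same name, year.master (columns are placeholders).
demo_dir <- file.path(tempdir(), "master_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (yr in c(2006, 2022)) {
  year.master <- data.frame(cik = c(1, 2), year = yr)
  save(year.master, file = file.path(demo_dir, paste0(yr, "master.Rda")))
}
rm(year.master)

paths <- list.files(demo_dir, pattern = "[.]Rda$", full.names = TRUE)

# load() runs inside the anonymous function, so year.master is created in
# that function's own environment and returned, one copy per file.
files <- map(paths, ~ {load(.x); year.master})

# Every element is now a data frame, so list_rbind() can stack them.
combined <- list_rbind(files)
```

Since list_rbind() is already available to you (purrr 1.0.0 or later), calling it on the result should do the same job as map_dfr here.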
Okay, so first of all: clear your environment (or save your environment and start a fresh one). If you're anything like me, you have a lot of things in there that make it hard to see what has been loaded.
Then run this code:
pacman::p_load(edgar, tidyverse)
useragent <- "Your Name Contact@domain.com"
getMasterIndex(2006, useragent)
getMasterIndex(2022, useragent)
paths <- dir("Master Indexes/", full.names = TRUE) |> grep(pattern = "\\.Rda", value = TRUE)
example <- load(paths[1])
files <- map(paths, ~ load(.x, .GlobalEnv))
Afterwards, you'll see a few things in your environment:
As you can see, even though (it seems) you were trying to load the files into the object files, they aren't actually stored there. They are stored under another name, "year.master", and load() returns that name rather than the data. The R objects are restored under the names they were saved with.
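A quick way to check this for yourself, sketched with a throwaway file rather than a real master index (the data frame contents are made up for illustration):

```r
# Save a toy data frame under the name year.master, then remove it.
tmp <- tempfile(fileext = ".Rda")
year.master <- data.frame(cik = c(1, 2))
save(year.master, file = tmp)
rm(year.master)

# load() restores the object as a side effect and returns only its name.
loaded_names <- load(tmp)

print(loaded_names)                # the character string "year.master"
print(is.data.frame(loaded_names)) # FALSE: this is a name, not the data
print(is.data.frame(year.master))  # TRUE: the data came back under its old name
```

That is why each element of files is just the string "year.master", and why list_rbind() complains that the elements are not data frames.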
From the documentation:
load(<file>) replaces all existing objects with the same names in the current environment (typically your workspace, .GlobalEnv) and hence potentially overwrites important data. It is considerably safer to use envir = to load into a different environment, or to attach(file) which load()s into a new entry in the search path.
In other words, because they all have the same name, running map(paths, ~ load(.x, .GlobalEnv)) will load all of them, but you'll only get the last one, because every one after the first will overwrite the one that came before it.
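Following the documentation's advice about envir =, one way around the overwriting is to load each file into its own fresh environment and pull the object out of it. This is only a sketch in base R: load_master is a helper name I made up, the toy files and columns are placeholders, and it assumes each file contains an object called year.master.

```r
# Toy stand-ins for the downloaded files (columns are placeholders).
demo_dir <- file.path(tempdir(), "master_env_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (yr in c(2006, 2022)) {
  year.master <- data.frame(cik = c(1, 2), year = yr)
  save(year.master, file = file.path(demo_dir, paste0(yr, "master.Rda")))
}
rm(year.master)

# Hypothetical helper: load one file into a throwaway environment and
# return the object stored in it, leaving the workspace untouched.
load_master <- function(path) {
  env <- new.env()
  load(path, envir = env)
  get("year.master", envir = env)
}

paths <- list.files(demo_dir, pattern = "[.]Rda$", full.names = TRUE)

# One data frame per file, kept side by side in a list.
files <- lapply(paths, load_master)

# Stack the list into a single data frame.
combined <- do.call(rbind, files)
```

With the real files you would point paths at 'Master Indexes' instead of demo_dir. If you prefer purrr, map(paths, load_master) followed by list_rbind() should give the same result.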