I am really not sure how to phrase this concisely.. My question is: Is it possible to add an error handling feature so that if a data file (such as a csv) fails to load as a table/tibble, create a blank version of it?
Here is what I mean:
My normal csv load looks like this:
Monday2 <- paste0(my_file_location/my_file_name",Monday,".csv")
leads1 <- tibble(read.csv(Monday2))
Tuesday2 <- paste0("my_file_location/My_file_name",Tuesday,".csv")
leads2 <- tibble(read.csv(Tuesday2))
Wednesday2 <- paste0("my_file_location/my_file_name",Wednesday,".csv")
leads3 <- tibble(read.csv(Wednesday2))
If for some reason my csv failed to load (the file doesn't exist, or I entered the name incorrectly for example) can a blank version of it be created?
My idea for the blank tibble would look like this:
Leads21 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads22 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads23 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
This blank tibble would be the exact same columns as a properly loaded file. I have 5 files I bind each Friday in an automated process.. and if a file fails to load I can catch it downstream in my process (one of the columns is the file name/date) but I don't want the whole process to fail.
a typical 'failed to load' error looks like this:
In file(file, "rt") : cannot open file 'my_file_location/My_file_name_2022-03-27.csv': No such
file or directory
The bind of all 5 files then fails with an error message like:
### Join full weeks worth of leads into 1 file
Leads <- bind_rows(leads1,leads2,leads3, leads4, leads5)
Error in list2(...) : object 'leads1' not found
This then causes the rest of my code to fail/act incorrectly. If I can bind an empty tibble, my code could finish running and I can check for missing files at the end. Ultimately if a file is missing it is not as important as processing the existing files (so stopping my code to locate/fix the failed load is not important)
My background is in microsoft access VBA and I keep trying to write something like:
If tibble Leads1 exists, use it.. If tibble Leads1 does not exist use Leads21
not sure how to do this in R. I have been trying to read/understand the try() wrapper, but I don't understand how to use it in my case.
Here is how I ended up working this issue out. Not sure it is the most elegant, but it does the job.
I create the blank tibble first. I create a blank tibble for each file that I am loading (scalability is terrible with this method); then I read each .csv file. If a .csv file fails to load then my blank tibble is not replaced and will be available for the bind I do at the end.
leads1 <- tibble("Column_1" = "", "Column_2" = "")
leads1$filename <- Monday
Monday2 <- paste0("my_file_location/File_name_part1",Monday,".csv")
leads1 <- tibble(read.csv(Monday2, colClasses = 'character'))
leads1$filename <- Monday
Leads <- bind_rows(leads1,leads2,leads3, leads4, leads5) %>% distinct(Column1, .keep_all = TRUE)