This code reads a xlsx file and creates individualy named files based on sheet number and a value found at a particular location (in this case temp[2,1]). However because each file and sheet is slightly different the names are inconsistant.
sheet_to_read <- c(11,12,13,14)
for( excelsheet in files) {
for (sheet in sheet_to_read) {
temp <- read_excel( path = excelsheet, sheet = sheet, col_names = FALSE)
write.csv( temp, file = paste0( "./", gsub("./", "", excelsheet), temp[2,1], sheet,".csv") )
}}
I would like is a way of naming the files with TRUE or FALSE if a specific character string is present any where within the sheet, in this case 'vivax'. So in this case:
GBD2016_2_915_Boletin Epidemiologico_2016_37sheet21true.xls
GBD2016_2_915_Boletin Epidemiologico_2016_37sheet22false.xls`
Example file: https://drive.google.com/file/d/1p4HAuFl7Codine1Vvb8SzA7OHTzraaHz/view?usp=sharing
Since you have a tibble, and it is not known which column has which type, I created this:
isWordInTibble <- function(word, tibble) {
# case insensitive
any(unlist(
sapply(1:ncol(tibble),
function(i) tolower(word) %in% tolower(as.character(unlist(tibble[, i]))))))
}
It looks whether the word is in any of the column vectors - looks through all columns.
Replace your file argument in your write.csv
command by:
file = gsub(".xls",
paste0(substr(temp[2, 1],
1,
5), # just first 5 letters
gsub("sheet", "", sheet),
substr(tolower(as.character(isWordInTibble("vivax", tmp))),
1,
1), # just first letter ("t" or "f")
".csv"),
excelsheet)
Then it might work!
I tried to shorten the names using substr(, start, end)
and gsub()
.
Appendix
Since you asked how to print then only the files containing the word but not those which don't contain it:
In your example, instead of the write.csv()
command, replace it with:
if (isWordInTibble("vivax", tmP)) {
write.csv(temp,
file = gsub(".xls",
paste0(substr(temp[2, 1],
1,
5), # just first 5 letters
gsub("sheet", "", sheet),
substr(tolower(as.character(isWordInTibble("vivax", temp))),
1,
1), # just first letter ("t" or "f")
".csv"),
excelsheet))
}
Then it prints out only if the isWordInTibble
returns TRUE
.