I have a folder that includes hundreds of txt files with specific financial data for each company code/year/month combination (for specific companies that are designated by the four digit number in the front of the file). I need a way to loop through all these files and create an xlsx for each one for audit purposes.
For example, the folder will include txt files like this:
8800-2015-01.txt
8800-2015-02.txt
......and many more like this......
All files have the exact same columns, column names, etc. I need to read them in and create xlsx files for each one and save them to the same folder.
In the example above, this folder would then include the txt and xlsx files like this:
8800-2015-01.txt
8600-2015-02.txt
8800-2015-01.xlsx
8800-2015-02.xlsx
I can write code to list out all the files for specific patters using this:
library(writexl)
setwd(path)
list = list.files(pattern = flob2rx('*8800*.txt*')
Then could do something like this:
txt_files <- lapply(list, function(x) read_delim(x,
"|",
escape_double = FALSE,
trim_ws = TRUE,
skip = 6))
But how would you then loop through keeping the same naming convention for each file and writing all the xlsx to the same folder?
Editing to show what I did. Solutions provided gave me ideas and lead to final of what I used. I also had to do some filtering and select specific columns which wasn't in original post.
library(writexl)
setwd(path)
file_list = list.files(pattern = glob2rx('*880*.TXT*'))
for (i in file_list){
write_xlsx(
filter(
select(
read_delim(i,
"|",
skip = 6,
escape_double = FALSE,
trim_ws = TRUE),
"col1","col2",......),
!is.na(col1) & col2 != 'something'),
paste0(unlist(strsplit(i,"\\."))[1], ".xlsx"))
}
I'm going to assume your files have the columns separated with tab, that is, '\t' in R.
For writing xlsx you're goint o need install.packages("xlsx")
. I'm not familiar with writexl
then you can just:
library(xlsx)
setwd(path)
files=dir(pattern="*.txt")
for(i in files){
write.xlsx(read.csv(i,sep="\t"),paste0(unlist(strsplit(i,"\\."))[1],".xlsx"))
}