I am trying to use the base R function list.files() to return a list of file names (.txt files) so that I can import them in an automated way instead of writing a read.table() line per file.
Here is the problem that I'm encountering:
Using the list.files() function, I get the following output:
> list.files(pattern = "\\.txt$")
[1] "~$ DD - maskedfiletitle.txt"
[2] "~$ KP - maskedfiletitle.txt"
[3] "TF DD - maskedfiletitle.txt"
[4] "TL XF - maskedfiletitle.txt"
[5] "UR FG - maskedfiletitle.txt"
[6] "VB PD - maskedfiletitle.txt"
[7] "VS KP - maskedfiletitle.txt"
The desired output is the following:
[1] "TF DD - maskedfiletitle.txt"
[2] "TL XF - maskedfiletitle.txt"
[3] "UR FG - maskedfiletitle.txt"
[4] "VB PD - maskedfiletitle.txt"
[5] "VS KP - maskedfiletitle.txt"
It seems to always return the first and last file in the folder with the first two characters replaced by "~$". Obviously, if the next step is to read these files it will give an error message saying that the "~$" file does not exist.
For now I have worked around this by simply removing the first two elements. However, I have no answer as to why this behaviour occurs.
I have tried removing all non-.txt files from the folder and calling the function with different arguments:
> list.files(all.files = FALSE, no.. = TRUE)
[1] "~$ DD - maskedfiletitle.txt"
[2] "~$ KP - maskedfiletitle.txt"
[3] "TF DD - maskedfiletitle.txt"
[4] "TL XF - maskedfiletitle.txt"
[5] "UR FG - maskedfiletitle.txt"
[6] "VB PD - maskedfiletitle.txt"
[7] "VS KP - maskedfiletitle.txt"
This, however, also gives me the first and last file with the first two characters changed to "~$".
Now, it's not a critical error or anything, but I'm interested in learning where this behaviour comes from. I have read through the help section of the function and I've searched a bit on the web but I cannot find anything that explains it and I am quite stumped.
Do let me know if I need to provide more information!
Usually any file that starts with ~$ is a temporary file (at least on Windows). If any of your .txt files are currently open, try closing them first.
Otherwise, I thought you might be able to combine the "files that end with .txt" and "files that don't start with ~$" conditions into one regex to use in the pattern argument (using "^(?!~\\$).*\\.txt$"). However, the pattern argument in list.files doesn't support negative lookahead directly, so you need to do it in two steps:
1. List the files that end with .txt (as you've already done)
2. Remove the files that start with ~$
my_files <- list.files(pattern = "\\.txt$")
txt_files <- my_files[!grepl("^~\\$", my_files)]
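If a single expression is preferred, the lookahead pattern mentioned above can still be applied after listing, since grepl() accepts perl = TRUE and PCRE supports negative lookahead. The following is only a sketch: the folder, the file names, and the two-column file contents are made up for illustration, and header = TRUE is an assumption about what the real files look like.

```r
# Build a throwaway folder with hypothetical files, including one "~$" temporary file
tmp <- file.path(tempdir(), "list_files_demo")
dir.create(tmp, showWarnings = FALSE)
demo_names <- c("~$ DD - maskedfiletitle.txt",
                "TF DD - maskedfiletitle.txt",
                "VS KP - maskedfiletitle.txt")
for (nm in demo_names) {
  writeLines(c("x y", "1 2"), file.path(tmp, nm))
}

# List everything, then keep names that end in .txt and do not start with "~$".
# perl = TRUE switches grepl() to PCRE, which understands the (?!...) lookahead.
all_files <- list.files(tmp)
keep <- grepl("^(?!~\\$).*\\.txt$", all_files, perl = TRUE)
txt_files <- all_files[keep]

# Read every remaining file into a named list of data frames
# (header = TRUE is an assumption about the file layout)
tables <- lapply(file.path(tmp, txt_files), read.table, header = TRUE)
names(tables) <- txt_files
```

Each element of tables can then be accessed by file name, which replaces writing one read.table() line per file.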