I have a directory with files. The archive is very big and has 1.5 million PDF files inside.
The directory is stored on an IBM i server running OS V7R1; the machine is new and very fast.
The files are named like this:
invoice_[custno]_[year]_[invoice_number].pdf
invoice_081500_2013_7534435564.pdf
Now I try to find files with the find command in the shell.
find . -name 'invoice_2013_*.pdf' -type f | ls -l > log.dat
The command took a long time, so I aborted the operation with no result.
If I try it on smaller directories, everything works fine.
Later I want a job that runs every day and finds the files created in the last 24 hours, but it always runs so slowly that I can forget about this approach.
That invocation would never work, because ls does not read filenames from stdin.
Possible solutions are:
Use the find utility's built-in -ls listing option:
find . -name 'invoice_2013_*.pdf' -type f -ls > log.dat
Use the find utility's -exec option to execute ls -l for each matching file (a batched variant is sketched after this list):
find . -name 'invoice_2013_*.pdf' -type f -exec ls -l {} \; > log.dat
Pipe the filenames to the xargs utility and let it execute ls -l with the filenames as parameters:
find . -name 'invoice_2013_*.pdf' -type f | xargs ls -l > log.dat
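With 1.5 million files, spawning one ls process per match (the \; form) is itself slow. A minimal sketch of the batched variant, assuming a POSIX-conformant find in the IBM i PASE shell (the + terminator passes many filenames to each ls -l invocation, much like xargs does):
# Batch many filenames into each ls -l call instead of one process per file
find . -name 'invoice_2013_*.pdf' -type f -exec ls -l {} + > log.dat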
A pattern search of 1.5 million files in a single directory is going to be inefficient on any filesystem.
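For the planned daily job, restricting the search by modification time keeps find from reporting every file; this is a sketch only, assuming the standard -mtime primary is available in the PASE shell and that modification time is an acceptable stand-in for creation time:
# Report PDFs changed within the last 24 hours (-mtime -1 = less than one day old)
find . -name '*.pdf' -type f -mtime -1 -exec ls -l {} + > log.dat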