I have a Ruby script that waits for a scanned PDF document to be uploaded to a path and then does some additional processing and sends notifications. The scanner is a Canon ImageRunner using scan-to-FTP, and the server runs vsftpd with a very simple config.
require 'rb-inotify'
notifier = INotify::Notifier.new
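# Fires once a file opened for writing in the watched directory has been closed.
notifier.watch("/root/test", :close_write) do |event|
  p event.absolute_name  # full path of the file
  p event.name           # file name only
end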
notifier.run
Usually everything works and we get back the full path and the file name as expected.
"/root/test/doc1.pdf"
"doc1.pdf"
"/root/test/doc2.pdf"
"doc2.pdf"
But we have a new scanner device, and when we scan a document from that machine the initial file names are reported with an additional ~ (tilde) appended to the end of the name (which breaks lots of things elsewhere in the processing script).
"/root/test/doc1.pdf~"
"doc1.pdf~"
"/root/test/doc2.pdf~"
"doc2.pdf~"
But when I check the actual folder, the files are stored with the correct names, without the ~.
I'm guessing this is some weird issue with the way the new scanner actually writes the file to the server.
As a very quick and dirty hack I tested removing the trailing ~ with [0...-1] from the path, which superficially works but breaks other things later in the bigger processing script.
What is the safest way to work around this issue so that I only ever get the correct file name reported? (Just removing the last character isn't practical, as manual uploads and the other scanners that might be used all return the correct filename, and with my dirty hack they would end up as doc1.pd)
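To make the problem with the blanket slice concrete, here is a minimal sketch in plain Ruby with no inotify involved. The helper name strip_trailing_tilde is made up for illustration and is not part of the real script; it only drops a ~ when one is actually there.

```ruby
# Hypothetical helper (not in the original script): remove one trailing "~"
# if present, and leave every other name untouched.
def strip_trailing_tilde(name)
  name.chomp("~")
end

# The blanket slice fixes the new scanner's names but damages correct ones.
p "doc1.pdf~"[0...-1]  # prints "doc1.pdf"
p "doc1.pdf"[0...-1]   # prints "doc1.pd"

# The conditional version is safe for both kinds of name.
p strip_trailing_tilde("doc1.pdf~")  # prints "doc1.pdf"
p strip_trailing_tilde("doc1.pdf")   # prints "doc1.pdf"
```

This only cleans up the reported name; it does not explain why the new scanner reports the ~ in the first place.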
As suspected, this is some artefact of the newer scanner hardware; the older units and the MFDs used for testing didn't have this issue.
Based on engineersmnky's comment I ended up with the following:
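notifier.watch("/root/test/", :close_write) do |event|
  # Keep everything up to the last character that is not a ~,
  # which drops any trailing ~ and leaves clean names unchanged.
  file_path = event.absolute_name[/(.*[^~])/]
  file_name = event.name[/(.*[^~])/]
  puts "#{file_path} is now uploaded"
  sleep(1) # let the file system catch up before touching the file
  copy_file(file_path, file_name) # copy_file is defined elsewhere in the script
end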
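The [/(.*[^~])/] strips the errant ~ from the file name: the match is greedy and must end on a character that is not a ~, so any trailing ~ is left out while names without one come back unchanged. The sleep(1) is enough for the file system to catch up, and the rest of the script runs as expected.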
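A fixed sleep(1) works here but is a guess at the timing. As a possible alternative, the sketch below polls until the cleaned-up path actually exists, with a timeout. The helper wait_for_file and its timeout and interval defaults are hypothetical and would need tuning for a real setup.

```ruby
# Hypothetical helper (not in the original script): poll until `path` exists
# as a regular file, or give up after `timeout` seconds.
# Returns true if the file appeared in time, false otherwise.
def wait_for_file(path, timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until File.file?(path)
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
  true
end

# Possible use inside the watch block, in place of sleep(1):
#   if wait_for_file(file_path)
#     copy_file(file_path, file_name)
#   else
#     warn "#{file_path} never appeared"
#   end
```

If the new scanner uploads under a temporary name ending in ~ and then renames the file (an assumption, not something confirmed here), watching :moved_to alongside :close_write might report the final name directly and make both the regex and the wait unnecessary.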