I am trying to learn python-watchdog, but I am confused about why the job I set up runs more than once. Here is my setup:
#handler.py
import os
from watchdog.events import FileSystemEventHandler
from actions import run_something

def getext(filename):
    return os.path.splitext(filename)[-1].lower()

class ChangeHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if event.is_directory:
            return
        if getext(event.src_path) == '.done':
            run_something()
        else:
            print "event not directory.. exiting..."
            pass
The observer is set up like so:
#observer.py
import os
import time
from watchdog.observers import Observer
from handler import ChangeHandler

BASEDIR = "/path/to/some/directory/bin"

def main():
    while 1:
        event_handler = ChangeHandler()
        observer = Observer()
        observer.schedule(event_handler, BASEDIR, recursive=True)
        observer.start()
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            observer.stop()
        observer.join()

if __name__ == '__main__':
    main()
And finally, the actions look like so:
#actions.py
import os
import subprocess

def run_something():
    output = subprocess.check_output(['./run.sh'])
    print output
    return None
...where ./run.sh is just a shell script I would like to run when a file with the extension .done is found in /path/to/some/directory/bin:
#run.sh
#!/bin/bash
echo "Job Start: $(date)"
rm -rf /path/to/some/directory/bin/job.done # remove the .done file
echo "Job Done: $(date)"
However, when I run python observer.py and then do a touch job.done in /path/to/some/directory/bin, I see that my shell script ./run.sh runs three times and not once.
I am confused about why this runs three times and not just once (I do delete the job.done file in my bash script).
To debug watchdog scripts, it is useful to print the events that watchdog sees. A single file edit or CLI command, such as touch, can result in multiple watchdog events. For example, if you insert a print statement:
class ChangeHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        print(event)
to log every event, running
% touch job.done
generates
2014-12-24 13:11:02 - <FileCreatedEvent: src_path='/home/unutbu/tmp/job.done'>
2014-12-24 13:11:02 - <DirModifiedEvent: src_path='/home/unutbu/tmp'>
2014-12-24 13:11:02 - <FileModifiedEvent: src_path='/home/unutbu/tmp/job.done'>
Above, there were two events with src_path ending in job.done. Thus,
    if getext(event.src_path) == '.done':
        run_something()
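runs twice because there is a FileCreatedEvent and a FileModifiedEvent. The third run reported in the question most likely comes from the FileDeletedEvent generated when run.sh removes job.done, since on_any_event receives deletions too.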
You might be better off only monitoring FileModifiedEvents.
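As a rough illustration of that filtering, here is a minimal sketch of a predicate that accepts only modification events on .done files. FakeEvent and should_run are hypothetical names introduced for this example. FakeEvent merely mimics the event_type, src_path and is_directory attributes that watchdog events carry, so the sketch runs without watchdog installed. The final "deleted" entry is an assumption about what removing job.done in run.sh would produce; it is not taken from the log above.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for a watchdog event, used only so this sketch runs
# on its own. Real watchdog events expose the same three attributes.
FakeEvent = namedtuple("FakeEvent", ["event_type", "src_path", "is_directory"])


def getext(filename):
    """Return the lower-cased file extension, e.g. '.done'."""
    return os.path.splitext(filename)[-1].lower()


def should_run(event):
    """Return True only for a 'modified' event on a file ending in .done."""
    if event.is_directory:
        return False
    if event.event_type != "modified":
        return False
    return getext(event.src_path) == ".done"


# The three events logged for `touch job.done`, plus the deletion that
# run.sh is assumed to trigger when it removes the file.
events = [
    FakeEvent("created", "/home/unutbu/tmp/job.done", False),
    FakeEvent("modified", "/home/unutbu/tmp", True),
    FakeEvent("modified", "/home/unutbu/tmp/job.done", False),
    FakeEvent("deleted", "/home/unutbu/tmp/job.done", False),
]

triggered = [e for e in events if should_run(e)]
print(len(triggered))  # only the file modification passes the filter
```

In the ChangeHandler from the question, the same check could guard the call to run_something() inside on_any_event. Overriding on_modified instead of on_any_event should have a similar effect, because FileSystemEventHandler routes each event to the method matching its type. Note that a program which writes to the file several times may still produce several modification events.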