I've got a situation where I watch a specific directory for filesystem changes. If a certain file in that directory is changed, I re-read it, attach some existing cached information, and store it in an atom
.
The relevant code looks like
(def posts (atom []))
(defn load-posts! []
(swap!
posts
(fn [old]
(vec
(map #(let [raw (json/parse-string % (fn [k] (keyword (.toLowerCase k))))]
(<snip some processing of raw, including getting some pieces from old>))
(line-seq (io/reader "watched.json")))))))
;; elsewhere, inside of -main
(watch/start-watch
[{:path "resources/"
:event-types [:modify]
:callback (fn [event filename]
(when (and (= :modify event) (= "watched.json" filename))
(println "Reloading posts.json ...")
(posts/load-posts!)))}
...])
This ends up working fine locally, but when I deploy it to my server, the swap!
call hangs about half-way through.
I've tried debugging it via println
, which told me
swap!
is not running the function more than once111
(which doesn't seem to be significantly different from any preceding entries).atom
is therefore preservedI suspect that this is either a memory issue somewhere, or possibly a bug in Clojure-Watch (or the underlying FS-watching library).
Any ideas how I might go about fixing it or diagnosing it further?
The hang is caused by an error being thrown inside of the function passed as a :callback
to watch/start
.
The root cause in this case is that the modified file is being copied to the server by scp
(which is not atomic, and the first event therefore triggers before the copy is complete, which is what causes the JSON parse error to be thrown).
This is exacerbated by the fact that watch/start
fails silently if its :callback
throws any kind of error.
The solutions here are
Use rsync
to copy files. It does copy atomically but it will not generate any :modify
events on the target file, only related temp-files. Because of the way its atomic copy works, it will only signal :create
events.
Wrap the :callback
in a try
/catch
, and have the catch
clause return the old value of the atom. This will cause load-posts!
to run multiple times, but the last time will be on file copy completion, which should finally do the right thing.
(I've done both, but either would have realistically solved the problem).
A third option would be using an FS-watching library that reports errors, such as Hawk or dirwatch (or possibly hara.io.watch? I haven't used any of these, so I can't comment).
Diagnosing this involved wrapping the :callback
body with
(try
<body>
(catch Exception e
(println "ERROR IN SWAP!" e)
old))
to see what was actually being thrown. Once that printed a JSON parsing error, it was pretty easy to gain a theory of what was going wrong.