Scenario: I have many processes running that need to fetch files over the net. If the file is already downloaded, I want it cached on disk. If another process is downloading the file, block until it is finished downloading.
I've been trying to find the easiest way to do this. The obvious way is to:
    create file w/ an exclusive lock active on it only if it doesn't exist (O_CREAT | O_EXCL)
    if file exists already:
        open file and acquire exclusive lock
    else:
        download to newly created file
    release lock
This system accomplishes the above goals with (seemingly) no race conditions.
Unfortunately, I couldn't find documentation on how to use open(), etc. to create a file that is locked in Linux. If I split the create step into:
open w/ O_CREAT | O_EXCL
flock
a race condition now exists between the create and the lock: a non-creating process can acquire the lock before the creator does.
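To make the window concrete, here is a minimal Python sketch of the two-step version. The function name `create_then_lock` is made up for illustration, and it assumes Linux, where `fcntl.flock` wraps flock(2).

```python
import fcntl
import os


def create_then_lock(path):
    """Two-step create + lock. Returns (fd, created)."""
    try:
        # Step 1: succeeds only if the file does not exist yet.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o644)
        created = True
    except FileExistsError:
        # Someone else created it first; open the existing file instead.
        fd = os.open(path, os.O_RDWR)
        created = False
    # Race window: the file is already visible here, but the creator does
    # not hold the lock yet. A non-creating process can reach flock() first,
    # get the lock, and see an empty or partial file.
    fcntl.flock(fd, fcntl.LOCK_EX)  # Step 2: blocks until the lock is free
    return fd, created
```

Nothing in this sketch tells a process that opened the existing file whether the content behind it is complete, which is exactly the problem.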
I realize I could use an external lock file per file (e.g. filename + '.lock'), which I acquire before attempting to create filename, but this feels... inelegant (and I now need to worry about how to handle files that actually have a .lock suffix!)
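For reference, the external lock file variant might look roughly like the sketch below. `fetch_with_lockfile` and the `download` callable (assumed to return the file contents as bytes) are hypothetical names, not part of any library.

```python
import fcntl
import os


def fetch_with_lockfile(path, download):
    """Download to `path` once, serialised through `path + '.lock'`."""
    # The lock file is created if missing; it carries no content itself.
    lock_fd = os.open(path + ".lock", os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks while another process holds the lock.
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        if not os.path.exists(path):
            data = download()  # hypothetical: returns bytes
            # Note: a crash during this write leaves a partial file behind.
            with open(path, "wb") as f:
                f.write(data)
        return path
    finally:
        # Closing the descriptor releases the flock.
        os.close(lock_fd)
```

Because the existence check happens only while the lock is held, a waiter that wakes up after the downloader finishes sees the completed file and skips the download.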
Is there any way to atomically create and lock a file (as Windows offers), or is the external lockfile method pretty much what is standard/required?
The race exists anyway. If the file may or may not exist, then you have to test for its existence before trying to lock it. But if the file is your mutex, then you can't possibly do that, and the window between "if file exists already" (false) and "download to newly created file" is unconstrained. Another process could come by, create the file, and start downloading before your download begins, and you would clobber it.
Basically, don't use fcntl locks here; use the existence of the file itself. open() with O_CREAT and O_EXCL will fail with EEXIST if the file already exists, telling you that someone else got there first.
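A minimal Python sketch of that check follows. The helper name `try_claim` is invented for illustration. How the losing processes find out that the download has finished is not covered above, so it is left out here as well.

```python
import os


def try_claim(path):
    """Return a writable fd if this process created `path`, else None."""
    try:
        # O_CREAT | O_EXCL makes "check for existence and create" one
        # atomic step: exactly one caller can succeed for a given path.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        # EEXIST: someone else got there first.
        return None
```

The caller that gets a descriptor back is the one responsible for downloading. Every other caller gets `None` and knows the file is either in progress or already complete.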