Tags: python, file, python-3.8

Python writing to file from multiple separate Python processes, do I have to lock file?


I have a Python script that is running in a number of different Kubernetes pods at all times (minimum 1, max ~100 at the same time).

These processes are largely independent of each other, except that at one point, they have to write to the same file (last_appended.txt) in the following fashion:

import time

with open(filepath, 'w') as file:
    file.write(str(int(time.time())))  # time.time() returns the current Unix time

I am wondering if I have to do any sort of locking on this file or if this is such a minimal operation that this is not necessary?

If I would want to lock the file, I have found the following code to enable this:

import fcntl
import time

with open(filepath, 'w') as file:
    fcntl.flock(file, fcntl.LOCK_EX)  # blocks until the exclusive lock is held
    file.write(str(int(time.time())))
    fcntl.flock(file, fcntl.LOCK_UN)  # release the lock

However, I wonder whether this is enough to let my processes run smoothly, or whether I need to write some sort of try/except loop for when a process encounters a lock.

Summarising, my question is two-fold:

  1. Would I have to lock the file at all, or is this such a small operation that it can be done by multiple processes without crashing, and
  2. If locking is needed, would my solution to the multiple processes writing to the file be sufficient?

Solution

  • Unrelated: using w mode in this context is odd; did you mean a (append) mode here?


    As you are using fcntl, I shall assume a Unix-like system here.

    If you do not use locks, you have what is called a race condition. Under light load, the risk of a problem is close to 0, but it increases under higher load. This is something that sysadmins hate, because it leads to non-reproducible problems.

    A lock does cost some resources, but under normal load (where the non-locking version would not experience any problems) there would be no contention on that lock, so the cost should not be noticeable. Under heavy load, it would prevent garbled messages if two processes tried to write at the same time.
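
    A minimal sketch of a locked write, under a few assumptions: a Unix-like system, a shared volume that honours flock, and a hypothetical helper name write_timestamp_locked. One detail it tries to address is that open(filepath, 'w') empties the file at open time, before any lock is held, so the sketch opens without truncating and only truncates once it owns the lock. It also flushes before unlocking, since buffered data would otherwise reach the file only when it is closed.

```python
import fcntl
import os
import time


def write_timestamp_locked(filepath):
    """Overwrite filepath with the current Unix time while holding an exclusive lock."""
    # O_CREAT without O_TRUNC: create the file if it is missing, but do not
    # empty it before the lock is held (mode 'w' would truncate at open time).
    fd = os.open(filepath, os.O_WRONLY | os.O_CREAT, 0o644)
    with os.fdopen(fd, "w") as file:
        fcntl.flock(file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            file.truncate(0)  # safe now: this process is the only lock holder
            file.write(str(int(time.time())))
            file.flush()  # push buffered data out before releasing the lock
        finally:
            fcntl.flock(file, fcntl.LOCK_UN)
```

    Calling write_timestamp_locked("last_appended.txt") from each process should then leave exactly one complete timestamp in the file, whichever process wrote last.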

    If you want to limit contention under heavy load, you could wait for the lock with a short timeout. Some systems make this easy; on others you have to call alarm explicitly. If the lock is acquired, just proceed with writing to the file. Otherwise, skip that write and, if possible, log the error condition (elsewhere) for later analysis.
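
    The timeout idea could be sketched as follows. Note that this version polls with the non-blocking flag fcntl.LOCK_NB instead of using alarm, which is a substitution on my part. The function name try_write_timestamp and the timeout and poll_interval defaults are illustrative assumptions, not values from the question.

```python
import fcntl
import os
import time


def try_write_timestamp(filepath, timeout=1.0, poll_interval=0.05):
    """Write the current Unix time; return False if the lock was not obtained in time."""
    deadline = time.monotonic() + timeout
    # Open without truncating so the file is not emptied before the lock is held.
    fd = os.open(filepath, os.O_WRONLY | os.O_CREAT, 0o644)
    with os.fdopen(fd, "w") as file:
        while True:
            try:
                # LOCK_NB makes flock fail immediately instead of blocking.
                fcntl.flock(file, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    return False  # give up; the caller can log this elsewhere
                time.sleep(poll_interval)
        try:
            file.truncate(0)
            file.write(str(int(time.time())))
            file.flush()  # write the data out before releasing the lock
        finally:
            fcntl.flock(file, fcntl.LOCK_UN)
    return True
```

    A caller would check the return value and, on False, skip the write and record the failure somewhere other than the contended file.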