python python-3.6 file mtime

How does os.stat(path).st_mtime actually get file modified time?


In this example, I'm using Python 3.6.5 installed using pyenv in an OSX shell.

I've been toying around with some proof-of-concept file-watching code, and I figured that comparing a file's current st_mtime with the last measured one would be enough to "detect" that the file has changed.

The code:

import os


def main():
    file_path = 'myfile.txt'
    last_modified = os.stat(file_path).st_mtime
    while True:
        check_last_modified = os.stat(file_path).st_mtime
        delta = check_last_modified - last_modified

        if delta != 0.0:
            print("File was modified.")

        last_modified = check_last_modified



if __name__ == '__main__':
    main()

The weird thing is that some basic file modification operations result in "File was modified." printing more than once.

Assuming myfile.txt exists, I get a different number of prints based on the operation:

It prints 1 time with: $ touch myfile.txt

It prints 2 times with: $ echo "" > myfile.txt

It prints 1 time with:

$ cat <<EOF > myfile.txt
> EOF

It prints 2 times with (empty line):

$ cat <<EOF > myfile.txt
>
> EOF

It prints 1 time using python to write an empty string:

def main():
    with open('myfile.txt', 'w') as _file:
        _file.write('')

if __name__ == '__main__':
    main()

It prints 2 times using python to write a non-empty string:

def main():
    with open('myfile.txt', 'w') as _file:
        _file.write('a')

if __name__ == '__main__':
    main()

The biggest difference seems to be the presence of a string other than a newline, but seeing as how the echo command results in two prints I'm not inclined to believe it's bound to that in any way.

Any ideas?


Solution

  • Your loop is a busy-waiting loop, so it polls fast enough to catch several timestamp updates in quick succession.

    When Python opens the file for writing (open with mode 'w'), the existing file is truncated, which updates its modification time.

    The modification time is updated once more when the data is actually written, which happens as the file is closed. That explains why you catch 2 time updates. If nothing is written (an empty string, or an empty here-document), only the first update occurs.

    touch just sets the modification time once, but echo acts the same as your Python script: the modification time is set when the existing file is opened and truncated, and set again when the data is written before the file is closed.

    The busy loop and the open/close operations create a race condition, so the number of time updates you see is not guaranteed in general (which may also contribute to your script seeing only one update for a cat command where the data is small).
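The two updates can be observed directly by sampling the modification time between the steps. The following is only a minimal sketch: mtime_steps and its pause parameter are hypothetical names introduced here, and the pause assumes a filesystem whose timestamps are finer than about 50 ms. It reads st_mtime_ns, the integer-nanosecond counterpart of st_mtime.

```python
import os
import tempfile
import time


def mtime_steps(path, data, pause=0.05):
    """Return st_mtime_ns before open, after open('w'), and after close.

    `pause` is an assumed delay so that consecutive updates land on
    distinct timestamps even when the filesystem clock is coarse.
    """
    before = os.stat(path).st_mtime_ns
    time.sleep(pause)

    f = open(path, 'w')   # mode 'w' truncates the file: first mtime update
    after_open = os.stat(path).st_mtime_ns
    time.sleep(pause)

    f.write(data)         # buffered in memory, not yet written to the file
    f.close()             # flushes the buffer: second update if data is non-empty
    after_close = os.stat(path).st_mtime_ns

    return before, after_open, after_close


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'myfile.txt')
        for data in ('', 'a'):
            # Start each run from a non-empty file so truncation changes it.
            with open(path, 'w') as seed:
                seed.write('seed')
            before, after_open, after_close = mtime_steps(path, data)
            print(repr(data),
                  'changed on open:', after_open != before,
                  'changed on close:', after_close != after_open)
```

If the watcher only needs to report "the file changed" once per edit, sleeping between polls (or waiting until the modification time has stopped changing for a short period) is one way to collapse the two updates into a single notification.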