Tags: python, filesystems, watch, file, mtime

Detect if a command modifies any files from a directory within a Python script


I have a Python script:

import subprocess

subprocess.run(['black', 'src'])

I would like to tell if the command run by subprocess modified any file in the folder src - so, I'd like my script to look like this:

import subprocess

subprocess.run(['black', 'src'])
mutated = <???>

How can I detect whether any file in the directory src changed after the subprocess.run command, and if so assign True to mutated?

EDIT

Using os.path.getmtime isn't working for me:

(Pdb) os.path.getmtime(str(arg))
1596263725.3222768
(Pdb) subprocess.run(['black', str(arg), '--line-length=5'])
reformatted /tmp/tmp7e7suv4e/tests/data/clean_notebook   .py
reformatted /tmp/tmp7e7suv4e/tests/data/notebook_for_testing   .py
reformatted /tmp/tmp7e7suv4e/tests/data/notebook_for_testing_copy   .py
reformatted /tmp/tmp7e7suv4e/tests/data/notebook_starting_with_md   .py
All done! ✨ 🍰 ✨
4 files reformatted, 2 files left unchanged.
CompletedProcess(args=['black', '/tmp/tmp7e7suv4e/tests', '--line-length=5'], returncode=0)
(Pdb) os.path.getmtime(str(arg))
1596263725.3222768

Solution

  • Not the most reliable approach, but you can get the system time immediately before running the subprocess, then compare it to the modification time of the folder.

    from time import time
    from os.path import getmtime
    
    before = time()
    # Run subprocess
    mutated = getmtime('src') > before
    

    This approach is a bit unreliable, for example if the system clock is adjusted while the command is running. A better way is to compare the folder's modification time before and after:

    from os.path import getmtime
    
    before = getmtime('src')
    # Run subprocess
    mutated = getmtime('src') != before
    

    This works when the program replaces files rather than editing them in place: many tools write a new copy and rename it over the original, which changes the directory's entries and therefore updates the directory's own modification time. It does not catch programs that write to an existing file in place or that only update timestamps (touch is an example), and a directory's modification time does not reflect changes inside its subdirectories. If you run into one of those cases, you can always check the modification times of the individual files in the folder in the same way:

    from os import listdir
    from os.path import join, getmtime
    
    def mtimes(path):
        # Map each entry directly inside `path` to its modification time
        return {fname: getmtime(join(path, fname)) for fname in listdir(path)}
    
    before = mtimes('src')
    # Run subprocess
    mutated = mtimes('src') != before
    

    Comparing two dicts for equality checks both that their key sets match (so added or deleted files are detected) and that every corresponding modification time is equal.

    You may get some false positives this way if another process modifies files in the folder at the same time, but it is virtually impossible to get false negatives for the files being checked, unless something explicitly resets the modification times.
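Note that os.listdir only sees entries directly inside the folder, while the files in the question's EDIT live in a nested directory (tests/data under the path passed to black). A minimal sketch of a recursive variant, assuming plain files and no concurrent writers, could walk the tree with os.walk and compare per-file modification times. The helper names snapshot and changed_paths, and the example.py layout in the demonstration, are made up for illustration and are not part of any library.

```python
import os
import tempfile


def snapshot(root):
    """Map each file's path (relative to root) to its mtime in nanoseconds."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return result


def changed_paths(before, after):
    """Return the sorted paths that were added, removed, or given a new mtime."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


# Demonstration on a throwaway tree; the layout is hypothetical.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "tests", "data")
    os.makedirs(nested)
    target = os.path.join(nested, "example.py")
    with open(target, "w") as f:
        f.write("x=1\n")

    before = snapshot(root)

    # Stand-in for subprocess.run([...]): rewrite the file in place, then set
    # an explicit later mtime so the demo does not depend on clock resolution.
    with open(target, "w") as f:
        f.write("x = 1\n")
    old = before[os.path.join("tests", "data", "example.py")]
    os.utime(target, ns=(old + 10 * 10**9, old + 10 * 10**9))

    after = snapshot(root)
    changed = changed_paths(before, after)
    mutated = bool(changed)
```

In a real script, the block between the two snapshot calls would be the subprocess.run call, and changed tells you which files were touched rather than just whether any were. If a tool rewrites a file with identical contents, the modification time still changes, so comparing content hashes instead would be needed to ignore such no-op rewrites.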