python shutil copytree

The ignore callback for Python's shutil.copytree() does not accept full paths


I'd like to specify full paths to the files and directories to ignore when calling shutil.copytree(). Something like:

import shutil

def my_ignore(dir, files):
    # return ["exclude.file"]  # works
    return ["/full_path_to/exclude.file"]  # does not work

shutil.copytree(src, dest, ignore=my_ignore)

After this, the excluded file is still copied unless I return just the filename instead of the full path. The thing is, I want to exclude one particular file, not every file with the same name in other directories.

I referred to a number of questions here, such as: How to write a call back function for ignore in shutil.copytree

Filter directory when using shutil.copytree?

But none of the answers work. It looks like the ignore hook can only return glob-style names, and any constructed full path will not work.

Am I missing something?


Solution

  • ignore indeed must return just the filenames that are ignored. However, the function is called for each directory shutil.copytree() visits; you get to ignore files per directory.

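    If you have a full path to a file you need to ignore, then match against the first parameter passed to your ignore function; it is the path of the directory currently being visited, built from the src you passed to copytree() (so it is a full path only if src is one, and it must match your ignored path in exactly that form):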

    def my_ignore(dir, files):
        if dir == '/full_path_to':
            return {"exclude.file"}
        # Ignore nothing in every other directory; returning None would fail.
        return set()

    I return a set here; set membership testing is faster than with a list.

    If you have a predefined set of paths to ignore, parse those out into a dictionary; the keys are directory paths, and the values are sets of filenames to ignore in each directory:

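    import os
    from collections import defaultdict

    # Map each directory path to the set of filenames to skip in it.
    # ignored_paths is your predefined iterable of full file paths.
    to_ignore = defaultdict(set)
    for path in ignored_paths:
        dirname, filename = os.path.split(path)
        to_ignore[dirname].add(filename)

    def my_ignore(src, files):
        # .get() avoids adding empty entries to the defaultdict.
        return to_ignore.get(src, set())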
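
To see the per-directory behaviour end to end, here is a minimal, self-contained sketch of the dictionary approach. The names make_ignore and demo are hypothetical helpers made up for this illustration, and the os.path.normpath() calls are an extra assumption on my part (not part of the answer above) so that trailing slashes or doubled separators do not break the lookup. It builds a throwaway tree with two files named exclude.file and ignores only the one under a/.

```python
import os
import shutil
import tempfile
from collections import defaultdict


def make_ignore(ignored_paths):
    """Build a copytree ignore callback from full file paths (hypothetical helper)."""
    to_ignore = defaultdict(set)
    for path in ignored_paths:
        # Split each path into its directory and its bare filename.
        dirname, filename = os.path.split(os.path.normpath(path))
        to_ignore[dirname].add(filename)

    def _ignore(src, names):
        # src is the directory copytree is currently visiting;
        # return only the bare names to skip in that directory.
        return to_ignore.get(os.path.normpath(src), set())

    return _ignore


def demo():
    """Copy a small temporary tree, skipping exactly one file by full path."""
    root = tempfile.mkdtemp()
    src = os.path.join(root, "src")
    dest = os.path.join(root, "dest")
    for sub in ("a", "b"):
        os.makedirs(os.path.join(src, sub))
        with open(os.path.join(src, sub, "exclude.file"), "w") as fh:
            fh.write(sub)

    # Ignore only src/a/exclude.file; src/b/exclude.file must still be copied.
    ignored = [os.path.join(src, "a", "exclude.file")]
    shutil.copytree(src, dest, ignore=make_ignore(ignored))

    # Collect the copied files as paths relative to dest.
    copied = sorted(
        os.path.relpath(os.path.join(dirpath, name), dest)
        for dirpath, _dirs, filenames in os.walk(dest)
        for name in filenames
    )
    shutil.rmtree(root)
    return copied
```

Calling demo() should return only the path of exclude.file under b, since the same-named file under a was matched by its directory and skipped. Note that the ignored paths must be written in the same form as the src you pass to copytree() (both absolute or both relative to the same working directory), otherwise the dictionary lookup will not match.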