Tags: python, python-3.x, copy, file-search

Checking whether a file to be copied already exists in the destination directory and, if so, skipping it and moving on to the next


I am iterating through a master directory with numerous sub-directories, each containing their own sub-directories. I want to copy every file with the .xlsx extension from the master directory into a new directory so that all the files are collated in a single location. Each file has a unique name, and new files are added daily.

Once a file has been copied to the new directory, I would like the script to prevent it from being overwritten by comparing the file names found in the master directory with those already in the new directory, e.g.:

Today the master directory contains test1.xlsx and test2.xlsx, which are copied to the new directory I specified.

Two days later the master directory contains test1.xlsx, test2.xlsx and test3.xlsx. When I run the code this time, I would like it to iterate through the master directory and its sub-directories and identify that only test3.xlsx is new, based on a comparison between the files found in the master directory and those in the directory I copy the files to.

Apologies, I am new to Stack Overflow and Python, and English is my second language, so I am not sure I explained this well, but hopefully someone will get the gist.

I have tried the following code, but it keeps overwriting the files in the directory I copy the found .xlsx files to.

import os
import shutil
from os.path import isfile

#count = 0

for root, dirs, files in os.walk('Checklists'):
    for file in files:
        if file.endswith('.xlsx'):
            #print(file)
            if isfile('Checklist'):
                print("File exists")
            else:
                #print(os.path.join(root, file))
                #count +=1
                #print(count)
                #if not os.path.exists(os.path.join('Checklists', file)):
                shutil.copy(os.path.abspath(root + '/' + file), 'Checklist', follow_symlinks=True)

Solution

  • I managed to solve this riddle myself with the following addition:

    import os
    import shutil

    for root, dirs, files in os.walk('Checklists'):
        for file in files:
            if file.endswith('.xlsx'):
                # Test for the file itself inside the destination folder.
                # os.path.exists('Checklist') alone only tests the folder.
                if os.path.exists(os.path.join('Checklist', file)):
                    print("File exists")
                else:
                    shutil.copy(os.path.join(root, file), 'Checklist', follow_symlinks=True)


    Thanks to the folks who took the time out to look at my question at least
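
For anyone landing here later, the same idea can be sketched with pathlib as a reusable function. This is only a sketch, not the code from the post: the function name copy_new_xlsx and its two parameters are my own hypothetical names. It assumes, as the question states, that every file name is unique across the sub-directories, and that the destination folder sits outside the master directory.

```python
import shutil
from pathlib import Path


def copy_new_xlsx(src_dir, dst_dir):
    """Copy .xlsx files found anywhere under src_dir into dst_dir.

    Files whose name already exists in dst_dir are skipped, so nothing
    in the destination is overwritten. Returns the names that were copied.
    (Hypothetical helper; assumes file names are unique across sub-directories.)
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the destination if it is missing

    # Names already collected in the destination folder.
    existing = {p.name for p in dst.iterdir() if p.is_file()}

    copied = []
    for path in sorted(src.rglob('*.xlsx')):  # recurse through all sub-directories
        if not path.is_file() or path.name in existing:
            continue  # not a regular file, or already present in the destination
        shutil.copy2(path, dst / path.name)
        existing.add(path.name)
        copied.append(path.name)
    return copied
```

With the folder names from the question, it would be called as copy_new_xlsx('Checklists', 'Checklist'). Building the set of existing names once avoids a separate existence check on disk for every file, which may matter when the master directory grows large.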