I have video files recorded from a webcam. The recording produces many video files, and I want to delete the duplicates based on modification time, with a granularity of one minute.
For example, I have 3 video files as below, shown as (hour : minute : second).
I want to get the remaining files as output.
This is the code I have so far to find the modification time:
import os
import glob
from datetime import datetime

for file in glob.glob('C:\\Users\\xxx\\*.AVI'):
    time_mod = os.path.getmtime(file)
    print(datetime.fromtimestamp(time_mod).strftime('%Y-%m-%d %H:%M:%S'), '-->', file)
Please help me adapt my code to delete duplicate files based on modification time, with a granularity of one minute.
Here is my suggested solution. See the comments in the code itself for a detailed explanation, but the basic idea is that you build up a dictionary of lists of 2-element tuples, where the keys of the dictionary are the number of whole minutes since the start of Unix time, and the 2-tuples contain the filename and the remaining seconds. You then loop over the values of the dictionary (lists of tuples for files modified within the same calendar minute), sort each list by the seconds, and delete every file except the first, which is the oldest.
The use of a defaultdict here is just a convenience to avoid the need to explicitly add new lists to the dictionary when looping over files, because these will be added automatically when needed.
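To show that convenience in isolation, here is a minimal sketch comparing a plain dict with a defaultdict(list). The minute numbers and filenames in `samples` are made up for illustration and do not come from the question.

```python
from collections import defaultdict

# Hypothetical (minute, filename) pairs, invented for illustration only.
samples = [(100, "a.AVI"), (100, "b.AVI"), (101, "c.AVI")]

# Plain dict: the list must be created explicitly the first time a key is seen.
plain = {}
for minute, name in samples:
    if minute not in plain:
        plain[minute] = []
    plain[minute].append(name)

# defaultdict(list): a missing key is given a new empty list automatically.
grouped = defaultdict(list)
for minute, name in samples:
    grouped[minute].append(name)

# Both approaches end up with the same grouping.
print(dict(grouped) == plain)
```

Either form works; the defaultdict version just removes the membership check from the loop.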
import os
import glob
from collections import defaultdict

files_by_minute = defaultdict(list)

# group together all the files according to the number of minutes since the
# start of Unix time, storing the filename and the number of remaining seconds
for filename in glob.glob("C:\\Users\\xxx\\*.AVI"):
    time_mod = os.path.getmtime(filename)
    mins = time_mod // 60
    secs = time_mod % 60
    files_by_minute[mins].append((filename, secs))

# go through each of these lists of files, removing the newer ones if
# there is more than one
for fileset in files_by_minute.values():
    if len(fileset) > 1:
        # sort tuples by second element (i.e. the seconds)
        fileset.sort(key=lambda t: t[1])
        # remove all except the first
        for file_info in fileset[1:]:
            filename = file_info[0]
            print(f"removing {filename}")
            os.remove(filename)
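Because os.remove is irreversible, it may be worth checking first which files would be deleted. The following is a sketch of a dry-run variant of the same grouping logic; the function name `find_newer_duplicates` is hypothetical, and it returns the candidates instead of deleting them.

```python
import glob
import os
from collections import defaultdict


def find_newer_duplicates(pattern):
    """Return the files that would be removed: all but the oldest per minute."""
    files_by_minute = defaultdict(list)
    for filename in glob.glob(pattern):
        time_mod = os.path.getmtime(filename)
        # bucket by whole minutes since the start of Unix time; keep the
        # full timestamp so each bucket can be sorted oldest first
        files_by_minute[int(time_mod // 60)].append((time_mod, filename))

    to_remove = []
    for fileset in files_by_minute.values():
        fileset.sort()  # tuples sort by timestamp first, so oldest comes first
        to_remove.extend(filename for _, filename in fileset[1:])
    return sorted(to_remove)
```

Calling it with the same pattern as above and printing the result shows what the script would delete; once the list looks right, each returned name can be passed to os.remove. Note that grouping is by calendar minute, so two files modified at 10:00:59 and 10:01:01 land in different groups and are both kept.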