What's the best scripted way to delete (near-)duplicate files based on filespec in Windows (XP in this case)? I am thinking of regex and some VBScript, but if there is a better way...
Examples include filenames that differ slightly, with one or two (known) extra characters at the end or beginning, but are identical in size; files that are slightly different in size as well; etc.
Is regex the best way to handle these variances if the boundaries are known?
No, I don't think regex is the right tool here. It sounds a bit dangerous, if you ask me. Anyway, you could calculate the Levenshtein distance between the two file names and, if it is sufficiently small (be careful with file names that consist of just a couple of characters!), delete one of the two.
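To make the Levenshtein idea concrete, here is a rough sketch in Python (not VBScript, just for illustration). The helper `names_look_alike` and its thresholds `max_distance=2` and `min_length=6` are my own assumptions, not anything standard; tune them to your files. It only reports whether two names are close and deletes nothing.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def names_look_alike(name_a: str, name_b: str,
                     max_distance: int = 2, min_length: int = 6) -> bool:
    """Hypothetical helper: True if the names are within max_distance edits.

    Names shorter than min_length are never matched, because a distance
    of 2 means very little for something like "a.txt" vs "b.txt".
    Comparison is case-insensitive, as Windows file names are.
    """
    a, b = name_a.lower(), name_b.lower()
    if min(len(a), len(b)) < min_length:
        return False
    return levenshtein(a, b) <= max_distance
```

For example, `names_look_alike("Report.doc", "report_1.doc")` is true (two inserted characters), while `names_look_alike("a.txt", "b.txt")` is false because the names are too short to trust.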
The size comparison can be done using simple arithmetic: take the absolute difference of the two sizes and check it against a tolerance.
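A minimal sketch of that size check, again in Python for illustration. The function name `sizes_close` and both tolerance parameters are assumptions of mine; with the defaults it only accepts files of exactly the same size.

```python
def sizes_close(size_a: int, size_b: int,
                max_diff_bytes: int = 0,
                max_diff_ratio: float = 0.0) -> bool:
    """Hypothetical helper: True if two file sizes are close enough.

    A pair passes if the absolute difference is within max_diff_bytes,
    or if the difference relative to the larger file is within
    max_diff_ratio (e.g. 0.01 for one percent).
    """
    diff = abs(size_a - size_b)
    if diff <= max_diff_bytes:
        return True
    larger = max(size_a, size_b)
    return larger > 0 and diff / larger <= max_diff_ratio
```

The sizes themselves can come from `os.path.getsize(path)`. I would print the candidate pairs and review them before deleting anything, since a close name and a close size still do not prove the contents are the same.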