Given a directory, how do I find all files within it (and any sub-directories) that are not hard-linked files? Or more specifically, that are not hard-linked files with more than one reference?
Basically I want to scan a folder and return a list of unique files within that directory, including directories and symbolic links (not their targets). If possible, it'd be nice to also ignore hard-linked directories on file-systems that support them (such as HFS+).
Hard-linked files share the same inode. You can use stat to print each file's inode and name, then use awk to print a file only the first time its inode appears:
stat -c '%i %n' *csv | awk '!seen[$1]++' | cut -d ' ' -f 2-
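The command above only looks at names matching *csv in the current directory. To cover a whole tree, as the question asks, the same idea can be driven by find instead. The following is a rough sketch, assuming GNU find (for -printf) and paths that contain no newlines. The function names list_unique and list_single_link are made up for illustration. Keying on device number plus inode, rather than inode alone, is my addition, since inode numbers are only unique within one filesystem.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (-printf) and no newlines in path names.

# list_unique DIR (hypothetical helper): print every entry under DIR,
# but only the first path seen for each (device, inode) pair.
# find does not follow symbolic links by default, so links are listed
# themselves rather than their targets.
list_unique() {
    find "$1" -printf '%D:%i %p\n' | awk '!seen[$1]++' | cut -d ' ' -f 2-
}

# list_single_link DIR (hypothetical helper): print directories plus any
# non-directory whose link count is exactly 1, i.e. skip every name of a
# file that has more than one hard link.
list_single_link() {
    find "$1" \( -type d -o -links 1 \) -print
}
```

Running list_unique . gives one name per underlying file, while list_single_link . drops hard-linked files entirely. Directories are matched by type in the second function because their link count is usually greater than 1 on its own.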