bash duplicates subdirectory

Find duplicated nested directories


I have a large directory tree in which some (but not all) nested directories are duplicated:

How can I merge only the duplicated directories with these actions:

My current code:

#!/bin/bash

for folder in $(find httpdocs -type d); do
    n=$(echo $folder | tr "/" "\n" | wc -l)
    nuniq=$(echo $folder | tr "/" "\n" | sort | uniq | wc -l)

    [ $n -eq $nuniq ] || echo "Duplicated folder $folder"
done

But I have a problem: data/home/es/home is a valid folder, yet it is detected as duplicated.

Thanks.


Solution

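  • You can use the uniq command with the -d option. Without a preceding sort, uniq only compares adjacent lines, so only consecutive repeated path components are reported: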

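    #!/bin/bash
    
    # Note: word splitting on $(find ...) breaks on paths containing whitespace.
    for folder in $(find httpdocs -type d); do
        # Count path components that are immediately repeated (e.g. home/home).
        nuniq=$(echo "$folder" | tr "/" "\n" | uniq -d | wc -l)
        if [ "$nuniq" -gt 0 ]; then
            echo "Duplicated folder $folder"
        fi
    done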
    

    From man uniq:

      -d, --repeated
              only print duplicate lines
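
    If your paths may contain spaces, the same adjacency check can be done without tr and uniq. This is only a sketch: the function names has_adjacent_duplicate and list_duplicated are made up for illustration, and it assumes bash plus a find that supports -print0 (GNU and BSD find both do).

    ```shell
    #!/bin/bash

    # Hypothetical helper: returns 0 if any two consecutive components
    # of the path are identical (e.g. data/home/home), 1 otherwise.
    has_adjacent_duplicate() {
        local rest=$1 prev="" part
        while [ -n "$rest" ]; do
            part=${rest%%/*}                # first remaining component
            if [ "$part" = "$rest" ]; then  # no slash left: last component
                rest=""
            else
                rest=${rest#*/}             # drop that component and its slash
            fi
            [ -z "$part" ] && continue      # ignore empty parts from // or a leading /
            [ "$part" = "$prev" ] && return 0
            prev=$part
        done
        return 1
    }

    # Hypothetical wrapper: print every directory under the given root
    # whose path has such a repeat. Names are read NUL-delimited, so
    # spaces in directory names are preserved.
    list_duplicated() {
        local folder
        find "$1" -type d -print0 | while IFS= read -r -d '' folder; do
            if has_adjacent_duplicate "$folder"; then
                printf 'Duplicated folder %s\n' "$folder"
            fi
        done
    }

    # Usage: list_duplicated httpdocs
    ```

    As with uniq -d, data/home/es/home is not reported, because the two home components are not next to each other.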
    

    You can try the following script to move the contents up and delete the duplicated folders. I did not test this, so back up your httpdocs folder before running it.

    #!/bin/bash
    
    for folder in $(find httpdocs -type d); do
        nuniq=$(echo "$folder" | tr "/" "\n" | uniq -d | wc -l)
        if [ "$nuniq" -gt 0 ]; then
            # Collapse only adjacent repeats, so data/home/es/home is left alone.
            dest=$(echo "$folder" | tr '/' '\n' | uniq | tr '\n' '/')
            mv -i -- "$folder"/* "$dest"
            rmdir -- "$folder"
        fi
    done
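
    A more cautious variant of the same idea might look like the sketch below. The names collapse_path and merge_duplicated and the DRY_RUN variable are my own inventions, not anything standard. It assumes relative paths such as httpdocs/..., bash, find with -print0, and an mv that supports -n (GNU and BSD mv do; POSIX does not require it). By default it only prints what it would do.

    ```shell
    #!/bin/bash

    # Hypothetical helper: print the path with consecutive repeated
    # components collapsed, e.g. data/home/home -> data/home.
    # Assumes a relative path; a leading slash would be dropped.
    collapse_path() {
        local rest=$1 prev="" part out=""
        while [ -n "$rest" ]; do
            part=${rest%%/*}
            if [ "$part" = "$rest" ]; then rest=""; else rest=${rest#*/}; fi
            [ -z "$part" ] && continue
            [ "$part" = "$prev" ] && continue   # skip an adjacent repeat
            out=${out:+$out/}$part
            prev=$part
        done
        printf '%s\n' "$out"
    }

    # Hypothetical driver: set DRY_RUN=0 to actually move files.
    merge_duplicated() {
        local root=$1 folder dest entry
        # -depth lists children before parents, so inner repeats are emptied first.
        find "$root" -depth -type d -print0 | while IFS= read -r -d '' folder; do
            dest=$(collapse_path "$folder")
            [ "$dest" = "$folder" ] && continue
            if [ "${DRY_RUN:-1}" = 1 ]; then
                printf 'would merge %s -> %s\n' "$folder" "$dest"
                continue
            fi
            mkdir -p -- "$dest"
            for entry in "$folder"/* "$folder"/.[!.]*; do
                [ -e "$entry" ] || continue     # glob matched nothing
                mv -n -- "$entry" "$dest"/      # -n: never overwrite
            done
            rmdir -- "$folder" 2>/dev/null || echo "not empty, left in place: $folder"
        done
    }

    # Usage: merge_duplicated httpdocs             (dry run)
    #        DRY_RUN=0 merge_duplicated httpdocs   (really move)
    ```

    I used mv -n instead of mv -i here because inside the pipeline mv would read its confirmation from find's output rather than from the terminal. Anything that would overwrite an existing file is left where it is, and the folder is reported as not empty so you can sort it out by hand.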
    

    For example:

    user@host $ echo "data/home/es/home" | tr "/" "\n"  
    data
    home
    es
    home
    
    user@host $ echo "data/home/es/home" | tr "/" "\n"  | uniq -d | wc -l 
    0
    
    user@host $ echo "data/home/home" | tr "/" "\n"  
    data
    home
    home
    
    user@host $ echo "data/home/home" | tr "/" "\n" | uniq -d 
    home
    
    user@host $ echo "data/home/home" | tr "/" "\n" | uniq -d | wc -l
    1