
linux - after rsync, du shows size difference when diff does not


I copied a large folder from NTFS to ext4 using 'rsync' and validated it with 'diff'. Just for the sake of curiosity, I also used the 'du' command to check whether the folders had the same size. While 'diff' didn't show any difference, 'du' showed that the folders had different sizes. I did not encounter any errors while executing the following commands.

rsync --archive --recursive "$src" "$dest" 2>rsync_error.txt

sync

diff --brief --recursive --new-file "$src" "$dest" 1>diff-log.txt 2>diff-error.txt

Then I used 'du' for each folder:

du -sb "$src"
du -sb "$dest"
Output:
137197597476
137203512004

1. Why would this happen when there is no difference?

2. Should I be worried about my data or my system?

EDIT: I also tried du -s --apparent-size and there is still a difference.


Solution

  • Sparse files

    Under Linux, you can create so-called sparse files. These are files in which blocks consisting entirely of NUL bytes do not really exist on disk!

    Try this:

    $ dd if=/dev/zero count=2048 of=normalfile
    2048+0 records in
    2048+0 records out
    1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0103269 s, 102 MB/s
    

    and

    $ dd if=/dev/zero count=0 seek=2048 of=sparsefile
    0+0 records in
    0+0 records out
    0 bytes copied, 0.000182708 s, 0.0 kB/s
    

    then

    $ ls -l sparsefile normalfile
    -rw-r--r-- 1 user  user  1048576 Feb  3 17:53 normalfile
    -rw-r--r-- 1 user  user  1048576 Feb  3 17:53 sparsefile
    
    $ du -b sparsefile normalfile
    1048576     sparsefile
    1048576     normalfile
    

    but

    $ du -k sparsefile normalfile
    0   sparsefile
    1024        normalfile
    
    $ du -h sparsefile normalfile
    0   sparsefile
    1.0M        normalfile
    

    As long as the blocks in sparsefile are not used, they are not allocated!

    $ du -k --apparent-size sparsefile normalfile
    1024        sparsefile
    1024        normalfile
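
    The two numbers that du switches between can also be read directly from the inode. The following is a minimal sketch, assuming GNU coreutils (mktemp, truncate, stat -c); the file names and the helper functions apparent_size and allocated_size are made up for illustration:

    ```shell
    # Sketch, assuming GNU coreutils. Names below are illustrative only.
    workdir=$(mktemp -d)
    trap 'rm -rf "$workdir"' EXIT
    dense="$workdir/dense.bin"
    sparse="$workdir/sparse.bin"

    # 1 MiB of zero bytes that are actually written out.
    dd if=/dev/zero of="$dense" bs=1024 count=1024 2>/dev/null
    # 1 MiB file that is one big hole: the length is set, nothing is written.
    truncate -s 1048576 "$sparse"

    # Apparent size: the byte length recorded in the inode (what ls -l shows).
    apparent_size() { stat -c '%s' "$1"; }
    # Allocated size: allocated block count times the unit stat reports it in.
    allocated_size() { echo $(( $(stat -c '%b' "$1") * $(stat -c '%B' "$1") )); }

    for f in "$dense" "$sparse"; do
        printf '%s apparent=%s allocated=%s\n' \
            "$(basename "$f")" "$(apparent_size "$f")" "$(allocated_size "$f")"
    done
    ```

    On a filesystem that supports holes, the sparse file should report the same apparent size as the dense one but a much smaller allocated size. The exact allocated figures depend on the filesystem.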
    

    Then

    $ diff sparsefile normalfile
    $ echo $?
    0
    

    There is no difference in content between the two files!
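
    The same comparison can be made for whole directory trees, such as "$src" and "$dest" in the question. This is only a sketch, assuming GNU du; tree_sizes is a hypothetical helper name and the demo tree is created just for the example:

    ```shell
    # Sketch, assuming GNU du. tree_sizes is a hypothetical helper.
    # Prints "<apparent bytes> <allocated bytes>" for one directory tree.
    tree_sizes() {
        apparent=$(du -s --apparent-size --block-size=1 "$1" | cut -f1)
        allocated=$(du -s --block-size=1 "$1" | cut -f1)
        echo "$apparent $allocated"
    }

    # Demo tree: one dense and one sparse file of 1 MiB each.
    demo=$(mktemp -d)
    trap 'rm -rf "$demo"' EXIT
    dd if=/dev/zero of="$demo/dense.bin" bs=1024 count=1024 2>/dev/null
    truncate -s 1048576 "$demo/sparse.bin"

    tree_sizes "$demo"
    ```

    Running tree_sizes "$src" and tree_sizes "$dest" side by side shows which of the two figures disagrees. If only the allocated figure differs, holes or different block sizes are a plausible explanation.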

    Going further, once real data is written into sparsefile, blocks get allocated and the contents change:

    $ /sbin/mkfs.ext4 sparsefile 
    mke2fs 1.44.5 (15-Dec-2018)
    Filesystem too small for a journal
    ...
    Writing superblocks and filesystem accounting information: done
    
    $ ls -l sparsefile normalfile 
    -rw-r--r-- 1 user  user  1048576 Feb  3 17:53 normalfile
    -rw-r--r-- 1 user  user  1048576 Feb  3 17:59 sparsefile
    
    $ du -k sparsefile 
    32  sparsefile
    
    $ diff sparsefile normalfile
    Binary files sparsefile and normalfile differ
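
    The effect does not need mkfs: writing a single non-zero byte into a hole is enough. A minimal sketch, assuming GNU coreutils and cmp from diffutils, with illustrative file and variable names:

    ```shell
    # Sketch, assuming GNU coreutils (truncate, stat -c) and cmp.
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    truncate -s 1048576 "$work/sparse.bin"
    dd if=/dev/zero of="$work/dense.bin" bs=1024 count=1024 2>/dev/null

    blocks_before=$(stat -c '%b' "$work/sparse.bin")

    # Overwrite one byte in the middle of the hole without truncating the file.
    printf 'x' | dd of="$work/sparse.bin" bs=1 seek=524288 conv=notrunc 2>/dev/null

    blocks_after=$(stat -c '%b' "$work/sparse.bin")
    size_after=$(stat -c '%s' "$work/sparse.bin")

    echo "allocated blocks: before=$blocks_before after=$blocks_after"
    echo "apparent size after write: $size_after"

    # The contents now differ from the all-zero file, so cmp reports a mismatch.
    if cmp -s "$work/sparse.bin" "$work/dense.bin"; then
        same=yes
    else
        same=no
    fi
    echo "identical to dense file: $same"
    ```

    The apparent size stays at 1048576 bytes, while the block count typically grows by only as much as the filesystem needs to hold the written byte.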