
Effect of the cp and mv commands on inodes


I understand an inode to be a pointer to the actual place where the file is stored.

But I have trouble understanding the following:

  1. If I run cp file1 file2 where file2 already exists, the inode of file2 doesn't change. And if there was originally a hard link to file2, both names now point to the new content just copied there.

    • The only reason I can think of is that Linux interprets this as modifying the file rather than deleting it and creating a new one. Why is it designed this way?
  2. But when I run mv file1 file2, the inode of file2 changes to the inode of file1.


Solution

  • You are correct in stating that cp will modify the existing file instead of deleting and recreating it.

    Here is a view of the underlying system calls as seen by strace (part of the output of strace cp file1 file2):

    stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
    stat("file1", {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
    stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
    open("file1", O_RDONLY)                 = 3
    fstat(3, {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
    open("file2", O_WRONLY|O_TRUNC)         = 4
    fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
    fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
    read(3, "hi\n", 65536)                  = 3
    write(4, "hi\n", 3)                     = 3
    read(3, "", 65536)                      = 0
    close(4)                                = 0
    close(3)                                = 0
    

    As you can see, it detects that file2 is present (stat returns 0), but then opens it for writing (O_WRONLY|O_TRUNC) without first doing an unlink.
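
    The same sequence of calls can be reproduced outside of cp. The following is a minimal sketch in Python, not the actual cp implementation: the helper names overwrite_in_place and demo_overwrite are made up for illustration, and the files live in a temporary directory. It opens the destination with O_WRONLY|O_TRUNC, as in the trace, and checks that the inode number is unchanged and that a hard link sees the new content.

```python
import os
import tempfile


def overwrite_in_place(src, dst):
    """Copy src over dst without unlinking dst first (O_WRONLY|O_TRUNC)."""
    src_fd = os.open(src, os.O_RDONLY)
    dst_fd = os.open(dst, os.O_WRONLY | os.O_TRUNC)
    try:
        while True:
            chunk = os.read(src_fd, 65536)
            if not chunk:
                break
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(dst_fd, view)
                view = view[written:]
    finally:
        os.close(dst_fd)
        os.close(src_fd)


def demo_overwrite():
    """Return (inode_unchanged, content seen through the hard link)."""
    with tempfile.TemporaryDirectory() as d:
        file1, file2, file3 = (os.path.join(d, n) for n in ("file1", "file2", "file3"))
        with open(file1, "w") as f:
            f.write("hi\n")
        with open(file2, "w") as f:
            f.write("world\n")
        os.link(file2, file3)  # file3 is a second name for file2's inode

        inode_before = os.stat(file2).st_ino
        overwrite_in_place(file1, file2)
        inode_after = os.stat(file2).st_ino

        with open(file3) as f:
            file3_text = f.read()
        return inode_before == inode_after, file3_text


if __name__ == "__main__":
    print(demo_overwrite())
```

    Because the destination is truncated and rewritten rather than replaced, every name that refers to that inode observes the change.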

    See for example POSIX.1-2017, which specifies that the destination file shall be unlinked only if it could not be opened for writing and -f is in effect:

    A file descriptor for dest_file shall be obtained by performing actions equivalent to the open() function defined in the System Interfaces volume of POSIX.1-2017 called using dest_file as the path argument, and the bitwise-inclusive OR of O_WRONLY and O_TRUNC as the oflag argument.

    If the attempt to obtain a file descriptor fails and the -f option is in effect, cp shall attempt to remove the file by performing actions equivalent to the unlink() function defined in the System Interfaces volume of POSIX.1-2017 called using dest_file as the path argument. If this attempt succeeds, cp shall continue with step 3b.

    This implies that if the destination file exists, the copy will succeed (without resorting to the -f behaviour) as long as the cp process has write permission on the file, even if it is not run as the user that owns the file and even if it does not have write permission on the containing directory. By contrast, unlinking and recreating the file would require write permission on the directory. I speculate that this is the reason the standard is written the way it is.
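
    The permission difference can be checked with a small experiment. This is a sketch under stated assumptions: the function name demo_readonly_directory and the directory name "locked" are invented for the example, and the result differs when the process runs as root, because root is normally allowed to modify a directory regardless of its mode bits.

```python
import os
import tempfile


def demo_readonly_directory():
    """Return (truncate_ok, unlink_refused) for a writable file that lives
    in a directory the process has no write permission on."""
    with tempfile.TemporaryDirectory() as d:
        locked = os.path.join(d, "locked")
        os.mkdir(locked)
        target = os.path.join(locked, "file2")
        with open(target, "w") as f:
            f.write("world\n")

        os.chmod(locked, 0o555)  # read and search only: entries cannot be added or removed
        try:
            # Overwriting in place needs write permission on the file only.
            fd = os.open(target, os.O_WRONLY | os.O_TRUNC)
            try:
                os.write(fd, b"hi\n")
            finally:
                os.close(fd)
            truncate_ok = True

            # Unlinking needs write permission on the containing directory.
            try:
                os.unlink(target)
                unlink_refused = False  # typically what happens when running as root
            except PermissionError:
                unlink_refused = True
        finally:
            os.chmod(locked, 0o755)  # restore so the temporary tree can be removed
        return truncate_ok, unlink_refused


if __name__ == "__main__":
    print(demo_readonly_directory())
```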

    The --remove-destination option of GNU cp makes it unlink the destination first, which is the behaviour you expected to be the default.

    Here is the relevant part of the output of strace cp --remove-destination file1 file2. Note the unlink this time.

    stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
    stat("file1", {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
    lstat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
    unlink("file2")                         = 0
    open("file1", O_RDONLY)                 = 3
    fstat(3, {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
    open("file2", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
    fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
    fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
    read(3, "hi\n", 65536)                  = 3
    write(4, "hi\n", 3)                     = 3
    read(3, "", 65536)                      = 0
    close(4)                                = 0
    close(3)                                = 0
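
    As a rough illustration of this second trace, the sketch below performs the same steps by hand: unlink the destination, then create it again with O_CREAT|O_EXCL. It is not GNU cp's code; replace_destination and demo_remove_destination are hypothetical names, and the 0664 mode is simply taken from the trace.

```python
import os
import tempfile


def replace_destination(src, dst):
    """Unlink dst, then create a fresh file at the same path and copy src into it."""
    os.unlink(dst)
    src_fd = os.open(src, os.O_RDONLY)
    dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o664)
    try:
        while True:
            chunk = os.read(src_fd, 65536)
            if not chunk:
                break
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(dst_fd, view)
                view = view[written:]
    finally:
        os.close(dst_fd)
        os.close(src_fd)


def demo_remove_destination():
    with tempfile.TemporaryDirectory() as d:
        file1, file2, file3 = (os.path.join(d, n) for n in ("file1", "file2", "file3"))
        with open(file1, "w") as f:
            f.write("hi\n")
        with open(file2, "w") as f:
            f.write("world\n")
        os.link(file2, file3)

        old_inode = os.stat(file2).st_ino
        replace_destination(file1, file2)
        st3 = os.stat(file3)
        with open(file3) as f:
            file3_text = f.read()
        return {
            "file2_got_new_inode": os.stat(file2).st_ino != old_inode,
            "file3_kept_old_inode": st3.st_ino == old_inode,
            "file3_links": st3.st_nlink,
            "file3_text": file3_text,
        }


if __name__ == "__main__":
    print(demo_remove_destination())
```

    The old inode is still referenced by file3 when the new file is created, so the new file2 necessarily receives a different inode number.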
    

    When you use mv and the source and destination paths are on the same filesystem, it will do a rename, which has the effect of unlinking any existing file at the target path. Here is the relevant part of the output of strace mv file1 file2.

    access("file2", W_OK)                   = 0
    rename("file1", "file2")                = 0
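
    The effect of rename() on inode numbers can be observed directly. The following sketch (demo_rename is an illustrative name, and os.rename is used as a thin wrapper over the system call on Linux) sets up the same three files as the question and reports what each name refers to afterwards.

```python
import os
import tempfile


def demo_rename():
    with tempfile.TemporaryDirectory() as d:
        file1, file2, file3 = (os.path.join(d, n) for n in ("file1", "file2", "file3"))
        with open(file1, "w") as f:
            f.write("hi\n")
        with open(file2, "w") as f:
            f.write("world\n")
        os.link(file2, file3)  # file2 and file3 share one inode

        inode1 = os.stat(file1).st_ino
        inode2 = os.stat(file2).st_ino

        # Replaces the directory entry "file2" so that it refers to file1's inode.
        os.rename(file1, file2)

        st3 = os.stat(file3)
        return {
            "file1_gone": not os.path.exists(file1),
            "file2_has_file1_inode": os.stat(file2).st_ino == inode1,
            "file3_kept_old_inode": st3.st_ino == inode2,
            "file3_links": st3.st_nlink,
        }


if __name__ == "__main__":
    print(demo_rename())
```

    No file data is copied here: only directory entries change, which is why file2 ends up with the inode that file1 had.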
    

    In either case where a destination path is unlinked (whether explicitly by unlink() as called from cp --remove-destination, or as part of the effect of rename() as called from mv), the link count of the inode to which it was pointing is decremented, but the inode remains on the filesystem as long as its link count is still greater than 0 or any process holds an open file descriptor on it. Any other (hard) links to this inode (i.e. other directory entries for it) remain.
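
    Both halves of that rule (the link count and the open file descriptor) can be seen in a short experiment. This is a sketch with an invented function name, demo_link_count, assuming a local filesystem that supports hard links.

```python
import os
import tempfile


def demo_link_count():
    """Return (links with two names, links with one name,
    last name gone, data read through the still-open descriptor)."""
    with tempfile.TemporaryDirectory() as d:
        file2 = os.path.join(d, "file2")
        file3 = os.path.join(d, "file3")
        with open(file2, "wb") as f:
            f.write(b"world\n")
        os.link(file2, file3)
        links_with_two_names = os.stat(file2).st_nlink

        os.unlink(file2)  # one directory entry removed, inode survives via file3
        links_with_one_name = os.stat(file3).st_nlink

        fd = os.open(file3, os.O_RDONLY)
        try:
            os.unlink(file3)  # last name removed while the file is still open
            name_gone = not os.path.exists(file3)
            data = os.read(fd, 65536)  # the data stays readable until the close
        finally:
            os.close(fd)
        return links_with_two_names, links_with_one_name, name_gone, data


if __name__ == "__main__":
    print(demo_link_count())
```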

    Investigating using ls -i

    ls -i will show the inode numbers (as the first column when combined with -l), which helps demonstrate what is happening.
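
    The same information is available programmatically through stat(). As a small sketch (inode_listing and demo_listing are hypothetical helper names), the following collects the inode number, link count, and size for each entry in a directory, which are the columns of ls -li that matter here.

```python
import os
import tempfile


def inode_listing(directory):
    """Return (inode, link count, size, name) for each entry, sorted by name."""
    rows = []
    for name in sorted(os.listdir(directory)):
        st = os.lstat(os.path.join(directory, name))
        rows.append((st.st_ino, st.st_nlink, st.st_size, name))
    return rows


def demo_listing():
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "file1"), "w") as f:
            f.write("hi\n")
        with open(os.path.join(d, "file2"), "w") as f:
            f.write("world\n")
        os.link(os.path.join(d, "file2"), os.path.join(d, "file3"))
        return inode_listing(d)


if __name__ == "__main__":
    for row in demo_listing():
        print(*row)
```

    Two names that report the same inode number on the same filesystem are hard links to one file.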

    Example with default cp action

    $ rm file1 file2 file3 
    
    $ echo hi > file1
    $ echo world > file2
    $ ln file2 file3
    
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup    3 Jun 13 10:43 file1
    50 -rw-rw-r-- 2 myuser mygroup    6 Jun 13 10:43 file2
    50 -rw-rw-r-- 2 myuser mygroup    6 Jun 13 10:43 file3
    
    $ cp file1 file2 
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup    3 Jun 13 10:43 file1
    50 -rw-rw-r-- 2 myuser mygroup    3 Jun 13 10:43 file2   <=== existing inode
    50 -rw-rw-r-- 2 myuser mygroup    3 Jun 13 10:43 file3   <=== existing inode
    

    (Note existing inode 50 now has size 3).

    Example with --remove-destination

    $ rm file1 file2 file3
    $ echo hi > file1
    $ echo world > file2
    $ ln file2 file3
    
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup    3 Jun 13 10:46 file1
    50 -rw-rw-r-- 2 myuser mygroup    6 Jun 13 10:46 file2
    50 -rw-rw-r-- 2 myuser mygroup    6 Jun 13 10:46 file3
    
    $ cp --remove-destination file1 file2
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:46 file1
    55 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:47 file2   <=== new inode
    50 -rw-rw-r-- 1 myuser mygroup 6 Jun 13 10:46 file3   <=== existing inode
    

    (Note new inode 55 has size 3. Unmodified inode 50 still has size 6.)

    Example with mv

    $ rm file1 file2 file3
    $ echo hi > file1
    $ echo world > file2
    $ ln file2 file3
    
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 11:05 file1
    50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 11:05 file2
    50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 11:05 file3
    
    $ mv file1 file2 
    $ ls -li file*
    49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 11:05 file2  <== existing inode (formerly file1's)
    50 -rw-rw-r-- 1 myuser mygroup 6 Jun 13 11:05 file3  <== existing inode