Someone on our server ran sed -i 's/$var >> $var2/$var > $var2/' *
to change appends to overwrites in some bash scripts in a common directory. No big deal: it was tested first with grep, which returned the expected result that only his files would be touched.
He ran the command and now 1200 of the 1400 files in the folder have a new modified date, yet as far as we can tell, only his small handful of files were actually changed.
$'s in the sed regex)?
When GNU sed successfully edits a file "in-place," its timestamp is updated. To understand why, let's review how an "in-place" edit is done:
A temporary file is created to hold the output.
sed processes the input file, sending output to the temporary file.
If a backup file extension was specified, the input file is renamed to the backup file.
Whether a backup is created or not, the temporary output is moved (rename) to the input file.
GNU sed does not track whether any changes were made to the file. Whatever is in the temporary output file is moved to the input file via rename.
There is a nice benefit to this procedure: POSIX requires that rename be atomic. Consequently, the input file is never in a mangled state: it is either the original file or the modified file and never part way in-between.
As a result of this procedure, any file that sed successfully processes will have its timestamp changed.
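The procedure above can be imitated by hand. This is only a sketch of the same steps using mktemp and mv, not sed's actual code; the file names and the workdir variable are made up for the illustration. Even though the substitution matches nothing, the file ends up as a new inode because the temporary file replaces it.

```shell
#!/bin/sh
# Sketch: imitate the in-place procedure manually (illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'this is\na test.\n' > inputfile

inode_before=$(ls -i inputfile | awk '{print $1}')

# Create a temporary file in the same directory and send sed's output to it.
tmpfile=$(mktemp ./sedXXXXXX)
sed 's/XXX/YYY/' inputfile > "$tmpfile"

# Move the temporary file over the input, whether or not anything changed.
mv -f "$tmpfile" inputfile

inode_after=$(ls -i inputfile | awk '{print $1}')
echo "inode before: $inode_before"
echo "inode after:  $inode_after"
```

The contents are identical before and after, but the name inputfile now refers to a different file, which carries its own, newer timestamps.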
Let's consider this inputfile:
$ cat inputfile
this is
a test.
Now, under the supervision of strace, let's run sed -i on it in a way guaranteed to cause no changes:
$ strace sed -i 's/XXX/YYY/' inputfile
The output, trimmed to the relevant calls, looks like:
execve("/bin/sed", ["sed", "-i", "s/XXX/YYY/", "inputfile"], [/* 55 vars */]) = 0
[...snip...]
open("inputfile", O_RDONLY) = 4
[...snip...]
open("./sediWWqLI", O_RDWR|O_CREAT|O_EXCL, 0600) = 6
[...snip...]
read(4, "this is\na test.\n", 4096) = 16
write(6, "this is\n", 8) = 8
write(6, "a test.\n", 8) = 8
read(4, "", 4096) = 0
[...snip...]
close(4) = 0
[...snip...]
close(6) = 0
[...snip...]
rename("./sediWWqLI", "inputfile") = 0
As you can see, sed opens the input file, inputfile, on file descriptor 4. It then creates a temporary file, ./sediWWqLI, on file descriptor 6, to hold the output. It reads from the input file and writes it unchanged to the output file. When this is done, a call to rename is made to overwrite inputfile, changing its timestamp.
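If strace is not available, the same effect can be observed by comparing the checksum and the inode number before and after a no-op edit. A minimal sketch, assuming GNU sed (for the -i syntax without a suffix argument) and a scratch directory created just for the test:

```shell
#!/bin/sh
# Sketch: a no-op "sed -i" leaves the content alone but replaces the file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'this is\na test.\n' > inputfile

sum_before=$(cksum < inputfile)
inode_before=$(ls -i inputfile | awk '{print $1}')

# GNU sed syntax; the pattern XXX does not occur in the file.
sed -i 's/XXX/YYY/' inputfile

sum_after=$(cksum < inputfile)
inode_after=$(ls -i inputfile | awk '{print $1}')

echo "checksum: $sum_before -> $sum_after"
echo "inode:    $inode_before -> $inode_after"
```

The checksum stays the same while the inode number changes, which matches the rename seen in the strace output.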
sed source code
The relevant source code is in the execute.c file of the sed directory of the source. From version 4.2.1:
ck_fclose (input->fp);
ck_fclose (output_file.fp);
if (strcmp(in_place_extension, "*") != 0)
  {
    char *backup_file_name = get_backup_file_name(target_name);
    ck_rename (target_name, backup_file_name, input->out_file_name);
    free (backup_file_name);
  }
ck_rename (input->out_file_name, target_name, input->out_file_name);
free (input->out_file_name);
ck_rename is a cover function for the stdio function rename. The source for ck_rename is in sed/utils.c.
As you can see, no flag is kept to determine whether the file actually changed or not. rename is called regardless.
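Since sed itself makes no such check, one way to leave untouched files alone is to let grep pick the files first and hand only those to sed. The following is a sketch, assuming GNU grep, xargs and sed (for -Z, -0, -r and -i); the file names mine.sh and other.sh are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: run "sed -i" only on files that actually contain the text.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'echo $var >> $var2' > mine.sh    # hypothetical file to fix
printf '%s\n' 'echo unrelated'     > other.sh   # hypothetical bystander

inode_other_before=$(ls -i other.sh | awk '{print $1}')

# grep -l lists matching files, -F treats the pattern as a fixed string,
# -Z / -0 pass names NUL-separated, -r skips sed when nothing matches.
# The dollar signs in the sed pattern are escaped here to make it obvious
# that they are meant literally.
grep -lZF -- '$var >> $var2' ./*.sh |
  xargs -0r sed -i 's/\$var >> \$var2/$var > $var2/'

inode_other_after=$(ls -i other.sh | awk '{print $1}')
cat mine.sh
echo "other.sh inode: $inode_other_before -> $inode_other_after"
```

Files that grep does not list are never opened by sed, so they keep their inode and their timestamps.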
As for the 200 of the 1400 files whose timestamps did not change, that would mean that sed somehow failed on those files. One possibility would be a permissions issue.
sed -i and Symbolic Links
As noted by mklement0, applying sed -i to a symbolic link leads to a surprising result. sed -i does not update the file pointed to by the symbolic link. Instead, sed -i overwrites the symbolic link with a new regular file.
This is a result of the call that sed makes to the STDIO rename. As documented by man 2 rename:
if newpath refers to a symbolic link the link will be overwritten.
mklement0 reports that this is also true of the (BSD) sed on Mac OS X 10.10.
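The symbolic link behavior is easy to reproduce in a scratch directory. The sketch below assumes GNU sed, which also documents a --follow-symlinks option for use with -i; the names target, link1 and link2 are made up for the example.

```shell
#!/bin/sh
# Sketch: "sed -i" on a symlink, with and without --follow-symlinks (GNU sed).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'this is\na test.\n' > target
ln -s target link1
ln -s target link2

# Plain -i: the rename replaces link1 itself; target is not edited.
sed -i 's/test/demo/' link1
if [ -L link1 ]; then echo "link1: still a symlink"; else echo "link1: now a regular file"; fi
target_after_plain=$(cat target)

# With --follow-symlinks, the edit is applied to the file the link points to.
sed -i --follow-symlinks 's/test/demo/' link2
if [ -L link2 ]; then echo "link2: still a symlink"; else echo "link2: now a regular file"; fi
target_after_follow=$(cat target)

echo "target after plain -i:          $target_after_plain"
echo "target after --follow-symlinks: $target_after_follow"
```

Whether BSD sed offers an equivalent option is not covered here, so on Mac OS X it may be safer to resolve the link first and run sed on the real file.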