I have two files:
$ cat xx
aaa
bbb
ccc
ddd
eee
$ cat zz
aaa
bbb
ccc
#ddd
eee
I want to diff them while ignoring comments.
I tried several variations, but none of them work:
diff --ignore-matching-lines='#' -u xx zz
diff --ignore-matching-lines='#.*' -u xx zz
diff --ignore-matching-lines='^#.*' -u xx zz
How can I diff two files while ignoring lines that match a given regex, such as anything starting with #?
That's not how the -I option in diff works; see this comment by Giles on Unix.SE and section 1.4 of the diff manual, "Suppressing Differences Whose Lines All Match a Regular Expression".
In short, the -I option suppresses a hunk only if every inserted and deleted line in it matches the given RE. In your case, the diff between your two files, as seen in the output
diff f1 f2
4c4
< ddd
---
> #ddd
i.e. line 4 differs between the two files. The "hunk", as the manual calls it, consists of both ddd and #ddd, and these lines do not all match any of your REs #, #.* or ^#.*, because ddd contains no #. When such a nonignorable difference exists, diff prints the whole hunk, matching and non-matching lines alike. Quoting the manual,
for each nonignorable change, diff prints the complete set of changes in its vicinity, including the ignorable ones.
The same option would have worked if the file f1 did not contain the line ddd at all, i.e.
f1
aaa
bbb
ccc
eee
f2
aaa
bbb
ccc
#ddd
eee
where doing
diff f1 f2
3a4
> #ddd
would result in just one "hunk" consisting only of #ddd, which can be ignored with a pattern like ^# (i.e. ignore any line starting with a #). As you can see, this produces the desired output (no lines):
diff -u -I '^#' f1 f2
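The all-or-nothing behaviour can be reproduced with a short script. This is only a sketch: it assumes a diff that supports -I (such as GNU diff), and the file name f1_without_ddd is made up here for illustration.

```shell
# Build the three sample files in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' aaa bbb ccc ddd eee    > "$tmp/f1"
printf '%s\n' aaa bbb ccc eee        > "$tmp/f1_without_ddd"
printf '%s\n' aaa bbb ccc '#ddd' eee > "$tmp/f2"

# Case 1: the hunk holds "ddd" (no match) and "#ddd" (match),
# so it is not ignorable and diff reports it in full.
# "|| true" only guards against diff's non-zero exit status.
case1=$(diff -I '^#' "$tmp/f1" "$tmp/f2" || true)

# Case 2: the only changed line is "#ddd", which matches '^#',
# so the hunk is ignored and diff prints nothing.
case2=$(diff -I '^#' "$tmp/f1_without_ddd" "$tmp/f2" || true)

printf 'case 1:\n%s\n' "$case1"
printf 'case 2: %s\n' "${case2:-(no output)}"

rm -rf "$tmp"
```

Case 1 prints the same 4c4 hunk shown earlier, while case 2 prints nothing.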
But since your f1 contains the uncommented line ddd, it is not straightforward to define an RE that matches both the commented and the uncommented line. diff does accept multiple -I flags, as in
diff -I '^#' -I 'ddd' f1 f2
but that is not a practical solution, since you cannot know in advance which uncommented lines would have to go into the ignore patterns.
As a workaround, you can strip the lines starting with # from both files before passing them to diff, i.e.
diff <(grep -v '^#' f1) <(grep -v '^#' f2)
4d3
< ddd
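The <( ) syntax above is process substitution, which needs bash, zsh, or ksh. For a plain POSIX sh, the same idea can be sketched with temporary files. The function name diff_nocomments and its optional third argument are made up here for illustration and are not part of diff.

```shell
# diff_nocomments FILE1 FILE2 [PATTERN]
# Hypothetical helper: removes lines matching PATTERN (default: lines
# starting with '#') from copies of both files, then diffs the copies.
diff_nocomments() {
    pattern=${3:-'^#'}
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }

    # grep -v keeps only the lines that do NOT match the pattern.
    grep -v -e "$pattern" -- "$1" > "$a"
    grep -v -e "$pattern" -- "$2" > "$b"

    # Preserve diff's exit status (0 = same, 1 = different, 2 = trouble).
    status=0
    diff "$a" "$b" || status=$?

    rm -f "$a" "$b"
    return "$status"
}
```

Calling diff_nocomments f1 f2 on the original files prints the same 4d3 hunk as above. Note that with either variant the line numbers in the output refer to the filtered copies, not to the original files.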