diffword-diff

newline-ignoring diff / diff across multiple lines / reflow-ignoring diff


Does anybody know of a diff-like tool that can show me the changes between two text files, but ignore changes in whitespace including newlines?

Here's an example:

the quick brown fox jumped over the lazy bear.  the quick brown fox
jumped over the lazy bear.  the quick brown fox jumped over the lazy
bear.  the quick brown fox jumped over the lazy bear.
quick brown fox jumped over the lazy bear.  the quick brown fox jumped
over the lazy bear.  the quick brown fox jumped over the lazy bear.
the quick brown fox jumped over the lazy bear.

All I did was delete one word and reflow it, but "diff -b" detects a change on every line (as it should; I'm not saying this is a bug in diff). But for large LaTeX files this is a major problem; change one word in a long paragraph and the diff you get back is basically useless.

By the way, I'm aware that this requires way more computational power than the usual lines-are-atomic diff. I'm only doing this on small human-generated files and am happy to wait a long time if I have to.


Solution

  • wdiff does word-by-word alignment.

    For an easy-to-read display in a terminal, run

     wdiff -al <file1> <file2> | less
    

    This will show (at least on my system) insertions in <file2>boldfaced and deletions from <file2> underlined.