
git: number of lines *not* changed since specific commit?


There are plenty of answers with great command line fu to find changes (or change statistics), but I'd like to find the opposite: how many lines (per file) have not changed since a particular commit?

The closest I could find is "How to find which files have not changed since commit?", but I'd like to know how many lines (ideally: in each file) have survived unchanged, not which files.

So, basically: can git diff --stat output unchanged lines in addition to insertions and deletions?

Alternatively, I'd imagine that git ls-files, git blame and some awk magic might do the trick, but I haven't been able to figure it out quite yet. For example, rather than labeling each line with the commit of its last change, can I get git-blame to indicate whether that change occurred before or after a given commit? Together with grep and wc -l, that would get me there.


Solution

  • Figured it out. The key is that git blame accepts revision ranges (see https://git-scm.com/docs/git-blame, section "SPECIFYING RANGES"). Assume 123456 is the commit I want to compare to. With

    git blame 123456..
    

    "lines that have not changed since the range boundary [...] are blamed for that range boundary commit", that is, it will show everything that hasn't changed since that commit as "^123456". Thus, per file, the answer to my question is

    git blame 123456.. -- "$file" | grep -c "^\^123456"  # unchanged since
    git blame 123456.. -- "$file" | grep -vc "^\^123456" # new since
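
    Both counts can also be taken in a single pass over the blame output. The following is a minimal sketch, not part of the original recipe: count_blame is a hypothetical helper name, and it assumes the abbreviated hash passed to it is a prefix of the hash exactly as git blame prints it after the "^".

    ```shell
    # Hypothetical helper: reads `git blame <commit>..` output on stdin and
    # prints "<unchanged> <new>". $1 is the abbreviated boundary hash as it
    # appears in the blame output (assumption: it is a prefix of that hash).
    count_blame() {
      awk -v prefix="^$1" '
        index($0, prefix) == 1 { unchanged++; next }  # blamed on the boundary commit
        { new++ }                                     # changed after the boundary
        END { printf "%d %d\n", unchanged, new }
      '
    }

    # Usage (inside a repository):
    #   git blame 123456.. -- "$file" | count_blame 123456
    ```

    This runs git blame once per file instead of twice, which matters on large repositories because blame is the slow step.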
    

    Wrapped into a bash script that goes over all files in the repo (git ls-files) and pretty-prints the result:

    #!/bin/bash
    
    # abbreviated hash of the commit to compare to, as it appears in blame output
    commit=123456
    
    total_lines=0
    total_lines_unchanged=0
    total_lines_new=0
    
    echo "--- total unchanged new filename ---"
    
    # read one filename per line; the redirect (not a pipe) keeps the totals in this shell
    while IFS= read -r file
    do
      # calculate stats for this file
      lines=$(wc -l < "$file")
      lines_unchanged=$(git blame "$commit.." -- "$file" | grep -c "^\^$commit")
      lines_new=$(git blame "$commit.." -- "$file" | grep -vc "^\^$commit")
    
      # print pretty
      printf "%6d %6d %6d %s\n" "$lines" "$lines_unchanged" "$lines_new" "$file"
    
      # add to total
      total_lines=$((total_lines + lines))
      total_lines_unchanged=$((total_lines_unchanged + lines_unchanged))
      total_lines_new=$((total_lines_new + lines_new))
    done < <(git ls-files)  # filter the file list here, e.g. by piping through grep
    
    # print total
    echo "--- total unchanged new ---"
    
    printf "%6d %6d %6d TOTAL\n" "$total_lines" "$total_lines_unchanged" "$total_lines_new"
    
    

    Thanks to Gregg for his answer, which had me look into the options for git-blame more!
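
    A variant that does not depend on how the hash is abbreviated is to parse git blame --line-porcelain, which repeats the commit header for every line. The sketch below rests on an assumption worth checking against your git version: that the header of each line blamed on the range boundary contains a line reading exactly "boundary", and that each content line is prefixed by a TAB. count_porcelain is a hypothetical helper name.

    ```shell
    # Hypothetical helper: reads `git blame --line-porcelain <commit>..` output
    # on stdin and prints "<unchanged> <new>".
    # Assumption: boundary lines carry a header line that is exactly "boundary";
    # content lines start with a TAB, so they can never be mistaken for a header.
    count_porcelain() {
      awk '
        /^\t/            { total++; next }   # content line, one per blamed line
        $0 == "boundary" { unchanged++ }     # header flag for the boundary commit
        END { printf "%d %d\n", unchanged, total - unchanged }
      '
    }

    # Usage (inside a repository):
    #   git blame --line-porcelain 123456.. -- "$file" | count_porcelain
    ```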