
Git blame -L bug?


I am running git blame -L with multiple -L options in order to get line information for non-sequential lines in a single git call.

I believed that this call:

git blame -L38,38 -L40,40 <file>

should be equivalent to these two calls made separately:

git blame -L38,38 <file>
git blame -L40,40 <file>

However, I ran across one case where using multiple -L options actually returned lines 38 and 39 rather than the expected lines 38 and 40:

$ git blame -L38,38 -L40,40 <file>
b6543ffe (Some Body 2015-11-24 15:15:03 -0500 38)           SOME CODE
b6543ffe (Some Body 2015-11-24 15:15:03 -0500 39)           SOME OTHER CODE

When I only have a single -L40,40 then git actually returns line 40 correctly:

$ git blame -L40,40 <file>
b6543ffe259 (Some Body 2015-11-24 15:15:03 -0500 40)                SOME CODE

Is there something I'm missing about how -L actually works or is this a git bug?

I tried using both git version 2.7.0.windows.1 and 2.11.0.windows.1.


Solution

  • It should be equivalent (you can see how it was implemented in this patch series in 2013).

    But it manifestly is not, which is probably a bug.
    That has led some projects, such as isaacbernat/pycrastinate, to work around it: see "Fix for git blame multiple line ranges".

    This fix calls git blame for each line separately.
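
A minimal sketch of that kind of workaround, for illustration only. It is not the pycrastinate code; the function names `blame_commands` and `blame_ranges` are hypothetical, and it assumes `git` is on the PATH and that `path` is tracked in the repository at `cwd`. It issues one `git blame -L` call per range and concatenates the output, so no coalescing across ranges can happen:

```python
import subprocess
from typing import Iterable, List, Optional, Tuple


def blame_commands(path: str, ranges: Iterable[Tuple[int, int]]) -> List[List[str]]:
    """Build one `git blame` command line per (start, end) range."""
    return [
        ["git", "blame", f"-L{start},{end}", "--", path]
        for start, end in ranges
    ]


def blame_ranges(path: str, ranges: Iterable[Tuple[int, int]],
                 cwd: Optional[str] = None) -> str:
    """Run each command separately and join the results in range order."""
    chunks = []
    for cmd in blame_commands(path, ranges):
        done = subprocess.run(cmd, cwd=cwd, check=True,
                              capture_output=True, text=True)
        chunks.append(done.stdout)
    return "".join(chunks)
```

For the case in the question this would be called as `blame_ranges("<file>", [(38, 38), (40, 40)])`. The cost is one process per range, which is the price of sidestepping the bug on Git versions before 2.29.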


    Update August 2020

    Before Git 2.29 (Q4 2020), when given more than one target line ranges, "git blame -La,b -Lc,d" was over-eager to coalesce groups of original lines and showed incorrect results, which has been corrected.

    See commit c2ebaa2, commit dd7c611, commit 6dbf0c7 (13 Aug 2020) by Jeff King (peff).
    (Merged by Junio C Hamano -- gitster -- in commit 93121df, 19 Aug 2020)

    blame: only coalesce lines that are adjacent in result

    Reported-by: Nuthan Munaiah
    Signed-off-by: Jeff King

    After blame has finished but before we produce any output, we coalesce groups of lines that were adjacent in the original suspect (which may have been split apart by lines in intermediate commits which went away).

    However, this can cause incorrect output if the lines are not also adjacent in the result.
    For instance, the case in t8003 has:

    ABC
    DEF  
    

    which becomes

    ABC
    SPLIT
    DEF  
    

    Blaming only lines 1 and 3 in the result yields two blame groups (one for each line) that were adjacent in the original.
    That's enough for us to coalesce them into a single group, but that loses information: our output routines assume they're adjacent in the result as well, and we output:

    <oid> 1) ABC
    <oid> 2) SPLIT  
    

    This is nonsense for two reasons:

    • we were asked about line 3, not line 2; we should not output the SPLIT line at all
    • commit <oid> did not touch the SPLIT line at all!
      We found the correct blame for line 3, but the bug is actually in the output stage, which is showing the wrong line number and content from the final file.

    We can fix this by only coalescing when both the suspect and result lines are adjacent. That fixes this bug, but keeps coalescing in cases where we want it (e.g., the existing test in t8003 where SPLIT goes away, and the lines really are adjacent in the result).
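
The rule described in the commit message can be modeled in a few lines. This is a simplified Python sketch, not Git's actual C code: the `Entry` fields loosely mirror the idea of a blame group (a suspect commit, a start line in the suspect, a start line in the result, and a length), and the `require_result_adjacent` flag is a hypothetical switch added here only to compare the old and new behavior.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass
class Entry:
    suspect: str    # commit the lines are blamed on
    s_lno: int      # 0-based first line in the suspect's version
    lno: int        # 0-based first line in the final (result) file
    num_lines: int  # number of lines in the group


def coalesce(entries: Iterable[Entry],
             require_result_adjacent: bool = True) -> List[Entry]:
    """Merge neighboring groups blamed on the same suspect.

    With require_result_adjacent=False this models the pre-2.29 rule
    (adjacent in the suspect only); with True it models the fixed rule
    (adjacent in both the suspect and the result).
    """
    merged: List[Entry] = []
    for ent in sorted(entries, key=lambda e: e.lno):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.suspect == ent.suspect
                and prev.s_lno + prev.num_lines == ent.s_lno
                and (not require_result_adjacent
                     or prev.lno + prev.num_lines == ent.lno)):
            prev.num_lines += ent.num_lines   # extend the previous group
        else:
            merged.append(replace(ent))       # copy, so inputs stay untouched
    return merged


def reported_lines(entries: Iterable[Entry]) -> List[int]:
    """1-based result line numbers an output routine would print."""
    return [e.lno + i + 1 for e in entries for i in range(e.num_lines)]


# The t8003 case: ABC is line 1 and DEF is line 3 of the result,
# but they were lines 1 and 2 of the original suspect.
abc = Entry("oid", s_lno=0, lno=0, num_lines=1)
dfe = Entry("oid", s_lno=1, lno=2, num_lines=1)

print(reported_lines(coalesce([abc, dfe], require_result_adjacent=False)))  # [1, 2]
print(reported_lines(coalesce([abc, dfe])))                                 # [1, 3]
```

Under the old rule the two groups collapse into one group of length 2 starting at result line 1, so line 2 (SPLIT) is printed instead of line 3. With the extra adjacency check the groups stay separate and lines 1 and 3 are reported.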