javagitsvngit-bisectbisect

Could someone explain what an SVN bisect algorithm is? Theoretically and through code snippet


I want to understand what an SVN bisect / git-bisect algorithm is..I have tried to search online resources but unable to get a good problem statement and solution.


Solution

  • Briefly, the idea of bisect is a binary search.

    The idea of bisect is based on several assumptions:

    1. There is a particular application behaviour (for simplicity, let's call it "defect") and you want to find its reason.
    2. Code analysis and debugging needs too much efforts.
    3. Building any version and running tests against it is much easier than code analysis and debugging.
    4. You know some code revision in the past without this defect.

    A straight forward approach is to build the previous revision and test if it has this defect. If has, then test one revision earlier. You continue until you find the revision that introduced this bug. You know all the changes in this commit. That's why it is easier to find what change exactly caused the defect. Also debugging can be much easier, because you can focus on (usually) a few changes.

    But if the revision known to have no defect is 200 revisions in the past before the recent revision, you may need to repeat "build and test" procedure 200 times (if you have no luck).

    Let's name the revision without defect R0, revision with defect R200.

    Bisect allows to make it faster. You "tell" bisect what revision has defect and what revision has no defect. Then bisect calculate the number of commits between these revisions. In our case these are 200 commits. Bisect takes a revision that is 100 revisions before the revision with defect and checks it out. Means, it checks out revision R100. You build it and test. Then you "tell" bisect if this revision contains defect or not. If there is defect, then it means it was introduced between versions R0 and R100. If there is no defect, it means that defect was introduced later, after revision R100, between R100 and R200.

    Suppose you told bisect there is defect in R100. Then bisect takes the middle of the range R0 - R100. This will be revision R50. Bisect checks it out. You build it and test and tell the result to bisect.

    Suppose R50 has no defect. Means, it was introduced between revisions R50 and R 100. Bisect takes again the middle, revision R75. You build it and test.

    Suppose R75 has defect. Means it was introduced between R50 and R75. Bisect checks out R63. Ans so on.

    In total you need log(200) = 8 steps. If you would check every version, you would build and test up to 200 revisions, which would take much longer.

    In case of Git, you first start bisect procedure with command

    git bisect start
    

    Then you check out revision with defect and run command

    git bisect bad
    

    Then you tell bisect what revision was without defect:

    git bisect good R0
    

    By this command Git bisect will find the middle, R100, and check it out. You build and test it and, if it contains defect, run command

    git bisect bad
    

    Otherwise command

    git bisect good
    

    And thus continue until you the revision found that introduced the defect.