pythonbisection

Find where f(x) changes in a list with bisection


I'm trying to implement, in Python, something similar to git bisect, but with basically a list of directories.

I have a (long) list of version numbers like this: ['1.0', '1.14', '2.3', '3.1', '4']

I have a function works() which takes a version number, and returns a value.

[works(x) for x in my_list] would look like: ['foo', 'foo', 'foo', 'bar', 'bar'] ... but running works() is very expensive.

I would like to do some kind of bisect which will find the change boundary.


Solution

  • You could simply use binary search:

    def binary_f(f,list):
        frm = 0
        to = len(list)
        while frm < to:
            mid = (frm+to)>>1
            if f(list[mid]):
                to = mid
            else:
                frm = mid+1
        return frm
    

    It will return the first index i for which bool(f(list[i])) is True.

    Of course the function assumes that the the map of f on the list is of the form:

    f(list) == [False,False,...,False,True,True,...,True]
    

    If this is not the case, it will usually find a swap but which one is rather undefined.

    Say f is simply "the version is 2 or higher" so lambda v:v >= '2', then it will return:

    >>> binary_f(lambda v:v >= '2',['1.0', '1.14', '2.3', '3.1', '4'])
    2
    

    So index 2. In case the entire list would return with False objects, it will **return len(list). Since it "assumes" the element just outside the list will be evaluated to True:

    >>> binary_f(lambda v:v >= '4.2',['1.0', '1.14', '2.3', '3.1', '4'])
    5
    

    Of course in your example f is works.

    Experiments:

    >>> binary_f(lambda v:v >= '2',['1.0', '1.14', '2.3', '3.1', '4'])
    2
    >>> binary_f(lambda v:v >= '0',['1.0', '1.14', '2.3', '3.1', '4'])
    0
    >>> binary_f(lambda v:v >= '1',['1.0', '1.14', '2.3', '3.1', '4'])
    0
    >>> binary_f(lambda v:v >= '1.13',['1.0', '1.14', '2.3', '3.1', '4'])
    1
    >>> binary_f(lambda v:v >= '2.4',['1.0', '1.14', '2.3', '3.1', '4'])
    3
    >>> binary_f(lambda v:v >= '3',['1.0', '1.14', '2.3', '3.1', '4'])
    3
    >>> binary_f(lambda v:v >= '3.2',['1.0', '1.14', '2.3', '3.1', '4'])
    4
    >>> binary_f(lambda v:v >= '4.2',['1.0', '1.14', '2.3', '3.1', '4'])
    5
    

    (I here of course did a very cheap version check, but it works of course for more sophisticated predicates).

    Since this is binary search, it will run in O(log n) with n the number of items in the list whereas linear search can result in O(n) checks (which is usually more expensive).

    In case the list contains two values and you want to find the swap, you can simply first compute the value for index 0:

    val0 = f(list[0])
    

    and then provide binary_f:

    binary_f(lambda v:works(v) != val0,list)
    

    Or putting it into a nice function:

    def binary_f_val(f,list):
        val0 = f(list[0])
        return binary_f(lambda x:f(x) != val0,list)