pythoncomparisontuplesbisect

using bisect on list of tuples but compare using first value only


I read that question about how to use bisect on a list of tuples, and I used that information to answer that question. It works, but I'd like a more generic solution.

Since bisect doesn't allow to specify a key function, if I have this:

import bisect
test_array = [(1,2),(3,4),(5,6),(5,7000),(7,8),(9,10)]

and I want to find the first item where x > 5 for those (x,y) tuples (not considering y at all, I'm currently doing this:

bisect.bisect_left(test_array,(5,10000))

and I get the correct result because I know that no y is greater than 10000, so bisect points me to the index of (7,8). Had I put 1000 instead, it would have been wrong.

For integers, I could do

bisect.bisect_left(test_array,(5+1,))

but in the general case when there may be floats, how to to that without knowing the max values of the 2nd element?

test_array = [(1,2),(3,4),(5.2,6),(5.2,7000),(5.3,8),(9,10)]

I have tried this:

bisect.bisect_left(test_array,(min_value+sys.float_info.epsilon,))

and it didn't work, but I have tried this:

bisect.bisect_left(test_array,(min_value+sys.float_info.epsilon*3,))

and it worked. But it feels like a bad hack. Any clean solutions?


Solution

  • As of Python 3.10, bisect finally supports key! So if you're on 3.10 or up, just use key. But if you're not...

    bisect supports arbitrary sequences. If you need to use bisect with a key, instead of passing the key to bisect, you can build it into the sequence:

    class KeyList(object):
        # bisect doesn't accept a key function before 3.10,
        # so we build the key into our sequence.
        def __init__(self, l, key):
            self.l = l
            self.key = key
        def __len__(self):
            return len(self.l)
        def __getitem__(self, index):
            return self.key(self.l[index])
    

    Then you can use bisect with a KeyList, with O(log n) performance and no need to copy the bisect source or write your own binary search:

    bisect.bisect_right(KeyList(test_array, key=lambda x: x[0]), 5)