I read that question about how to use bisect on a list of tuples, and I used that information to answer that question. It works, but I'd like a more generic solution.
Since bisect doesn't allow specifying a key function, if I have this:
import bisect
test_array = [(1,2),(3,4),(5,6),(5,7000),(7,8),(9,10)]
and I want to find the first item where x > 5 for those (x,y) tuples (not considering y at all), I'm currently doing this:
bisect.bisect_left(test_array,(5,10000))
and I get the correct result because I know that no y is greater than 10000, so bisect points me to the index of (7,8). Had I put 1000 instead, it would have been wrong.
For integers, I could do
bisect.bisect_left(test_array,(5+1,))
but in the general case, when there may be floats, how can I do that without knowing the maximum value of the second element?
test_array = [(1,2),(3,4),(5.2,6),(5.2,7000),(5.3,8),(9,10)]
I have tried this:
bisect.bisect_left(test_array,(min_value+sys.float_info.epsilon,))
and it didn't work, but I have tried this:
bisect.bisect_left(test_array,(min_value+sys.float_info.epsilon*3,))
and it worked. But it feels like a bad hack. Any clean solutions?
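A likely reason the first attempt fails while the second works: sys.float_info.epsilon is the gap between adjacent floats near 1.0, and floats between 4 and 8 are spaced four times further apart, so adding a single epsilon rounds straight back to the starting value. The sketch below assumes min_value is 5.2 (the snippets above never define it) and Python 3.9+ for math.nextafter. It steps to the next representable float instead of guessing a multiplier.

```python
import bisect
import math
import sys

test_array = [(1, 2), (3, 4), (5.2, 6), (5.2, 7000), (5.3, 8), (9, 10)]
min_value = 5.2  # assumed value; not defined in the original snippets

eps = sys.float_info.epsilon

# One epsilon is a quarter of the float spacing near 5.2, so it rounds away.
one_eps_changes_value = (min_value + eps) != min_value        # False
# Three epsilons are past the halfway point, so the sum rounds up.
three_eps_changes_value = (min_value + 3 * eps) != min_value  # True

# Smallest float strictly greater than min_value, whatever its magnitude.
next_up = math.nextafter(min_value, math.inf)

# The 1-tuple (next_up,) sorts after every (5.2, y) and before (5.3, 8).
index = bisect.bisect_left(test_array, (next_up,))
print(index, test_array[index])  # 4 (5.3, 8)
```

This removes the magic multiplier, but it still only works when x is a float.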
As of Python 3.10, bisect finally supports key! So if you're on 3.10 or up, just use key. But if you're not...
bisect supports arbitrary sequences. If you need to use bisect with a key, instead of passing the key to bisect, you can build it into the sequence:
class KeyList(object):
    # bisect doesn't accept a key function before 3.10,
    # so we build the key into our sequence.
    def __init__(self, l, key):
        self.l = l
        self.key = key

    def __len__(self):
        return len(self.l)

    def __getitem__(self, index):
        return self.key(self.l[index])
Then you can use bisect with a KeyList, with O(log n) performance and no need to copy the bisect source or write your own binary search:
bisect.bisect_right(KeyList(test_array, key=lambda x: x[0]), 5)