I'm trying to make an algorithm that finds the best position to insert the target into the already sorted array.
The goal is to either return the position of the item if it exists in the list, else return the position it would go into to keep the list sorted.
So say I have a list:
0 1 2 3 4 5 6
---------------------------------
| 1 | 2 | 4 | 9 | 10 | 39 | 100 |
---------------------------------
And my target item is 14
It should return an index position of 5
Pseudo-code I currently have:
array = generateSomeArrayOfOrderedNumbers()
number findBestIndex(target, start, end)
mid = abs(end - start) / 2
if (mid < 2)
// Not really sure what to put here
return start + 1 // ??
if (target < array[mid])
// The target belongs on the left side of our list //
return findBestIndex(target, start, mid - 1)
else
// The target belongs on the right side of our list //
return findBestIndex(target, mid + 1, end)
I not really sure what to put at this point. I tried to take a binary search approach to this, but this is the best I could come up with after 5 rewrites or so.
There's several problems with your code:
mid = abs(end - start) / 2
This is not the middle between start
and end
, it's half the distance between them (rounded down to an integer). Later you use it like it was indeed a valid index:
findBestIndex(target, start, mid - 1)
Which it is not. You probably meant to use mid = (start + end) // 2
or something here.
You also miss a few indices because you skip over the mid:
return findBestIndex(target, start, mid - 1)
...
return findBestIndex(target, mid + 1, end)
Your base case must now be expressed a bit differently as well. A good candidate is the condition
if start == end
Because now you definitely know you're finished searching. Note that you also should consider the case where all the array elements are smaller than target
, so you need to insert it at the end.
Binary search is something that is surprisingly hard to get right if you've never done it before. I usually use the following pattern if I do a binary search:
lo, hi = 0, n // [lo, hi] is the search range, but hi will never be inspected.
while lo < hi:
mid = (lo + hi) // 2
if check(mid): hi = mid
else: lo = mid + 1
Under the condition that check
is a monotone binary predicate (it is always false
up to some point and true
from that point on), after this loop, lo == hi
will be the first number in the range [0..n]
with check(lo) == true
. check(n)
is implicitely assumed to be true (that's part of the magic of this approach).
So what is a monotone predicate that is true
for all indices including and after our target position and false
for all positions before?
If we think about it, we want to find the first number in the array that is larger than our target, so we just plug that in and we're good to go:
lo, hi = 0, n
while lo < hi:
mid = (lo + hi) // 2
if (a[mid] > target): hi = mid
else: lo = mid + 1
return lo;