rubybsearch

Using bsearch to find index for inserting new element into sorted array


I have a sorted unique array and want to efficiently insert an element into it that is not in the array like this:

a = [1,2,4,5,6]
new_elm = 3
insert_at = a.bsearch_index {|x| x > new_elm } # => 2
a.insert(insert_at, new_elm) # now a = [1,2,3,4,5,6]

The method bsearch_index doesn't exist: only bsearch, which returns the matching element rather than the matching element's index. Is there any built in way to accomplish this?


Solution

  • You can use the Enumerator object returned by each_with_index to return a nested array of [value, index] pairs and then conduct your binary search on that array:

    a = [1,2,4,5,6]
    new_elm = 3
    
    index = [*a.each_with_index].bsearch{|x, _| x > new_elm}.last
    => 2
    
    a.insert(index, new_elm)
    

    EDIT:

    I've run some simple benchmarks in response to your question with an array of length 1e6 - 1:

    require 'benchmark'
    
    def binary_insert(a,e)
      index = [*a.each_with_index].bsearch{|x, _| x > e}.last
      a.insert(index, e)
    end
    
    a = *1..1e6
    b = a.delete_at(1e5)
    => 100001
    
    Benchmark.measure{binary_insert(a,b)}
    => #<Benchmark::Tms:0x007fd3883133d8 @label="", @real=0.37332, @cstime=0.0, @cutime=0.0, @stime=0.029999999999999805, @utime=0.240000000000002, @total=0.2700000000000018> 
    

    With this in mind, you might consider trying using a heap or a trie instead of an array to store your values. Heaps in particular have constant insertion and removal time complexities, making them ideal for large storage applications. Check out this article here: Ruby algorithms: sorting, trie, and heaps