rubybsearch

Ruby bsearch and bsearch_index bug?


While bsearch and bsearch_index work fine for arrays of ints, they seem to have a problem with arrays of strings.

sorted_strings = %w[aaa aab aac bbb bbc bbd ccc ccd cce]
#=> ["aaa", "aab", "aac", "bbb", "bbc", "bbd", "ccc", "ccd", "cce"]
sorted_strings.bsearch_index { |x| x.start_with? 'a' }
#=> nil
sorted_strings.bsearch_index { |x| x.start_with? 'aaa' }
#=> nil
sorted_strings.bsearch_index { |x| x.start_with? 'b' }
#=> 3
sorted_strings.bsearch_index { |x| x.start_with? 'c' }
#=> 6
sorted_strings.bsearch_index { |x| x.start_with? 'cce' }
#=> 8
sorted_strings.bsearch { |x| x.start_with? 'a' }
#=> nil
sorted_strings.bsearch { |x| x.start_with? 'aa' }
#=> nil
sorted_strings.bsearch { |x| x.start_with? 'aaa' }
#=> nil
sorted_strings.bsearch { |x| x.start_with? 'b' }
#=> "bbb"

Nil should never have been returned. Is this a bug?

Even crazier results are returned when comparing with the combined comparison operator:

sorted_strings = %w[aaa aab aac bbb bbc bbd ccc ccd cce]
#=> ["aaa", "aab", "aac", "bbb", "bbc", "bbd", "ccc", "ccd", "cce"]
sorted_strings.bsearch_index { |x| x <=> 'a' }
#=> nil
sorted_strings.bsearch_index { |x| x <=> 'aaa' }
#=> nil
sorted_strings.bsearch_index { |x| x <=> 'b' }
#=> nil
sorted_strings.bsearch_index { |x| x <=> 'bbb' }
#=> nil
sorted_strings.bsearch_index { |x| x <=> 'bbc' }
#=> 4

I do not understand why the following works, given that the above does not:

sorted_strings = %w[aaa aab aac bbb bbc bbd ccc ccd cce]
#=> ["aaa", "aab", "aac", "bbb", "bbc", "bbd", "ccc", "ccd", "cce"]
sorted_strings.bsearch_index { |x| x >= 'a' }
#=> 0
sorted_strings.bsearch_index { |x| x >= 'aa' }
#=> 0
sorted_strings.bsearch_index { |x| x >= 'aaa' }
#=> 0
sorted_strings.bsearch_index { |x| x >= 'b' }
#=> 3
sorted_strings.bsearch_index { |x| x >= 'bbc' }
#=> 4
sorted_strings.bsearch_index { |x| x >= 'cc' }
#=> 6
sorted_strings.bsearch_index { |x| x >= 'cce' }
#=> 8

This is the version of Ruby I used:

$ ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]

Solution

  • In order to use bsearch / bsearch_index, the elements have to be ordered in a specific way. According to the docs for Binary Searching:

    Find-Minimum Mode
    • [...] all false-evaluating elements precede all true-evaluating elements.
    Find-Any Mode
    • All positive-evaluating elements precede all zero-evaluating elements.
    • All positive-evaluating elements precede all negative-evaluating elements.
    • All zero-evaluating elements precede all negative-evaluating elements.

    However, in your examples the mapping is reversed, e.g.:

    sorted_strings.map { |x| x.start_with? 'a' }
    #=> [true, true, true, false, false, false, false, false, false]
    
    sorted_strings.map { |x| x <=> 'bbb' }
    #=> [-1, -1, -1, 0, 1, 1, 1, 1, 1]
    

    It might help to use map to find a mapping that can be used with Ruby's binary search methods.