
Ruby: Differences in object identity of Integers


Why does 1.equal?(1) return true, while (2**100).equal?(2**100) returns false?

And should this be considered a bug?


Solution

  • Ruby used to have separate Fixnum and Bignum classes, which were unified into the Integer class in Ruby 2.4. However, smaller integers are still stored differently from larger ones. On 64-bit C Ruby, the small representation covers -2**62 through 2**62 - 1; anything outside that range uses the large representation.

    The 64-bit MRI / YARV implementation of Ruby has a three-tier object storage model:

    1. An 8-byte node that encodes a TINY or IMMEDIATE object directly inside itself, OR is a pointer to...

    2. A 40-byte RVALUE structure, otherwise known as a slot, which can fully contain a SMALL object as an IMMEDIATE value, OR holds the first 40 bytes (data and pointer) of…

    3. Something bigger, which uses the RVALUE for the initial part of the object plus a pointer to a heap memory block from malloc, sized for the object.

    What fits into a TINY node as an immediate value? Most floats, Boolean values, nil, short symbols, and signed integers from -2**62 through 2**62 - 1. Anything else is stored as a pointer to a larger object.

    What fits into an RVALUE as an RVALUE IMMEDIATE value? Smallish Bignum-type integers just beyond the immediate range, short strings, and longer symbols. Otherwise, the RVALUE holds a pointer to the rest of the object in heap memory allocated from the OS. Larger Bignum-type integers may need such an allocation.
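As a rough way to see which kinds of values behave as immediates, the sketch below builds each value twice and asks whether the two results are the same object. The `producers` hash and its labels are made up for this illustration, and the results in the comments assume 64-bit MRI.

```ruby
# Each lambda builds a fresh value every time it is called.
# The labels and the hash itself are illustrative, not part of any Ruby API.
producers = {
  "nil"           => -> { nil },
  "small integer" => -> { 40 + 2 },
  "symbol"        => -> { :name },
  "float"         => -> { 1.5 * 2 },            # immediate only on 64-bit MRI
  "big integer"   => -> { 2**100 },
  "string"        => -> { String.new("text") }
}

# Call each producer twice and compare the two results by identity.
same_object = producers.transform_values { |make| make.call.equal?(make.call) }

same_object.each { |label, same| puts "#{label}: #{same}" }
# On 64-bit MRI, nil, the small integer and the symbol report true,
# while the big integer and the string report false.
```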

    Using ObjectSpace, you can see the crossover of memory usage:

    irb(main):080> require 'objspace'
    => false
    irb(main):081> ObjectSpace.memsize_of(2**62-1)  # immediate value
    => 0
    irb(main):082> ObjectSpace.memsize_of(2**62)    # RVALUE IMMEDIATE
    => 40
    irb(main):083> ObjectSpace.memsize_of(2**100)   # RVALUE IMMEDIATE
    => 40
    irb(main):084> ObjectSpace.memsize_of(2**1000)  # RVALUE POINTER
    => 168
    

    The .equal? method checks object identity, not just mathematical equality: it returns false unless both operands are the very same object. The only way for two separate math operations to yield the same object is for Ruby to recognize "I have seen this value before" and hand back the object it used previously. This is known as interning.
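A minimal sketch of the difference between value equality and identity, using variable names chosen only for this example. The comments describe what MRI does; other implementations may differ.

```ruby
a = 2**100
b = 2**100   # same value, computed by a separate operation
c = a        # a second reference to the object held by a

value_equal   = (a == b)                      # compares the numbers
separate_same = a.equal?(b)                   # compares the objects: false on MRI
alias_same    = a.equal?(c)                   # same object through two names: true
ids_match     = (a.object_id == b.object_id)  # false on MRI, for the same reason

puts value_equal, separate_same, alias_same, ids_match
```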

    Observationally, any Integer that is not stored in a TINY node as an immediate value appears to be treated as a distinct object, regardless of whether it fits in the RVALUE or lives in memory the RVALUE points to. Since TINY objects are essentially tagged machine words, two equal values are the same word, so identity comes for free. Cataloging every larger Integer in order to intern it would likely be inefficient.
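The crossover can also be located by observation rather than from the memory sizes. The helper below, `first_non_identical_power`, is a hypothetical name for this sketch; it searches for the smallest power of two that stops being identical to itself when computed twice.

```ruby
# Returns the smallest n in 1..limit for which two separately computed
# copies of 2**n are different objects, or nil if none is found.
def first_non_identical_power(limit = 128)
  (1..limit).find { |n| !(2**n).equal?(2**n) }
end

n = first_non_identical_power
puts "2**#{n} is the first power of two that is not identical to itself"
# On 64-bit MRI this is expected to be 62, matching the memsize_of crossover.
```

This only reports observed behavior on the running interpreter; it is not a documented guarantee.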

    By the same reasoning that led to Fixnum and Bignum being hidden behind Integer, using .equal? with Integers is simply not useful. It appears to work for smaller values: (2**(99-89)).equal? 2**(1+9) returns true, which looks like interning. But that is not guaranteed behavior. It is only observed behavior that could change at any time.

    Ruby prohibits singleton methods on instances of the Integer class. That might lead one to believe that the interning behavior is the reason. Maybe. It leads me to believe that the interning behavior seen with the smaller, immediate integers may become the promised Integer behavior in the future. So far, that is not the case.
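The singleton-method prohibition can be checked directly. In this sketch, `singleton_methods_allowed?` is a helper invented for the example; it relies on `define_singleton_method` raising TypeError for Integers, which is what MRI does for both small and large values.

```ruby
# Tries to attach a singleton method and reports whether Ruby allowed it.
def singleton_methods_allowed?(obj)
  obj.define_singleton_method(:probe) { :ok }
  true
rescue TypeError
  false
end

plain_object  = singleton_methods_allowed?(Object.new)  # ordinary objects accept them
small_integer = singleton_methods_allowed?(1)           # rejected
big_integer   = singleton_methods_allowed?(2**100)      # rejected as well

puts plain_object, small_integer, big_integer
```

Note that the large Integer is rejected too, even though it is not an immediate value, so the prohibition applies to the whole class rather than only to the interned-looking range.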

    Use eql? or == to compare two integers.
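A short sketch of how the two recommended comparisons behave; the variable names are illustrative. The main difference is that == converts between numeric types, while eql? also requires the same class.

```ruby
big_a = 2**100
big_b = 2**100

loose_big    = (big_a == big_b)    # true: same value
strict_big   = big_a.eql?(big_b)   # true: same value and same class
loose_mixed  = (1 == 1.0)          # true: == compares across numeric types
strict_mixed = 1.eql?(1.0)         # false: Integer vs Float

# Hash lookups use eql? and hash, so equal big integers share one key
# even though they are different objects.
counts = Hash.new(0)
counts[big_a] += 1
counts[big_b] += 1

puts loose_big, strict_big, loose_mixed, strict_mixed, counts.size
```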