
Accessing elements of nested hashes in Ruby


I'm working on a little utility written in Ruby that makes extensive use of nested hashes. Currently, I'm checking access to nested hash elements as follows:

structure = { :a => { :b => 'foo' }}

# I want structure[:a][:b]

value = nil

if structure.has_key?(:a) && structure[:a].has_key?(:b)
  value = structure[:a][:b]
end

Is there a better way to do this? I'd like to be able to say:

value = structure[:a][:b]

And get nil if :a is not a key in structure, etc.


Solution

  • The way I usually do this these days is:

    h = Hash.new { |h,k| h[k] = {} }
    

    This will give you a hash that creates a new hash as the entry for a missing key, but returns nil for a missing key at the second level:

    h['foo']        # => {}
    h['foo']['bar'] # => nil
    

    You can nest this to add multiple layers that can be addressed this way:

    h = Hash.new { |h, k| h[k] = Hash.new { |hh, kk| hh[kk] = {} } }
    
    h['bar']                 # => {}
    h['tar']['zar']          # => {}
    h['scar']['far']['mar']  # => nil
    

    You can also chain indefinitely by using the default_proc method:

    h = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
    
    h['bar']                 # => {}
    h['tar']['star']['par']  # => {}
    

    The above code creates a hash whose default proc creates a new Hash with the same default proc. As a result, any hash created on a lookup of an unseen key has the same default behavior, no matter how deep it sits.
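
    As a rough sketch of why this is handy, the snippet below wraps the one-liner in a helper and assigns to a deeply nested key without creating the intermediate hashes by hand. The helper name autovivifying_hash and the keys are made up for illustration; they are not from the original answer.

    ```ruby
    # Hypothetical helper name; the body is the one-liner shown above.
    def autovivifying_hash
      Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
    end

    settings = autovivifying_hash

    # :db and :primary do not exist yet; each lookup creates them on the way down.
    settings[:db][:primary][:host] = 'localhost'

    puts settings[:db][:primary][:host]   # prints localhost
    puts settings.keys.inspect            # prints [:db]
    ```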

    EDIT: More details

    Ruby hashes allow you to control how default values are created when a lookup occurs for a new key. When specified, this behavior is encapsulated as a Proc object and is reachable via the default_proc and default_proc= methods. The default proc can also be specified by passing a block to Hash.new.
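
    Because default_proc= is a setter, the same behavior can also be attached to a hash that already exists, such as the literal from the question. A minimal sketch, assuming a made-up helper name with_recursive_default:

    ```ruby
    # Hypothetical helper: installs the recursive default proc on an existing hash.
    def with_recursive_default(hash)
      hash.default_proc = proc { |h, k| h[k] = Hash.new(&h.default_proc) }
      hash
    end

    structure = with_recursive_default({ :a => { :b => 'foo' } })

    puts structure[:a][:b]          # prints foo
    puts structure[:x][:y].inspect  # prints {}

    # Caveat: the inner literal { :b => 'foo' } was created without a default proc,
    # so a missing key inside it still yields nil.
    puts structure[:a][:c].inspect  # prints nil
    ```

    Only the hash that receives the setter call, and the hashes it creates afterwards, get the recursive behavior; nested hashes that were already present keep whatever default they had.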

    Let's break this code down a little. This is not idiomatic Ruby, but the code is easier to explain when broken out into multiple lines:

    1. recursive_hash = Hash.new do |h, k|
    2.   h[k] = Hash.new(&h.default_proc)
    3. end
    

    Line 1 declares a variable recursive_hash to be a new Hash and begins a block to be recursive_hash's default_proc. The block is passed two objects: h, which is the Hash instance the key lookup is being performed on, and k, the key being looked up.

    Line 2 stores a new Hash instance in the hash under the missing key k. The default behavior for this new hash is supplied by passing the default_proc of the hash the lookup is occurring in, i.e., the very default proc that the block itself is defining.
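
    The assignment h[k] = ... in line 2 is what makes the new hash stick. As an illustrative contrast (the helper names are invented here and are not part of the answer), a block that merely returns a value leaves the original hash untouched:

    ```ruby
    # Block returns a fresh hash but does not store it under the key.
    def non_storing_hash
      Hash.new { |_hash, _key| {} }
    end

    # Block stores the fresh hash under the missing key, like line 2 above.
    def storing_hash
      Hash.new { |hash, key| hash[key] = {} }
    end

    a = non_storing_hash
    a[:foo][:bar] = 1          # lands in a throwaway hash
    puts a.key?(:foo)          # prints false

    b = storing_hash
    b[:foo][:bar] = 1          # lands in the hash stored at b[:foo]
    puts b[:foo][:bar]         # prints 1
    ```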

    Here's an example from an IRB session:

    irb(main):011:0> recursive_hash = Hash.new do |h,k|
    irb(main):012:1* h[k] = Hash.new(&h.default_proc)
    irb(main):013:1> end
    => {}
    irb(main):014:0> recursive_hash[:foo]
    => {}
    irb(main):015:0> recursive_hash
    => {:foo=>{}}
    

    When the hash at recursive_hash[:foo] was created, its default_proc was supplied by recursive_hash's default_proc. This has two effects:

    1. The default behavior for recursive_hash[:foo] is the same as that of recursive_hash.
    2. The default behavior for hashes created by recursive_hash[:foo]'s default_proc will be the same as that of recursive_hash.
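
    One way to check both effects is to compare the default_proc objects at each level. This is only a sketch; the factory method new_recursive_hash is a name chosen for this example:

    ```ruby
    # Hypothetical factory wrapping the hash definition discussed above.
    def new_recursive_hash
      Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
    end

    root = new_recursive_hash
    child = root[:foo]        # created by root's default proc
    grandchild = child[:bar]  # created by child's default proc

    # Every level reports the very same Proc object as its default proc.
    puts child.default_proc.equal?(root.default_proc)       # prints true
    puts grandchild.default_proc.equal?(root.default_proc)  # prints true
    ```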

    So, continuing in IRB, we get the following:

    irb(main):016:0> recursive_hash[:foo][:bar]
    => {}
    irb(main):017:0> recursive_hash
    => {:foo=>{:bar=>{}}}
    irb(main):018:0> recursive_hash[:foo][:bar][:zap]
    => {}
    irb(main):019:0> recursive_hash
    => {:foo=>{:bar=>{:zap=>{}}}}
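
    As the session shows, merely reading a key stores an empty hash under it. If that side effect is unwanted when probing for a key, key? and fetch can be used instead, since neither invokes the default proc. A small sketch, again with an invented helper name:

    ```ruby
    # Hypothetical factory wrapping the hash definition discussed above.
    def new_recursive_hash
      Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
    end

    h = new_recursive_hash

    puts h.key?(:foo)                # prints false
    puts h.fetch(:foo, nil).inspect  # prints nil
    puts h.empty?                    # prints true, nothing was created

    h[:foo]                          # a plain lookup runs the default proc
    puts h.key?(:foo)                # prints true
    ```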