ruby algorithm hash transformation normalization

Fill or complete a hash in Ruby with all sibling keys


I have a Rails API where I can query vehicles, count them, and group them by different attributes. I would like to fill the response with zero values when a group-by is used.

Here is a simple example:

data = {
  AUDI: {
    Petrol: 379,
    Diesel: 326,
    Electric: 447
  },
  TESLA: {
    Electric: 779
  }
}

Since Tesla doesn't have any petrol or diesel cars, those keys aren't included in the response at all. I believe this is a consequence of the GROUP BY in PostgreSQL, which doesn't return groups with a zero count.
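The missing keys can be reproduced without a database. The sketch below is only an in-memory analogy, not the actual Rails query: `vehicles` is a hypothetical list of [make, fuel] rows, and `tally` (Ruby 2.7+) plays the role of the grouped count. Pairs that never occur never become keys.

```ruby
# Hypothetical rows standing in for the vehicles table: [make, fuel].
vehicles = [
  [:AUDI, :Petrol], [:AUDI, :Diesel], [:AUDI, :Electric],
  [:TESLA, :Electric], [:TESLA, :Electric]
]

# Count each (make, fuel) pair, then nest the counts by make.
counts = vehicles.tally.each_with_object({}) do |((make, fuel), n), acc|
  (acc[make] ||= {})[fuel] = n
end

p counts                          # TESLA only has an :Electric entry
p counts[:TESLA].key?(:Petrol)    # the pair never occurred, so no key exists
```

Nothing in the counting step knows which fuel types exist elsewhere, which is why the zeros have to be filled in afterwards.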

I'm trying to create a function that can "fill" the hash with these missing zero values, e.g.

fill_hash(data)

should return

data = {
  AUDI: {
    Petrol: 379,
    Diesel: 326,
    Electric: 447
  },
  TESLA: {
    Petrol: 0,
    Diesel: 0,
    Electric: 779
  }
}

I got this simple case working with three methods:

def collect_keys(hash, keys = [])
  hash.each do |_, value|
    if value.is_a?(Hash)
      keys.concat(value.keys)
      collect_keys(value, keys)
    end
  end
  keys.uniq
end

def fill_missing_keys(hash, all_keys, default_value = 0)
  hash.each do |_, value|
    if value.is_a?(Hash)
      all_keys.each do |k|
        value[k] = default_value unless value.key?(k)
      end
      fill_missing_keys(value, all_keys, default_value)
    end
  end
  hash
end

def fill_hash(hash, default_value = 0)
  all_keys = collect_keys(hash)
  fill_missing_keys(hash, all_keys, default_value)
end

# Example usage:
hash = { a: { x: 1 }, b: { y: 1 } }
filled_hash = fill_hash(hash)
puts filled_hash
# Output should be: {:a=>{:x=>1, :y=>0}, :b=>{:y=>1, :x=>0}}

My problem is that the hashes can get more complicated. The simple case only grouped by 2 attributes, but here is an example where we group by 3 attributes.

{
  AUDI: {
    Deregistered: {
      Diesel: 56
    },
    Registered: {
      Petrol: 379,
      Diesel: 270,
      Electric: 447
    }
  },
  TESLA: {
    Registered: {
      Electric: 779
    }
  }
}

My desired output is:

{
  AUDI: {
    Deregistered: {
      Petrol: 0,
      Diesel: 56,
      Electric: 0
    },
    Registered: {
      Petrol: 379,
      Diesel: 270,
      Electric: 447
    }
  },
  TESLA: {
    Deregistered: {
      Petrol: 0,
      Diesel: 0,
      Electric: 0
    },
    Registered: {
      Petrol: 0,
      Diesel: 0,
      Electric: 779
    }
  }
}

I.e., keys are missing in both the last and the second-to-last layer.
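To make the failure concrete, here is what the two-level helpers do with the three-level input. The three methods are copied from the question; only the variable names `nested` and `result` are new. Because collect_keys gathers keys from every depth into one flat list, fill_missing_keys then writes that whole list into every nested hash, so fuel types and registration states end up mixed across levels.

```ruby
def collect_keys(hash, keys = [])
  hash.each do |_, value|
    if value.is_a?(Hash)
      keys.concat(value.keys)
      collect_keys(value, keys)
    end
  end
  keys.uniq
end

def fill_missing_keys(hash, all_keys, default_value = 0)
  hash.each do |_, value|
    if value.is_a?(Hash)
      all_keys.each do |k|
        value[k] = default_value unless value.key?(k)
      end
      fill_missing_keys(value, all_keys, default_value)
    end
  end
  hash
end

def fill_hash(hash, default_value = 0)
  all_keys = collect_keys(hash)
  fill_missing_keys(hash, all_keys, default_value)
end

nested = {
  AUDI: {
    Deregistered: { Diesel: 56 },
    Registered: { Petrol: 379, Diesel: 270, Electric: 447 }
  },
  TESLA: {
    Registered: { Electric: 779 }
  }
}

result = fill_hash(nested) # note: this mutates `nested` in place

p result[:AUDI].keys           # fuel types now sit next to the registration states
p result[:TESLA][:Deregistered] # an Integer default instead of a nested hash
```

So the key set has to be collected per level rather than once for the whole structure.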


Solution

  • Here's what I came up with, which seems to satisfy the request, including an identical key order in every sibling hash.

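    # Requires ActiveSupport (Hash#deep_merge) and Ruby 3.1+ for the
    # `default_value:` keyword shorthand.
    #
    # Builds a "default" hash with every key at every level set to default_value.
    # For the two-level example this yields:
    # {:AUDI=>{:Petrol=>0, :Diesel=>0, :Electric=>0},
    #  :TESLA=>{:Petrol=>0, :Diesel=>0, :Electric=>0}}
    def build_default_hash(obj, default_value: 0)
      # Leaf level: replace every count with the default.
      return obj.transform_values { default_value.dup } unless obj.values.first.is_a?(Hash)

      # Merging all sibling hashes gives the full key set for the next level.
      m = obj.values.reduce(&:deep_merge)
      obj.keys.product([build_default_hash(m, default_value:)]).to_h
    end

    # Deep-merges the existing hash over the default hash, so real counts win.
    def deep_normalize_hash(h, default_value: 0)
      build_default_hash(h, default_value:).deep_merge(h)
    end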
    

    Usage (working example):

    h = {
      AUDI: {
        Deregistered: {
          Diesel: 56
        },
        Registered: {
          Petrol: 379,
          Diesel: 270,
          Electric: 447
        }
      },
      TESLA: {
        Registered: {
          Electric: 779
        }
      }
    }
    deep_normalize_hash(h)
    #=> {:AUDI=>
    #     {:Deregistered=>{:Diesel=>56, :Petrol=>0, :Electric=>0},
    #      :Registered=>{:Diesel=>270, :Petrol=>379, :Electric=>447}},
    #    :TESLA=>
    #     {:Deregistered=>{:Diesel=>0, :Petrol=>0, :Electric=>0},
    #      :Registered=>{:Diesel=>0, :Petrol=>0, :Electric=>779}}}
    

    You could easily blend this into the Hash class so that usage becomes h.deep_normalize. I would generally discourage doing so, but given that this is in a Rails context, no one would even notice a little more core class manipulation.
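As a rough illustration of what blending this into Hash could look like, here is a minimal sketch that reopens Hash and adds deep_normalize. It is a variant under stated assumptions, not the exact code above: the helper names deep_normalize_template and deep_normalize_merge are made up for this example, a small plain-Ruby recursive merge stands in for ActiveSupport's deep_merge so the snippet runs outside Rails, and every branch of the hash is assumed to have the same depth.

```ruby
# Sketch only: reopens the core Hash class. Helper names are hypothetical.
class Hash
  # Returns a new hash in which sibling hashes at every depth share one key set.
  def deep_normalize(default_value: 0)
    deep_normalize_template(default_value).deep_normalize_merge(self)
  end

  protected

  # Builds a hash of the same shape with every key present and every leaf
  # set to default_value.
  def deep_normalize_template(default_value)
    return transform_values { default_value } unless values.first.is_a?(Hash)

    # Merging all siblings yields the full key set for the next level down.
    merged = values.reduce { |a, b| a.deep_normalize_merge(b) }
    # Build a fresh child per key so keys do not share one template object.
    keys.to_h { |k| [k, merged.deep_normalize_template(default_value)] }
  end

  # Plain-Ruby recursive merge (stand-in for ActiveSupport's deep_merge).
  def deep_normalize_merge(other)
    merge(other) do |_key, mine, theirs|
      mine.is_a?(Hash) && theirs.is_a?(Hash) ? mine.deep_normalize_merge(theirs) : theirs
    end
  end
end

h = {
  AUDI: {
    Deregistered: { Diesel: 56 },
    Registered: { Petrol: 379, Diesel: 270, Electric: 447 }
  },
  TESLA: {
    Registered: { Electric: 779 }
  }
}

p h.deep_normalize
```

The original hash is left untouched because only the non-destructive merge is used. Outside of a quick script, a refinement or a plain module function would keep the change out of the global Hash namespace.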