rust

HashMap with multiple keys


As a learning exercise, I thought of trying to make a Map type (e.q. HashMap) that had multiple keys and could look up values in several different ways. I thought I had an elegant solution, but I quickly found a glaring hole in it:

use std::borrow::Borrow;
use std::collections::HashMap;

/// Values used as keys in a Map, e.g. `HashMap<Key, _>`.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
struct Key {
    session_id: usize,
    screen_name: String,
}

/// Different ways to look up a value in a Map keyed by `Key`.
///
/// Each variant must constitute a unique index.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
enum KeyQuery<'s> {
    SessionId(usize),
    ScreenName(&'s str),
}

impl<'s> Borrow<KeyQuery<'s>> for Key {
    fn borrow(&self) -> &KeyQuery<'s> {
        // ???
    }
}

fn main() {
    let mut map: HashMap<Key, ()> = HashMap::new();
    map.insert(
        Key {
            session_id: 1248,
            screen_name: "nombe_hombre".to_string(),
        },
        (),
    );

    // Look up values by either field of Key.
    map.get(&KeyQuery::SessionId(1248));
    map.get(&KeyQuery::ScreenName("nombre_hombre"));
}

In plain English, it's easy to tell Rust, "If you need a KeyQuery::SessionId, here's how to borrow one from a Key," and the same goes for a KeyQuery::ScreenName. Is there a reasonable way to express this in code? I'm not sure if I'm missing something obvious or hitting a (probably intentional) limitation of the type system.

Trait objects?

I've saved the related question How To Implement Hash Map With Two Keys? for future reading. When reading through it briefly, I found myself a bit out of my depth. I can start to imagine how to write a simple trait with methods for each field of Key that implements PartialEq, Eq, and Hash. I'm not sure if this is the right/best answer, though. I'm hoping there might be a more direct approach.

Just use a database?

I'm aware that this is pushing the limits of what's sensible and that it might make more sense to simply query a database. I would still like to learn how best to express this idea directly in Rust.


Solution

  • There is a fundamental barrier preventing this from working, and its not due to anything in Rust. The way a hashmap queries for an element is to hash the key your searching for to derive some index that will point directly (or near directly - hashmap algorithms differ) to where the element is or would be. This means the hash of whatever key you want to query with must match what the hash of stored key you're looking for.

    For this to work in your desired structure, both the session_id and screen_name would have to hash to the exact same value as Key to satisfy the expectations from the hashmap. This will be impossible in the general case (unless you have an atrocious hash algorithm that returns essentially constants, but then a hashmap is of no benefit at all).

    In Rust you can see this concept expressed in the documentation for Borrow which is used for HashMap::get (emphasis mine):

    Further, when providing implementations for additional traits, it needs to be considered whether they should behave identically to those of the underlying type as a consequence of acting as a representation of that underlying type. Generic code typically uses Borrow<T> when it relies on the identical behavior of these additional trait implementations. These traits will likely appear as additional trait bounds.

    In particular Eq, Ord and Hash must be equivalent for borrowed and owned values: x.borrow() == y.borrow() should give the same result as x == y.

    The trick shown in the answer to How To Implement Hash Map With Two Keys? still works because both keys are always used when querying, its only the structure of references that differ. That will not work in this case.


    The way to express a hashmap with multiple lookup methods is to have multiple hashmaps (that usually refer to shared values as to not duplicate data).