rustownership

Entry::Occupied.get() returns a value referencing data owned by the current function even though hashmap should have the ownership


My goal was to implement the suggested improvement on the cacher struct of the Rust book chapter 13.1, that is, creating a struct which takes a function and uses memoization to reduce the number of calls of the given function. To do this, I created a struct with a HashMap and two methods, one constructor and one which is responsible of the memoization:

use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

struct Cacher<T, U, V>
where
    T: Fn(&U) -> V,
    U: Eq + Hash,
{
    calculation: T,
    map: HashMap<U, V>,
}

impl<T, U, V> Cacher<T, U, V>
where
    T: Fn(&U) -> V,
    U: Eq + Hash,
{
    fn new(calculation: T) -> Cacher<T, U, V> {
        Cacher {
            calculation,
            map: HashMap::new(),
        }
    }

    fn value(&mut self, arg: U) -> &V {
        match self.map.entry(arg) {
            Entry::Occupied(occ_entry) => occ_entry.get(),
            Entry::Vacant(vac_entry) => {
                let arg_ref = vac_entry.key();
                let result = (self.calculation)(arg_ref);
                vac_entry.insert(result)
            }
        }
    }
}

I used the Entry enum, because I didn't find a better way of deciding if the HashMap contains a key and – if it doesn't – calculating the value and inserting it into the HashMap as well as returning a reference to it.

If I want to compile the code above, I get an error which says that occ_entry is borrowed by its .get() method (which is fine by me) and that .get() "returns a value referencing data owned by the current function".

My understanding is that the compiler thinks that the value which occ_entry.get() is referencing is owned by the function value(...). But shouldn't I get a reference of the value of type V, which is owned by the HashMap? Is the compiler getting confused because the value is owned by the function and saved as result for a short moment?

let result = (self.calculation)(arg_ref);
vac_entry.insert(result)

Please note that it is necessary to save the result temporarily because the insert method consumes the key and such arg_ref is not valid anymore. Also I acknowledge that the signature of value can be problematic (see Mutable borrow from HashMap and lifetime elision) but I tried to avoid a Copy trait bound.


Solution

  • Let's take a look at OccupiedEntry::get()'s signature:

    pub fn get(&self) -> &V
    

    What this signature is telling us is that the reference obtained from the OccupiedEntry can only live as long as the OccupiedEntry itself. However, the OccupiedEntry is a local variable, thus it's dropped when the function returns.

    What we want is a reference whose lifetime is bound to the HashMap's lifetime. Both Entry and OccupiedEntry have a lifetime parameter ('a), which is linked to the &mut self parameter in HashMap::entry. We need a method on OccupiedEntry that returns a &'a V. There's no such method, but there's one that returns a &'a mut V: into_mut. A mutable reference can be implicitly coerced to a shared reference, so all we need to do to make your method compile is to replace get() with into_mut().

    fn value(&mut self, arg: U) -> &V {
        match self.map.entry(arg) {
            Entry::Occupied(occ_entry) => occ_entry.into_mut(),
            Entry::Vacant(vac_entry) => {
                let arg_ref = vac_entry.key();
                let result = (self.calculation)(arg_ref);
                vac_entry.insert(result)
            }
        }
    }
    

    Rust 1.50 introduced the or_insert_with_key method on Entry. Using this method, the code can be drastically simplified:

    fn value(&mut self, arg: U) -> &V {
        self.map
            .entry(arg)
            .or_insert_with_key(|arg_ref| (self.calculation)(arg_ref))
    }