I have the following code which does not compile:
// TODO: Return Result, remove `.expect`s
fn to_blender_subfile<'a>(
filepath: &str,
transform: Transform,
visited_file_cache: &'a mut HashMap<&'a str, Vec<Command>>,
blender: &mut Blender,
) {
// TODO: Support colors and backface culling
let commands = visited_file_cache.get(filepath).unwrap();
// "unwrap" is enforced by the wrapper function
for command in commands.iter() {
if let Command::Subfile { filepath, .. } = command {
if !visited_file_cache.contains_key(filepath.as_str()) {
let mut file = std::fs::File::open(filepath).expect("Failed to open file");
let mut file_contents = String::new();
let contents_len = file
.read_to_string(&mut file_contents)
.expect("Failed to read file into buffer");
assert_eq!(contents_len, file_contents.len()); // TODO: Check this
// "parse_file" is defined elsewhere
let commands = parse_file(&file_contents);
if let Some(_) = visited_file_cache.insert(filepath.as_str(), commands) {
std::unreachable!();
}
}
}
}
// ... rest of function
}
The error I get from rustc
is:
cannot borrow `*visited_file_cache` as mutable because it is also borrowed as immutable
referring to the fact that I both do visited_file_cache.insert(filepath.as_str(), commands)
and let commands = visited_file_cache.get(filepath).unwrap();
while claiming visited_file_cache: &'a mut HashMap<&'a str, Vec<Command>>
.
My understanding of what's going on, at least at a high level, is that the borrow checker is protecting me from mutating visited_file_cache
while also referring to its values for its keys, since in general its keys may survive its values which can lead to dangling pointers.
This is all well and good, however by the nature of the code, if a key-value pair has been inserted into visited_file_cache
it cannot be removed (i.e. it lives at least as long as the hashmap itself). I understand that the compiler cannot use this information, hence why it complains, but is there a way to express to the compiler that a region grows monotonically? I could then tell the compiler that visited_file_cache
is backed by a monotonic allocator and its insert
method is nondestructive, thus getting rid of the above error.
Can the compiler understand such structures, or are there predefined structures in the standard library that enforce the above behavior while using unsafe
?
I am aware that I can trivially circumvent this issue by copying strings and making the keys of visited_file_cache
owned, and I'm pretty sure I could also use reference counting. However, I don't think the infrastructure for either of those two solutions is necessary for my use case.
N.B. I am new to Rust (haven't looked the Rust Book yet, though I have looked at the Rustonomicon), but I have a C and Haskell background.
First of all, if you only want the code to compile, then indeed you can relatively easily get there with a bit of cloning & owned keys: see the playground.
It is a bit boring, though, isn't it?
So, what do we need to be able to iterate & refer to the key & values of a map while modifying the map?
(Not be mistaken with "lick it, lick it, lick it!", no geologist was hurt in the making of this answer)
In Rust, it's safe to leak. In fact, there's an explicit function to do so. So we're going to leak all things:
&'static str
.&'static [Command]
.Leaking a String
can be achieved by going:
let filename: Box<str> = _string_;
let filename = Box::leak(filename);
And leaking a slice can be achieved by going:
let commands: Box<[Command]> = _vec_;
let commands = Box::leak(commands);
And that's it. We can now have a HashMap<&'static str, &'static [Command]>
where filepath: &'static str
in Command::Subfile
and we're good to go.
Leaking is safe, so it's cool, but... it doesn't really lend itself to repetition. So we need another option.
Another option would be for the map to guarantee that inserted keys & values will (1) not be moved and (2) not be mutated (while shared).
What's pretty cool about Rust is that you can, in fact, expressed such invariants (shared vs not-shared) at the type-level, so we get:
impl<K, V> PushMap<K, V> {
// Only reads & `insert_if_new` are possible for the duration of the borrow.
fn get(&self, key: &K) -> Option<&V>;
// Returns `value` if the key is already present in the map.
//
// This is unlike `insert` (which returns the previous value), hence the
// the different name.
fn insert_if_new(&self, key: K, value: V) -> Option<(K, V)>;
// Mirror API of standard HashMap.
fn get_mut(&mut self, key: &K) -> Option<&mut V>;
fn insert(&mut self, key: K, value: V) -> Option<V>;
fn remove(&mut self, key: &K) -> Option<V>;
}
The use of insert_new
does require internal mutability. At the bottom this means UnsafeCell
, and the Cell
and RefCell
wrappers won't cut it here, so we'll need to make our own.
This also means that it will NOT be possible to shared &PushMap
across multiple threads (while it's possible with &HashMap
) unless we also make it a concurrent map. It's a trade-off.
The exact implementation is left as an exercise to the reader. From the literature, a typical hash table implementation, with linked-lists as buckets, will fit the bill, although there's overhead in using linked-lists. Otherwise, an implementation atop the Jagged Array concept is also possible, see the jagged
repository for an example.