I am finding extremely difficult to deal with vectors of HashMaps in rust. I want to use vectors of HashMaps as an implementation of a sparse matrix container according to constraints that I, hopefully, will be able to explain below.
Most importantly I want to minimize access and modification time, including amortized allocations. Then I want to reduce memory usage as much as possible without increasing the access time, or at least not significantly.
About execution time: I will definitely do a lot of lookups so I need the best access time possible which would be given by a vector as opposed to HashMap<(INDEX, KEY), Value> as mentioned in the comments. Correct me if I am wrong: hashing and collisions handling would definitely impact performance. My key is a complex struct. Although I am considering to simplify it in an enum for the key so maybe hashing will be cheaper in this case, but I am not sure if that will be possible.
About memory: I need to avoid unnecessary memory overhead with empty HashMap which requires 48 bytes in memory. My vector has size in the order of 100 million thus it can easily consume tens of GB. To get out of this wasteful situation, I opted to use Option<Box>. As I understand, I require nullability and cannot use Box directly in this case because this type doesn't allow for nullability, hence it will force me to use empty HashMap and reserve 48 bytes somewhere in memory plus an additional 8 bytes per element because of the Box itself. With Option<Box>, the None variant consumes only 8 bytes, resulting in up to 6 times less memory if all the elements were empty HashMaps.
I am very open to alternative approaches to a sparse matrix implementation that satisfies my needs, but in this question I want to focus about rust patterns for memory efficient representations of vector of hash maps that also avoid dealing with what seems to me unnecessarily verbose chains of Option and pointer types (Box/Rc and friends). Is there any better approach for handling vectors of nullable HashMaps in rust?
According to the Rust docs, creating an empty HashMap
does not incur a heap allocation. In fact, Rust containers have this behavior in general. There should be no need to wrap your HashMap
s in an Option
.
The fact that nullability must be made explicit in Rust is to ensure that API contracts are clear to both the library writer and the user. As with all things, "there is no free lunch" and making this explicit can require a few unwrap
s here and there.
Finally, note that while HashMap
is not Copy
it is Clone
if its keys, values, and state are Clone
(the default state is, in fact, clone). As stated in the Rust docs, Clone
"[d]iffers from Copy
in that Copy
is implicit and an inexpensive bit-wise copy, while Clone
is always explicit and may or may not be expensive."