In computer graphics, one of the most basic patterns is creating several buffers for vertex attributes and an index buffer that groups these attributes together.
In Rust, this basic pattern looks like this:
struct Position { ... }
struct Uv { ... }
struct Vertex {
pos: usize,
uv: usize
}
// v_vertex[_].pos is an index into v_pos
// v_vertex[_].uv is an index into v_uv
struct Object {
v_pos: Vec<Position>, // attribute buffer
v_uv: Vec<Uv>, // attribute buffer
v_vertex: Vec<Vertex> // index buffer
}
However, this pattern leaves a lot to be desired. Any operations to the attribute buffers that modifies existing data is going to be primarily concerned with making sure the index buffer isn't invalidated. In short, it leaves most of the promises the rust compiler is capable of making on the table.
Making a scheme like this work isn't impossible. For example, the struct could include a HashMap
that keeps track of changed indices and rebuilds the vectors when it gets too large. However, all the workarounds inevitably feel like another hack that doesn't address the underlying problem: there is no compile-checked guarantee that I'm not introducing data races or that the "reference" hasn't been invalidated somewhere else on accident.
When I first approached this problem when moving from C++ to Rust, I tried to make the Vertex
object hold references to the attributes. That looked something like this:
struct Position { ... }
struct Uv { ... }
struct Vertex {
pos: &Position,
uv: &Uv
}
// obj.v_vertex[_].pos is a reference to an element of obj.v_pos
// obj.v_vertex[_].uv is a reference to an element of obj.v_uv
// This causes a lot of problems; it's effectively impossible to use
struct Object {
v_pos: Vec<Position>,
v_uv: Vec<Uv>,
v_vertex: Vec<Vertex>
}
...Which threw me down deep into the rabbit holes of self-referential structs and why they cannot exist in safe Rust. After learning more about, it turns out that as I suspected, the original implementation hid a lot of unsafety pitfalls that were caught by the compiler when I started being more explicit.
I'm aware of the existence unsafe solutions like Pin
, but I feel like at that point I might as well stick to the original method.
This leads me to the core question: Is there an idiomatic way of representing this relationship? I want to be able to modify the contents of each of the Vec
s in a compiler-checked manner.
Is there an idiomatic way of representing this relationship?
The usize
s you started with are the idiomatic way of representing this relationship.
Any operations to the attribute buffers that modifies existing data is going to be primarily concerned with making sure the index buffer isn't invalidated. …
Yes; you should write those operations within the module that defines Object
, and keep the fields private so the Object
cannot become inconsistent as long as those operations are correctly defined.
In short, it leaves most of the promises the rust compiler is capable of making on the table.
It doesn't — because the Rust compiler is not actually capable of making those promises. &
and even &mut
references are actually very limited — they work by statically enforcing “Nobody (else) is going to change this value while you have the reference”. They don't have any bigger picture than that. In your case, assuming you're planning to edit this data, you will need to do operations that modify multiple parts in a consistent fashion, like “add a Position and also a Vertex that uses it”, or maybe “simultaneously add 3 vertices making up a triangle, using these 3 existing Positions”. References cannot help you do this correctly.
The only kind of data structure of this sort that you can in fact build using references is an append-only one, using the help of, for example, typed-arena
. This might be suitable for an algorithm which is building a mesh. However, given that it's append-only, there is very little benefit — the operation “append a vertex, choosing indices as you go” is easy to write correctly without references. Additionally, you won't be able to store the mesh constructed that way long-term (because it is made of vectors that borrow from the arena) unless you also throw in ouroboros
to wrap up the self-reference.
Fundamentally, references are designed to be used as temporary things — as a formalization and enforcement of common patterns used in C and C++ when passing and returning pointers — hence also being called “borrows”. The rules which the compiler understands about references are rules designed to handle those temporary uses. They are almost never what you should be building a data structure out of.