I'm trying to make a thread-safe data type, with locking full encapsulated so that callers never see or directly hold a lock. (This part works fine.)
Further, there is a requirement that the struct which generates an instance of this data type hold a mutable reference and gives out a mutable reference to the caller. (This is what I'm having trouble with.)
Here is example code that pretty well distills down the problem:
use std::sync::{Arc, RwLock};
use std::cell::RefCell;
trait ChangeName {
fn change_name(&mut self, name: String);
}
// MyDataPrivate contains the data I care about.
#[derive(Debug)]
struct MyDataPrivate {
name: String,
}
impl ChangeName for MyDataPrivate {
fn change_name(&mut self, name: String) {
self.name = name;
}
}
// MyData encapsulates locking around MyData.
// Callers never hold a lock directly.
#[derive(Debug)]
pub struct MyData {
inner: RwLock<MyDataPrivate>,
}
impl ChangeName for MyData {
fn change_name(&mut self, name: String) {
self.inner.write().unwrap().change_name(name);
}
}
impl MyData {
fn new(name: String) -> Self {
Self {
inner: RwLock::new(MyDataPrivate{name})
}
}
}
// MyDataRef provides mutable references to MyData.
#[derive(Debug, Clone)]
pub struct MyDataRef(Arc<RefCell<MyData>>);
impl ChangeName for MyDataRef {
fn change_name(&mut self, name: String) {
self.0.borrow_mut().change_name(name);
}
}
// This makes it build. Is it Safe/Ok since MyData is Send + Sync?
unsafe impl Sync for MyDataRef {}
unsafe impl Send for MyDataRef {}
fn do_stuff<T: Sync + Send>(_t: T) {}
fn main() {
let mydata = MyData::new("Donner".to_string());
let mut mydata_ref = MyDataRef(Arc::new(RefCell::new(mydata)));
do_stuff(mydata_ref.clone());
mydata_ref.change_name("Rudolf".to_string());
println!("name: {:?}", mydata_ref);
}
See Playground.
The above code will not compile without using unsafe
to impl Send + Sync for MyDataRef
. This is because RefCell
is not thread-safe.
However, the data I put inside the RefCell
is thread-safe (protected by RwLock
). So my belief is that makes this particular RefCell
thread-safe also.
I realize I could use Arc<RwLock<MyData>>
in MyDataRef
instead of Arc<RefCell<MyData>>
but then I have two RwLock
s around MyDataPrivate
which seems unnecessary and gross.
My question is two parts:
impl Send + Sync
for RefCell
in this situation where the data it contains is 100% RwLock
protected?2) Is there a better way to do this, that still keeps the locking encapsulated away from callers?
edit: I figured out a nice solution.
ps: in the original code MyDataRef
is actually a type alias, eg:
pub type MyDataRef = Arc<RefCell<MyData>>;
I changed to MyDataRef
to newtype for this example so I could unsafe impl Send + Sync
on it.
As written, I would not call this sound.
In Rust, it's expected that undefined behavior can only happen within unsafe
blocks. This impl
violates that expectation:
impl ChangeName for MyDataRef {
fn change_name(&mut self, name: String) {
self.0.borrow_mut().change_name(name);
}
}
This function must not be called from multiple threads at the same time, but there is no static guarantee that it won't be. If you want to shift that burden to the caller, this function must be marked unsafe
and the safety restriction documented. You can't do this without marking the trait method itself unsafe
and then all implementations would need to be unsafe
even if they otherwise wouldn't need to be.
I don't think your example code invokes any undefined behavior since you take an exclusive lock before trying to borrow from the RefCell
, but there really isn't a compelling reason why you need to structure your data this way in the first place.
What you have here is effectively RwLock<Arc<RefCell<_>>>
and this nesting makes absolutely no sense. You have non-thread-safe interior mutability inside of thread-safe shared ownership inside of thread-safe interior mutability. It's mixing together stuff with conflicting thread-safe-ness and in an order that isn't useful. (You can tell that it doesn't make sense precisely because you're having to invoke unsafe
in order to do what you want! If the data model is causing a huge amount of friction then it's probably not the right model.)
RwLock<Arc<T>>
can be useful when you want the ability to replace an Arc
with another Arc
1, but is not useful when you want to change the T
within the Arc
.
Arc<RefCell<T>>
makes no sense at all as you're wrapping a non-thread-safe interior mutability type in a thread-safe shared ownership type.
Instead, use:
Rc<RefCell<_>>
if you need single-threaded shared ownership with interior mutability, orArc<Mutex<_>>
or Arc<RwLock<_>>
if you need multi-threaded shared ownership with interior mutability.
Arc<Mutex<_>>
makes sense when you always or almost always need exclusive access.Arc<RwLock<_>>
makes sense when you almost always need shared access, but occasionally need exclusive access. Note in particular that std::sync::RwLock
is not guaranteed to be "fair" which means that a sufficiently-high amount of read activity can theoretically starve writers indefinitely.1 As evidence that this can be useful, the arc-swap
crate implements conceptually the same thing as RwLock<Arc<T>>
but with better performance. Sometimes you have a shared data structure that you want the ability to update atomically, but without interrupting other threads that are still using the old value. This ownership model allows a kind of "snapshotting" where threads/tasks will see either the old data or the new data, but never a mixture of the two.