multithreadingrustrefcellrwlock

Is it safe to make RefCell Sync+Send if the data it contains is protected by RwLock?


I'm trying to make a thread-safe data type, with locking full encapsulated so that callers never see or directly hold a lock. (This part works fine.)

Further, there is a requirement that the struct which generates an instance of this data type hold a mutable reference and gives out a mutable reference to the caller. (This is what I'm having trouble with.)

Here is example code that pretty well distills down the problem:

use std::sync::{Arc, RwLock};
use std::cell::RefCell;

trait ChangeName {
    fn change_name(&mut self, name: String);
}

// MyDataPrivate contains the data I care about.
#[derive(Debug)]
struct MyDataPrivate {
    name: String,
}
impl ChangeName for MyDataPrivate {
    fn change_name(&mut self, name: String) {
        self.name = name;
    }
}

// MyData encapsulates locking around MyData.
// Callers never hold a lock directly.
#[derive(Debug)]
pub struct MyData {
    inner: RwLock<MyDataPrivate>,
}
impl ChangeName for MyData {
    fn change_name(&mut self, name: String) {
        self.inner.write().unwrap().change_name(name);
    }
}

impl MyData {
    fn new(name: String) -> Self {
        Self {
            inner: RwLock::new(MyDataPrivate{name})
        }
    }
}

// MyDataRef provides mutable references to MyData.
#[derive(Debug, Clone)]
pub struct MyDataRef(Arc<RefCell<MyData>>);
impl ChangeName for MyDataRef {
    fn change_name(&mut self, name: String) {
        self.0.borrow_mut().change_name(name);
    }
}

// This makes it build.  Is it Safe/Ok since MyData is Send + Sync?
unsafe impl Sync for MyDataRef {}
unsafe impl Send for MyDataRef {}


fn do_stuff<T: Sync + Send>(_t: T) {}


fn main() {
    let mydata = MyData::new("Donner".to_string());
    let mut mydata_ref = MyDataRef(Arc::new(RefCell::new(mydata)));
    
    do_stuff(mydata_ref.clone());
    
    mydata_ref.change_name("Rudolf".to_string());
    
    println!("name: {:?}", mydata_ref);
}

See Playground.

The above code will not compile without using unsafe to impl Send + Sync for MyDataRef. This is because RefCell is not thread-safe.

However, the data I put inside the RefCell is thread-safe (protected by RwLock). So my belief is that makes this particular RefCell thread-safe also.

I realize I could use Arc<RwLock<MyData>> in MyDataRef instead of Arc<RefCell<MyData>> but then I have two RwLocks around MyDataPrivate which seems unnecessary and gross.

My question is two parts:

  1. Is it safe/correct to impl Send + Sync for RefCell in this situation where the data it contains is 100% RwLock protected?

2) Is there a better way to do this, that still keeps the locking encapsulated away from callers?
edit: I figured out a nice solution.

ps: in the original code MyDataRef is actually a type alias, eg:

pub type MyDataRef = Arc<RefCell<MyData>>;

I changed to MyDataRef to newtype for this example so I could unsafe impl Send + Sync on it.


Solution

  • As written, I would not call this sound.

    In Rust, it's expected that undefined behavior can only happen within unsafe blocks. This impl violates that expectation:

    impl ChangeName for MyDataRef {
        fn change_name(&mut self, name: String) {
            self.0.borrow_mut().change_name(name);
        }
    }
    

    This function must not be called from multiple threads at the same time, but there is no static guarantee that it won't be. If you want to shift that burden to the caller, this function must be marked unsafe and the safety restriction documented. You can't do this without marking the trait method itself unsafe and then all implementations would need to be unsafe even if they otherwise wouldn't need to be.

    I don't think your example code invokes any undefined behavior since you take an exclusive lock before trying to borrow from the RefCell, but there really isn't a compelling reason why you need to structure your data this way in the first place.

    What you have here is effectively RwLock<Arc<RefCell<_>>> and this nesting makes absolutely no sense. You have non-thread-safe interior mutability inside of thread-safe shared ownership inside of thread-safe interior mutability. It's mixing together stuff with conflicting thread-safe-ness and in an order that isn't useful. (You can tell that it doesn't make sense precisely because you're having to invoke unsafe in order to do what you want! If the data model is causing a huge amount of friction then it's probably not the right model.)

    RwLock<Arc<T>> can be useful when you want the ability to replace an Arc with another Arc1, but is not useful when you want to change the T within the Arc.

    Arc<RefCell<T>> makes no sense at all as you're wrapping a non-thread-safe interior mutability type in a thread-safe shared ownership type.

    Instead, use:


    1 As evidence that this can be useful, the arc-swap crate implements conceptually the same thing as RwLock<Arc<T>> but with better performance. Sometimes you have a shared data structure that you want the ability to update atomically, but without interrupting other threads that are still using the old value. This ownership model allows a kind of "snapshotting" where threads/tasks will see either the old data or the new data, but never a mixture of the two.