rust, borrow-checker, borrow

How to create a factory that dynamically creates values and returns borrows of them?


I'd like to have a struct called Factory that dynamically produces new Strings, keeps them inside itself, and returns &str borrows of them that live as long as the Factory value itself.

I tried keeping the new values in a Vec, but as the Vec grows, borrows of its elements would be invalidated, so they don't live long enough. I tried wrapping them in Boxes and RefCells, but I ran into the same problems.

I would also like to call this factory method inside a loop, so that I can make a new String in each iteration and keep a borrow of it somewhere.


There's a crate called string-interner: https://docs.rs/string-interner/latest/string_interner/index.html

If all you are after is String handles, it might be a good idea to use it, either directly or wrapped in structs similar to the ones below.


That's what I've got so far, thanks to your comments:

use std::{
    cell::{Ref, RefCell},
    rc::Rc,
};

struct StringHandle {
    key: usize,
    store: Rc<RefCell<Vec<String>>>,
}
impl StringHandle {
    pub fn get(&self) -> Ref<String> {
        Ref::map(self.store.borrow(), |v| &v[self.key])
    }
}

struct Factory {
    pub store: Rc<RefCell<Vec<String>>>,
}
impl Factory {
    pub fn make_next_string(&mut self) -> StringHandle {
        let len = self.store.borrow().len();
        self.store.borrow_mut().push(format!("string no. {}", len));
        StringHandle {
            store: self.store.clone(),
            key: len,
        }
    }
    pub fn new() -> Factory {
        Factory { store: Rc::new(RefCell::new(vec![])) }
    }
}

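fn main() {
    let mut f = Factory::new();
    let mut strs: Vec<StringHandle> = vec![];

    // Create a few strings and keep only their handles.
    for _ in 0..5 {
        let handle = f.make_next_string();
        strs.push(handle);
    }

    // Each handle reaches its string through the shared store.
    for handle in strs {
        println!("{}", handle.get());
    }
}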

And a generic version for types other than String:

use std::{
    cell::{Ref, RefCell, RefMut},
    rc::Rc,
};

struct Handle<T> {
    key: usize,
    store: Rc<RefCell<Vec<T>>>,
}
impl<T> Handle<T> {
    pub fn get(&self) -> Ref<T> {
        Ref::map(self.store.borrow(), |v| &v[self.key])
    }
    pub fn get_mut(&self) -> RefMut<T> {
        RefMut::map(self.store.borrow_mut(), |v| &mut v[self.key])
    }
}

struct Factory<T> {
    pub store: Rc<RefCell<Vec<T>>>,
}
impl<T: Default> Factory<T> {
    pub fn make_next(&mut self) -> Handle<T> {
        let len = self.store.borrow().len();
        self.store.borrow_mut().push(T::default());
        Handle {
            store: self.store.clone(),
            key: len,
        }
    }
    pub fn new() -> Factory<T> {
        Factory { store: Rc::new(RefCell::new(vec![])) }
    }

}

#[derive(Debug)]
struct Data {
    pub number: i32
}

impl Default for Data {
    fn default() -> Self {
        Data { number: 0 }
    }
}

fn main() {
    let mut objs: Vec<Handle<Data>> = vec![];
    let mut f: Factory<Data> = Factory::new();

    for i in 0..5 {
        let handle = f.make_next();

        // Mutate the freshly created value through its handle.
        handle.get_mut().number = i;

        objs.push(handle);
    }

    for handle in objs {
        println!("{:?}", handle.get());
    }
}

Solution

  • First, if you have &mut access to the interner, you don't need a RefCell around it. But you will likely want to access it through shared references, in which case you do need it.
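
    As a minimal sketch of that point, using only the standard library: with a RefCell inside, make_next_string can take &self, so several shared references to the same factory can all create strings. The usize key and the get helper that returns an owned copy are simplifications assumed for this illustration, not part of the question's code.

    ```rust
    use std::cell::RefCell;

    struct Factory {
        store: RefCell<Vec<String>>,
    }

    impl Factory {
        fn new() -> Factory {
            Factory { store: RefCell::new(Vec::new()) }
        }

        // Takes &self: the RefCell supplies the mutability at runtime.
        fn make_next_string(&self) -> usize {
            let mut store = self.store.borrow_mut();
            let key = store.len();
            store.push(format!("string no. {}", key));
            key
        }

        // Hypothetical helper: returns an owned copy to keep the sketch short.
        fn get(&self, key: usize) -> String {
            self.store.borrow()[key].clone()
        }
    }

    fn main() {
        let f = Factory::new();
        // Two shared references to the same factory can both create strings.
        let a = &f;
        let b = &f;
        let k0 = a.make_next_string();
        let k1 = b.make_next_string();
        println!("{} / {}", f.get(k0), f.get(k1));
    }
    ```

    With a plain Vec and a &mut self method, the two calls through a and b would not compile, because neither shared reference grants mutable access.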

    Another way is to return a newtyped index into the Vec instead of references. This saves the indirection, but it requires access to the interner to read the interned string, so it may not fulfill your requirements. It also does not allow you to allocate new strings while references to the old ones are still alive (a RefCell will not help here; it will just panic):

    use std::ops::Index;
    
    struct StringHandle(usize);
    
    struct Factory {
        pub store: Vec<String>,
    }
    impl Factory {
        pub fn make_next_string(&mut self) -> StringHandle {
            let len = self.store.len();
            self.store.push(format!("string no. {}", len));
            StringHandle(len)
        }
        pub fn new() -> Factory {
            Factory { store: vec![] }
        }
    }
    
    impl Index<StringHandle> for Factory {
        type Output = str;
        fn index(&self, index: StringHandle) -> &Self::Output {
            &self.store[index.0]
        }
    }
    
    fn main() {
        let mut f = Factory::new();
        let mut strs: Vec<StringHandle> = vec![];
    
        for _ in 0..5 {
            let handle = f.make_next_string();
            strs.push(handle);
        }
    
        for handle in strs {
            println!("{}", &f[handle]);
        }
    }
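
    The remark that a RefCell will not help can be checked with the standard library alone. The sketch below uses try_borrow_mut, which reports the conflict as an Err in exactly the situation where borrow_mut would panic. The function names push_while_borrowed and push_after_release are made up for this illustration, and the first one assumes the store is not empty.

    ```rust
    use std::cell::RefCell;

    /// Tries to push to `store` while a shared borrow of its first element is
    /// still alive; returns whether the mutable borrow was granted.
    /// Assumes `store` holds at least one string.
    fn push_while_borrowed(store: &RefCell<Vec<String>>) -> bool {
        let guard = store.borrow();
        let first: &str = &guard[0];
        // borrow_mut() would panic here; try_borrow_mut() returns Err instead.
        let granted = match store.try_borrow_mut() {
            Ok(mut v) => {
                v.push(String::from("new"));
                true
            }
            Err(_) => false,
        };
        println!("still holding {:?}, push granted: {}", first, granted);
        granted
    }

    /// Releases the shared borrow first, then pushes; returns the new length.
    fn push_after_release(store: &RefCell<Vec<String>>) -> usize {
        {
            let guard = store.borrow();
            println!("read {:?}", &guard[0]);
        } // `guard` is dropped here, so the store is free again.
        store.borrow_mut().push(String::from("string no. 1"));
        store.borrow().len()
    }

    fn main() {
        let store = RefCell::new(vec![String::from("string no. 0")]);
        assert!(!push_while_borrowed(&store));
        assert_eq!(push_after_release(&store), 2);
    }
    ```

    The RefCell only moves the check from compile time to run time: as long as any borrow of an element is held, the Vec cannot be grown.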
    

    The best way is to use an arena. It lets you hand out references (so reading an interned string does not require access to the interner) and keep the old ones around while creating new ones. The disadvantages are that it requires a crate, as you probably don't want to implement the arena yourself (doing so also requires unsafe code), and that you can't store the interner alongside the interned strings (that would be a self-referential struct). You can use the typed-arena crate for this:

    use std::cell::Cell;
    
    use typed_arena::Arena;
    
    struct Factory {
        store: Arena<String>,
        len: Cell<u32>,
    }
    
    impl Factory {
        pub fn make_next_string(&self) -> &str {
            let len = self.len.get();
            self.len.set(len + 1);
            self.store.alloc(format!("string no. {}", len))
        }
        pub fn new() -> Factory {
            Factory { store: Arena::new(), len: Cell::new(0) }
        }
    }
    
    fn main() {
        let f = Factory::new();
        let mut strs: Vec<&str> = vec![];
    
        for _ in 0..5 {
            let interned = f.make_next_string();
            strs.push(interned);
        }
    
        for interned in strs {
            println!("{}", interned);
        }
    }
    

    You can also store strs inside the arena (instead of Strings). The advantages are better cache behavior, since the structure is flatter, and a much faster drop of the interner itself, since it does not need to loop over and drop the stored strings; the disadvantage is that you need to copy the strings before you store them. I recommend bumpalo:

    use std::cell::Cell;
    
    use bumpalo::Bump;
    
    struct Factory {
        store: Bump,
        len: Cell<u32>,
    }
    
    impl Factory {
        pub fn make_next_string(&self) -> &str {
            let len = self.len.get();
            self.len.set(len + 1);
            self.store.alloc_str(&format!("string no. {}", len))
        }
        pub fn new() -> Factory {
            Factory { store: Bump::new(), len: Cell::new(0) }
        }
    }
    
    fn main() {
        let f = Factory::new();
        let mut strs: Vec<&str> = vec![];
    
        for _ in 0..5 {
            let interned = f.make_next_string();
            strs.push(interned);
        }
    
        for interned in strs {
            println!("{}", interned);
        }
    }