I'd like to have a struct called Factory that dynamically produces new Strings, keeps them inside itself, and returns &str borrows of them that live as long as the Factory value itself.
I tried to keep the new values in a Vec, but as the Vec grows, borrows of its elements would be invalidated, so they don't live long enough. I tried wrapping them in Boxes and RefCells, but I ran into the same problems.
I would also like to call this factory method inside a loop, so that I can make a new String in each iteration and keep a borrow of it somewhere.
There's a crate called string-interner: https://docs.rs/string-interner/latest/string_interner/index.html
It might be a good idea to use it, either directly or through structs similar to the ones below, if all you are after is String handles.
That's what I've got so far thanks to your comments:
use std::{
    cell::{Ref, RefCell},
    rc::Rc,
};

struct StringHandle {
    key: usize,
    store: Rc<RefCell<Vec<String>>>,
}

impl StringHandle {
    // Borrows the shared store and narrows the guard to this handle's string.
    pub fn get(&self) -> Ref<'_, String> {
        Ref::map(self.store.borrow(), |v| &v[self.key])
    }
}

struct Factory {
    pub store: Rc<RefCell<Vec<String>>>,
}

impl Factory {
    pub fn make_next_string(&mut self) -> StringHandle {
        let len = self.store.borrow().len();
        self.store.borrow_mut().push(format!("string no. {}", len));
        StringHandle {
            store: self.store.clone(),
            key: len,
        }
    }

    pub fn new() -> Factory {
        Factory { store: Rc::new(RefCell::new(vec![])) }
    }
}

fn main() {
    let mut f = Factory::new();
    let mut strs: Vec<StringHandle> = vec![];
    for _ in 0..5 {
        let handle = f.make_next_string();
        strs.push(handle);
    }
    for handle in strs {
        println!("{}", handle.get());
    }
}
And a generic version for types other than String:
use std::{
    cell::{Ref, RefCell, RefMut},
    rc::Rc,
};

struct Handle<T> {
    key: usize,
    store: Rc<RefCell<Vec<T>>>,
}

impl<T> Handle<T> {
    pub fn get(&self) -> Ref<'_, T> {
        Ref::map(self.store.borrow(), |v| &v[self.key])
    }

    pub fn get_mut(&self) -> RefMut<'_, T> {
        RefMut::map(self.store.borrow_mut(), |v| &mut v[self.key])
    }
}

struct Factory<T> {
    pub store: Rc<RefCell<Vec<T>>>,
}

impl<T: Default> Factory<T> {
    pub fn make_next(&mut self) -> Handle<T> {
        let len = self.store.borrow().len();
        self.store.borrow_mut().push(T::default());
        Handle {
            store: self.store.clone(),
            key: len,
        }
    }

    pub fn new() -> Factory<T> {
        Factory { store: Rc::new(RefCell::new(vec![])) }
    }
}

#[derive(Debug)]
struct Data {
    pub number: i32,
}

impl Default for Data {
    fn default() -> Self {
        Data { number: 0 }
    }
}

fn main() {
    let mut objs: Vec<Handle<Data>> = vec![];
    let mut f: Factory<Data> = Factory::new();
    for i in 0..5 {
        let handle = f.make_next();
        handle.get_mut().number = i;
        objs.push(handle);
    }
    for handle in objs {
        println!("{:?}", handle.get());
    }
}
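One caveat with the Rc<RefCell<...>> handles above: the guard returned by get() holds a shared borrow on the whole store, so making a new item while any such guard is still alive would panic at runtime. Below is a minimal std-only sketch of that behaviour. It uses try_borrow_mut so the conflict shows up as an Err instead of a panic; the helper names can_push_while_reading and can_push_after_guard_dropped are hypothetical and exist only for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Store = Rc<RefCell<Vec<String>>>;

/// Tries to borrow the store mutably while a read guard is alive.
/// (Hypothetical helper, for illustration only.)
fn can_push_while_reading(store: &Store) -> bool {
    let _guard = store.borrow(); // stands in for a live `handle.get()` guard
    let ok = store.try_borrow_mut().is_ok();
    ok
}

/// Same check, but the read guard is dropped before the mutable borrow.
fn can_push_after_guard_dropped(store: &Store) -> bool {
    {
        let _guard = store.borrow();
    } // guard dropped here
    let ok = store.try_borrow_mut().is_ok();
    ok
}

fn main() {
    let store: Store = Rc::new(RefCell::new(vec![String::from("string no. 0")]));
    println!("while reading: {}", can_push_while_reading(&store));
    println!("after drop:    {}", can_push_after_guard_dropped(&store));
}
```

In practice this means using the result of get() within a short scope (as the println! calls above do) rather than storing the guard somewhere long-lived.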
First, if you have &mut access to the interner, you don't need a RefCell around it. But you likely want to access it through shared references, in which case you do need it.
Another way is to return a newtyped index into the Vec instead of references. This saves the indirection, but requires access to the interner to read an interned string, so it may not fulfill your requirements. It also does not allow you to allocate new strings while you keep references to the old ones around (using RefCell will not help, it will just panic):
use std::ops::Index;

struct StringHandle(usize);

struct Factory {
    pub store: Vec<String>,
}

impl Factory {
    pub fn make_next_string(&mut self) -> StringHandle {
        let len = self.store.len();
        self.store.push(format!("string no. {}", len));
        StringHandle(len)
    }

    pub fn new() -> Factory {
        Factory { store: vec![] }
    }
}

impl Index<StringHandle> for Factory {
    type Output = str;

    fn index(&self, index: StringHandle) -> &Self::Output {
        &self.store[index.0]
    }
}

fn main() {
    let mut f = Factory::new();
    let mut strs: Vec<StringHandle> = vec![];
    for _ in 0..5 {
        let handle = f.make_next_string();
        strs.push(handle);
    }
    for handle in strs {
        println!("{}", &f[handle]);
    }
}
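A small variation on the index approach, as a sketch rather than a definitive design: because the handle is only a number, you can keep handles around while creating more strings. Only a &str obtained by indexing blocks further allocation. Deriving Copy on the handle (my addition, not part of the code above) lets the same handle be used for several lookups instead of being consumed by the first one.

```rust
use std::ops::Index;

// `Copy` is an assumption added for this sketch so handles can be reused.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct StringHandle(usize);

struct Factory {
    store: Vec<String>,
}

impl Factory {
    fn new() -> Factory {
        Factory { store: Vec::new() }
    }

    fn make_next_string(&mut self) -> StringHandle {
        let len = self.store.len();
        self.store.push(format!("string no. {}", len));
        StringHandle(len)
    }
}

impl Index<StringHandle> for Factory {
    type Output = str;

    fn index(&self, index: StringHandle) -> &str {
        &self.store[index.0]
    }
}

fn main() {
    let mut f = Factory::new();
    let first = f.make_next_string();
    // Fine: `first` is an index, not a borrow of `f`.
    let second = f.make_next_string();
    println!("{} / {}", &f[first], &f[second]);
    // The handle is `Copy`, so it can be looked up again.
    println!("{}", &f[first]);
}
```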
The best way is to use an arena. It lets you hand out references (so you don't need access to the interner to read interned strings) and keep the old ones around while making new ones. The disadvantages are that it requires a crate, as you probably don't want to implement the arena yourself (that also requires unsafe code), and that you can't store the interner alongside the interned strings (that would be a self-referential struct). You can use the typed-arena crate for that:
use std::cell::Cell;

use typed_arena::Arena;

struct Factory {
    store: Arena<String>,
    len: Cell<u32>,
}

impl Factory {
    pub fn make_next_string(&self) -> &str {
        let len = self.len.get();
        self.len.set(len + 1);
        self.store.alloc(format!("string no. {}", len))
    }

    pub fn new() -> Factory {
        Factory { store: Arena::new(), len: Cell::new(0) }
    }
}

fn main() {
    let f = Factory::new();
    let mut strs: Vec<&str> = vec![];
    for _ in 0..5 {
        let interned = f.make_next_string();
        strs.push(interned);
    }
    for interned in strs {
        println!("{}", interned);
    }
}
You can also store strs inside the arena (instead of Strings). The advantages are better cache behavior, as the structure is flatter, and a much faster drop of the interner itself, since it doesn't need to loop over and drop the stored strings; the disadvantage is that you need to copy the strings before you store them. I recommend bumpalo:
use std::cell::Cell;

use bumpalo::Bump;

struct Factory {
    store: Bump,
    len: Cell<u32>,
}

impl Factory {
    pub fn make_next_string(&self) -> &str {
        let len = self.len.get();
        self.len.set(len + 1);
        self.store.alloc_str(&format!("string no. {}", len))
    }

    pub fn new() -> Factory {
        Factory { store: Bump::new(), len: Cell::new(0) }
    }
}

fn main() {
    let f = Factory::new();
    let mut strs: Vec<&str> = vec![];
    for _ in 0..5 {
        let interned = f.make_next_string();
        strs.push(interned);
    }
    for interned in strs {
        println!("{}", interned);
    }
}
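To illustrate the "flatter layout" point without an extra crate, here is a std-only sketch that keeps all the text in one String buffer and hands out span handles. This is not how bumpalo works internally, and unlike the arena versions it needs access to the factory to resolve a handle (the same trade-off as the index approach). The names FlatFactory, Span and resolve are hypothetical.

```rust
use std::fmt::Write;

/// Byte range of one stored string inside the shared buffer.
#[derive(Clone, Copy, Debug)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical flat store: one allocation holds every string.
struct FlatFactory {
    buf: String,
    count: usize,
}

impl FlatFactory {
    fn new() -> Self {
        FlatFactory { buf: String::new(), count: 0 }
    }

    fn make_next_string(&mut self) -> Span {
        let start = self.buf.len();
        // Format straight into the buffer, so no temporary String is needed.
        write!(self.buf, "string no. {}", self.count)
            .expect("writing to a String does not fail");
        self.count += 1;
        Span { start, end: self.buf.len() }
    }

    fn resolve(&self, span: Span) -> &str {
        &self.buf[span.start..span.end]
    }
}

fn main() {
    let mut f = FlatFactory::new();
    let spans: Vec<Span> = (0..5).map(|_| f.make_next_string()).collect();
    for span in spans {
        println!("{}", f.resolve(span));
    }
}
```

Dropping FlatFactory frees a single buffer, which mirrors the "no per-string drop" advantage described above, at the cost of going through the factory for every lookup.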