I am trying to implement an algorithm which requires the use of a second mutable vector (elements are read only) containing a subset of elements of the first vector. That subset is of different order and will be reduced gradually, so slicing is not an option.
Basically I would require a shallow copy of the vector which I can mutate.
The original vector is passed to a function. Neither the original vector nor its elements are mutated. Single threaded.
The orinal vector is points: Vec<Point>
, which is ideally passed as &points
to the function.
Performance is of the utmost importance.
These solution concepts come to my mind:
new Vector with cloned elements: Vec<Point>
I can create a Vec with all points cloned and use that.
Penalty: High memory usage, time to deep copy all vectors. I also need to maintain a reference to the original point.
Algoritm runtime penalty: 0%
new Vector with reference to original element: Vec<&Point>
I can create a Vec<&Point> and use that.
Penalty: low memory usage, time to copy the references into a new vector
Also: need to provide each sub-function twice: one for &[Point] and one for &[&Point]
Algoritm runtime penalty: 25% (probably because reference now need two deref)
Using Rc for each Point
This requires original function to pass a vector points: Vec<Rc<Point>>
.
Penalty: low memory usage, time to create the Rc Vector
Algoritm runtime penalty: 250% (my test function does not even create a second reference, this is just reading the original one)
Also: I have not found a way to create this inside the function without cloning each Point (Rc::new(point.clone()). As such, this does not make any sense, since I can work with a cloned vector much faster.
Interpretation:
Question: Is there another possibility which I do not see, with or without unsafe rust?
A plain shallow copy which does not move would be ideal for this case. Since I am dropping the shallow copy when the function is left, there would be no dangling pointers. Maybe I am missing another risk.
Relevant pieces of code
pub trait PointLike {
fn get_id(&self) -> usize;
fn get_x(&self) -> isize;
fn get_y(&self) -> isize;
}
fn main() {
// Random
let points = random_points(NUM_POINTS, RANDOM_MIN, RANDOM_MAX);
let _dummy = convex_hull(&points);
let _dummy = convex_hull_ref(&points);
let _dummy = convex_hull_clone(&points);
let mut points_rc = Vec::new();
for point in &points {
points_rc.push(Rc::new(point.clone()));
}
let _dummy = convex_hull_rc(&points_rc);
}
// this function is implemented multiple times, only change is the used vector type
fn convex_hull<T: PointLike>(points: &[T]) -> Vec<&T> {
assert!(points.len() >= 50);
let start = Instant::now();
let n = points.len();
let h = 50;
let step = n / h;
let mut convex_hull: Vec<&T> = Vec::with_capacity(step);
// emulate gift wrapping to get some time to measure
let mut search_id = step;
for _ in 0..h {
let mut ch_idx = 0;
for i in 0..n {
if points[i].get_id() == search_id {
ch_idx = i;
}
}
convex_hull.push(&points[ch_idx]);
search_id += step;
}
let duration_micro = start.elapsed().as_micros();
println!(
" duration convex_hull for n = {}, convex hull size = {}: {} microseconds",
points.len(),
convex_hull.len(),
duration_micro,
);
convex_hull
}
Basically I would require a shallow copy of the vector which I can mutate.
This makes me think about copy-on-write.
I'm not certain this is what you are looking for, but in case it is, here is an example.
The whole set of points stored inside the original vector is not duplicated in the (shallow) copy as long as we just read these points.
But as soon as we need to mutate some of these points in the copy, or to reorder this storage, a new storage is required in order to keep the original points unchanged.
Note that, because all the points are contiguously stored in a dynamic array, we cannot only clone+mutate one point, leaving the others untouched.
For this we would need a vector of pointers to point, but this implies one more indirection at each access (and probably memory fragmentation).
#[derive(Debug, Clone, PartialEq)]
struct Point {
x: isize,
y: isize,
}
fn main() {
let original: Vec<_> =
(1..=10).map(|i| Point { x: i, y: 10 * i }).collect();
let mut copy = std::borrow::Cow::from(&original);
println!(
"before mutation, same content? ~~> {}",
original.iter().eq(copy.iter())
);
println!(
"before mutation, same storage? ~~> {}",
std::ptr::eq(&original[0], ©[0])
);
let half_size = copy.len() / 2;
copy.to_mut().truncate(half_size);
println!(
"after mutation, same content? ~~> {}",
original.iter().eq(copy.iter())
);
println!(
"after mutation, same storage? ~~> {}",
std::ptr::eq(&original[0], ©[0])
);
}
/*
before mutation, same content? ~~> true
before mutation, same storage? ~~> true
after mutation, same content? ~~> false
after mutation, same storage? ~~> false
*/