Tags: multithreading, rust, threadpool, rayon

Rust Rayon ThreadPool: 'Cannot borrow as mutable, as it is a captured variable in a Fn closure'


I am trying to learn Rust's Rayon library by making a simple vector addition function. My current code is this, assuming a, b and c are vectors initialized to the same length, c is mutable and num_threads is a usize variable:

let pool = ThreadPoolBuilder::new().num_threads(num_threads).build()
           .expect("Could not create thread pool");
pool.install(|| {
    (0..c.len()).into_par_iter().for_each(|x| {
        c[x] = a[x] + b[x];
    });
});

But I get the error

error[E0596]: cannot borrow c as mutable, as it is a captured variable in a Fn closure
    c[x] = a[x].wrap_add(b[x]);
    ^ cannot borrow as mutable

I also want to note that ultimately I aim to make benchmark software, which is why I'm using thread pooling to specify the number of threads, but I think the extra closure pool.install() uses is part of where the issue is coming from. Modifying the global thread pool also isn't an option since that can only be done once, and I will want to re-run benchmarks with different thread numbers. I would also like to avoid using scope - if that's the only solution, so be it - since that would add a performance penalty.

I understand fundamentally what Rayon doesn't like here: chapter 4.2 of the Rust Book says that you can't have multiple mutable references to a variable, which is essentially what each thread would be getting. However, aside from the particular question of how I could get this code to work, this raises some other questions.

I can't seem to even move a reference to c into the thread pool closure. Why is this a restriction? Surely the power of multithreading is to get multiple threads to concurrently do some work on related data, so why can't I pass data to the threads this way?

Assuming I could get a reference to c into the thread pool, would there be some way to make it so that, for instance, thread 0 gets &mut c[0], thread 1 gets &mut c[1], and so on? If Rayon's purpose is to abstract some of the boilerplate of Rust's basic multithreading library, shouldn't this be something Rayon tries to make simpler?

Some other answers I have seen imply that iterating over the contents of the vectors themselves would help, but since I need all three vectors I would need to use izip. Doing this (replacing (0..c.len()).into_par_iter()... with izip!(&a.mat, &b.mat, &mut c.mat).into_par_iter()...) gives me the error

error[E0599]: the method into_par_iter exists for struct Map<Zip<Zip<Iter<'_, T>, Iter<'_, T>>, IterMut<'_, T>>, {closure@lib.rs:303:9}>, but its trait bounds were not satisfied

izip!(&a.mat, &b.mat, &mut c.mat).into_par_iter().for_each(|x| {
   |                                                   ^^^^^^^^^^^^^
   |
   = note: the following trait bounds were not satisfied:
           `std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::ParallelIterator`
           which is required by `std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::IntoParallelIterator`
           `&std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::ParallelIterator`
           which is required by `&std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::IntoParallelIterator`
           `&mut std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::ParallelIterator`
           which is required by `&mut std::iter::Map<std::iter::Zip<std::iter::Zip<std::slice::Iter<'_, T>, std::slice::Iter<'_, T>>, std::slice::IterMut<'_, T>>, {closure@/home/richard/.cargo/registry/src/index.crates.io-6f17d22bba15001f/itertools-0.12.1/src/lib.rs:303:9: 303:10}>: rayon::iter::IntoParallelIterator`

which seems to imply that references are now being moved into the closure. What gives? Why were they not being moved before?


Solution

  • The first thing you should do is write your code as a normal iterator, strictly with a chain of iterator methods.

    c.iter_mut()
        .zip(&a)
        .zip(&b)
        .for_each(|((c, &a), &b)| *c = a + b);
    

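    This can be converted directly to a multithreaded version by changing iter_mut to par_iter_mut; no other changes are necessary. Because each call of the closure now receives its own &mut element as an argument instead of indexing into a captured c, the closure no longer mutates a captured variable and error E0596 goes away. This is the ideal scenario for Rayon.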

    pool.install(|| {
        c.par_iter_mut()
            .zip(&a)
            .zip(&b)
            .for_each(|((c, &a), &b)| *c = a + b);
    });
    

    Rayon also implements IntoParallelIterator for tuples, which allows an equivalent version that is even simpler. This is the parallel counterpart of izip!, which only builds a sequential iterator and therefore has no into_par_iter.

    pool.install(|| {
        (&mut c, &a, &b)
            .into_par_iter()
            .for_each(|(c, &a, &b)| *c = a + b);
    });
    

    Note that not all situations will be so simple. In particular, if each worker needs its own value rather than one shared by all of them (a scratch buffer, for example), look at the _init or _with methods, such as for_each_init or map_with. Iterators that need to accumulate into a single value will often need to use fold and reduce. Be sure to read through the methods of ParallelIterator and IndexedParallelIterator to find what works best for your situation.
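
On the question of whether thread 0 can get one part of c and thread 1 another: the borrow checker accepts this as long as the mutable references are provably disjoint, which is what iterating mutably guarantees and what indexing c[x] inside a shared closure cannot. The sketch below illustrates the idea with the standard library only (std::thread::scope and chunks_mut), not with Rayon. The function name add_in_chunks and the fixed-size chunking are assumptions made for illustration; Rayon splits work differently (it uses work stealing), so treat this as a picture of the borrowing rules rather than of Rayon's internals.

```rust
use std::thread;

/// Hypothetical helper (not part of Rayon): adds `a` and `b` element-wise
/// into `c`, splitting the work across up to `num_threads` scoped threads.
fn add_in_chunks(a: &[u32], b: &[u32], c: &mut [u32], num_threads: usize) {
    assert!(a.len() == c.len() && b.len() == c.len());

    // Round up so every element lands in some chunk. The final `max(1)`
    // avoids a zero chunk size, which `chunks_mut` would reject by panicking.
    let n = num_threads.max(1);
    let chunk_len = ((c.len() + n - 1) / n).max(1);

    thread::scope(|s| {
        // `chunks_mut` yields non-overlapping `&mut [u32]` slices, so each
        // spawned thread gets exclusive access to its own part of `c`.
        for ((c_chunk, a_chunk), b_chunk) in c
            .chunks_mut(chunk_len)
            .zip(a.chunks(chunk_len))
            .zip(b.chunks(chunk_len))
        {
            s.spawn(move || {
                for ((out, &x), &y) in c_chunk.iter_mut().zip(a_chunk).zip(b_chunk) {
                    *out = x + y;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}

fn main() {
    let a: Vec<u32> = vec![1, 2, 3, 4, 5];
    let b: Vec<u32> = vec![10, 20, 30, 40, 50];
    let mut c: Vec<u32> = vec![0; a.len()];

    add_in_chunks(&a, &b, &mut c, 2);
    println!("{:?}", c); // prints [11, 22, 33, 44, 55]
}
```

The closure passed to spawn never captures c as a whole; it only owns the one chunk it was given. par_iter_mut gives the same guarantee per element, which is why the Rayon versions above compile while the index-based one does not.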